driver: gpu: switch to mali vendor driver

Mauro (mdrjr) Ribeiro
2024-01-25 13:22:45 -03:00
parent 56cb3cf9e6
commit db3f0eb142
480 changed files with 183004 additions and 17 deletions

@@ -0,0 +1,349 @@
/*
*
* (C) COPYRIGHT 2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU licence.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
What: /sys/class/misc/mali%u/device/core_mask
Description:
This attribute is used to restrict the number of shader cores
available in this instance, which is useful for debugging purposes.
Reading this attribute provides a mask of all available cores.
Writing to it sets the current core mask. It does not allow
disabling all the cores present in this instance.
What: /sys/class/misc/mali%u/device/debug_command
Description:
This attribute is used to issue debug commands that are supported
by the driver. Reading it provides the list of supported debug
commands, and writing back one of those commands enables that
debug option.
What: /sys/class/misc/mali%u/device/dvfs_period
Description:
This is used to set the DVFS sampling period used by the
driver. Reading it provides the current DVFS sampling period;
writing a value sets the DVFS sampling period.
What: /sys/class/misc/mali%u/device/dummy_job_wa_info
Description:
This attribute is available only with a platform device that
supports a Job Manager based GPU that requires a GPU workaround
to execute the dummy fragment job on all shader cores to
work around a hang issue.
It is a read-only attribute; reading it gives details of the
options used with the dummy workaround.
What: /sys/class/misc/mali%u/device/fw_timeout
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. This attribute is
used to set the duration value in milliseconds for the
waiting timeout used for a GPU status change request being
acknowledged by the FW.
What: /sys/class/misc/mali%u/device/gpuinfo
Description:
This attribute provides a description of the present Mali GPU.
It is a read-only attribute that provides details such as the GPU
family, the number of cores, the hardware version and the raw
product ID.
What: /sys/class/misc/mali%u/device/idle_hysteresis_time
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. This attribute is
used to configure the timeout value in microseconds for the
GPU idle handling. If GPU has been idle for this timeout
period, then it is put to sleep for GPUs where sleep feature
is supported or is powered down after suspending command
stream groups.
What: /sys/class/misc/mali%u/device/idle_hysteresis_time_ns
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. This attribute is
used to configure the timeout value in nanoseconds for the
GPU idle handling. If GPU has been idle for this timeout
period, then it is put to sleep for GPUs where sleep feature
is supported or is powered down after suspending command
stream groups.
What: /sys/class/misc/mali%u/device/js_ctx_scheduling_mode
Description:
This attribute is available only with platform device that
supports a Job Manager based GPU. This attribute is used to set
context scheduling priority for a job slot.
On reading, it provides the currently set job slot context
priority.
Writing 0 to this attribute sets it to the mode where
higher priority atoms will be scheduled first, regardless of
the context they belong to. Newly-runnable higher priority atoms
can preempt lower priority atoms currently running on the GPU,
even if they belong to a different context.
Writing 1 to this attribute sets it to the mode where the
highest-priority atom will be chosen from each context in turn
using a round-robin algorithm, so priority only has an effect
within the context an atom belongs to. Newly-runnable higher
priority atoms can preempt the lower priority atoms currently
running on the GPU, but only if they belong to the same context.
What: /sys/class/misc/mali%u/device/js_scheduling_period
Description:
This attribute is available only with platform device that
supports a Job Manager based GPU. It is used to set the job
scheduler tick period in nanoseconds. The Job Scheduler determines
which jobs are run on the GPU, and for how long. It makes
decisions at a regular time interval determined by the value
in js_scheduling_period.
What: /sys/class/misc/mali%u/device/js_softstop_always
Description:
This attribute is available only with platform device that
supports a Job Manager based GPU. Soft-stops are disabled when
only a single context is present. This attribute is used to
enable soft-stop when only a single context is present, which
can be used for debug and unit-testing purposes.
What: /sys/class/misc/mali%u/device/js_timeouts
Description:
This attribute is available only with platform device that
supports a Job Manager based GPU. It is used to set the soft stop
and hard stop times for the job scheduler.
Writing the value 0 causes no change; writing -1 restores the
default timeout.
The format used to set js_timeouts is
"<soft_stop_ms> <soft_stop_ms_cl> <hard_stop_ms_ss>
<hard_stop_ms_cl> <hard_stop_ms_dumping> <reset_ms_ss>
<reset_ms_cl> <reset_ms_dumping>"
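As a minimal sketch of the format above, the following shell helper assembles the eight space-separated fields and rejects input with the wrong count or non-integer values. The function name js_timeouts_string is hypothetical and not part of the driver, and the sysfs write is left commented out because it assumes a mali0 device is present.

```shell
# Hypothetical helper: build a js_timeouts string from eight integers, in the
# order listed above (soft_stop_ms first, reset_ms_dumping last).
js_timeouts_string() {
    if [ "$#" -ne 8 ]; then
        echo "js_timeouts needs exactly 8 values, got $#" >&2
        return 1
    fi
    for v in "$@"; do
        case $v in
            -1) ;;                       # restore the default timeout
            ''|*[!0-9]*)                 # anything else must be a non-negative integer
                echo "invalid value: $v" >&2
                return 1 ;;
        esac
    done
    echo "$*"                            # join the fields with single spaces
}

# Example: leave the first five values unchanged and restore the reset defaults.
js_timeouts_string 0 0 0 0 0 -1 -1 -1
# Assumed device path, shown for illustration only:
# js_timeouts_string 0 0 0 0 0 -1 -1 -1 > /sys/class/misc/mali0/device/js_timeouts
```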
What: /sys/class/misc/mali%u/device/lp_mem_pool_max_size
Description:
This attribute is used to set the maximum number of large pages
for memory pools that the driver can contain. Large pages are of
size 2MB. On read it displays the max size of all memory
pools, and it can be used to modify each individual pool as well.
What: /sys/class/misc/mali%u/device/lp_mem_pool_size
Description:
This attribute is used to set the number of large memory pages
which should be populated. Changing this value may cause
existing pages to be removed from the pool, or new pages to be
created and then added to the pool. On read it provides the
pool size for all available pools, and individual pools can be
modified.
What: /sys/class/misc/mali%u/device/mem_pool_max_size
Description:
This attribute is used to set the maximum number of small pages
for memory pools that the driver can contain. Here small pages
are of size 4KB. On read it displays the max size for all
available pools, and it allows the max size of individual pools
to be set.
What: /sys/class/misc/mali%u/device/mem_pool_size
Description:
This attribute is used to set the number of small memory pages
which should be populated. Changing this value may cause
existing pages to be removed from the pool, or new pages to
be created and then added to the pool. On read it provides the
pool size for all available pools, and individual pools can be
modified.
What: /sys/class/misc/mali%u/device/device/mempool/ctx_default_max_size
Description:
This attribute is used to set the maximum memory pool size for
all the memory pools so that the maximum amount of free memory
that each pool can hold is identical.
What: /sys/class/misc/mali%u/device/device/mempool/lp_max_size
Description:
This attribute is used to set the maximum number of large pages
for all memory pools that the driver can contain.
Large pages are of size 2MB.
What: /sys/class/misc/mali%u/device/device/mempool/max_size
Description:
This attribute is used to set the maximum number of small pages
for all the memory pools that the driver can contain.
Here small pages are of size 4KB.
What: /sys/class/misc/mali%u/device/pm_poweroff
Description:
This attribute contains the current values, represented as the
following space-separated integers:
• PM_GPU_POWEROFF_TICK_NS.
• PM_POWEROFF_TICK_SHADER.
• PM_POWEROFF_TICK_GPU.
Example:
echo 100000 4 4 > /sys/class/misc/mali0/device/pm_poweroff
Sets the following new values: 100,000ns tick, four ticks
for shader power down, and four ticks for GPU power down.
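The arithmetic behind the example can be sketched as follows, assuming that each power-off delay equals the tick length multiplied by the corresponding tick count. The helper name pm_poweroff_delays is hypothetical; it only processes a string and does not touch sysfs.

```shell
# Hypothetical helper: given "<tick_ns> <shader_ticks> <gpu_ticks>", print the
# implied shader and GPU power-off delays in nanoseconds (tick length x ticks).
pm_poweroff_delays() {
    set -- $1                            # split the string into its integers
    if [ "$#" -ne 3 ]; then
        echo "expected 3 space-separated integers" >&2
        return 1
    fi
    echo "shader_ns=$(( $1 * $2 )) gpu_ns=$(( $1 * $3 ))"
}

# Values from the example above: 100,000ns tick, four ticks each.
pm_poweroff_delays "100000 4 4"
```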
What: /sys/class/misc/mali%u/device/power_policy
Description:
This attribute is used to find the current power policy being
used. Reading it lists the available power policies, with the
currently selected one enclosed in square brackets.
Example:
cat /sys/class/misc/mali0/device/power_policy
[demand] coarse_demand always_on
To switch to a different policy at runtime write the valid entry
name back to the attribute.
Example:
echo "coarse_demand" > /sys/class/misc/mali0/device/power_policy
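For scripting, the selected policy can be extracted from that output. The following is a small sketch that relies only on the bracket convention described above; the helper name current_power_policy is hypothetical.

```shell
# Hypothetical helper: print the entry enclosed in square brackets from the
# text read from the power_policy attribute.
current_power_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Using the sample output shown above:
current_power_policy "[demand] coarse_demand always_on"
# On a system with a mali0 device (assumed path):
# current_power_policy "$(cat /sys/class/misc/mali0/device/power_policy)"
```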
What: /sys/class/misc/mali%u/device/progress_timeout
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. This attribute
is used to set the progress timeout value and read the current
progress timeout value.
Progress timeout value is the maximum number of GPU cycles
without forward progress to allow to elapse before terminating a
GPU command queue group.
What: /sys/class/misc/mali%u/device/mcu_shader_pwroff_timeout
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. The duration value is
in microseconds and is used to configure the MCU shader core
power-off timer. The configured timer only takes effect when
the host driver has delegated shader core power management to
the MCU. The supplied value will be
recorded internally without any change. But the actual field
value will be subject to core power-off timer source frequency
scaling and maximum value limiting. The default source will be
SYSTEM_TIMESTAMP counter. But in case the platform is not able
to supply it, the GPU CYCLE_COUNTER source will be used as an
alternative.
If we set the value to zero then MCU-controlled shader/tiler
power management will be disabled.
What: /sys/class/misc/mali%u/device/mcu_shader_pwroff_timeout_ns
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. The duration value is
in nanoseconds and is used to configure the MCU shader core
power-off timer. The configured timer only takes effect when
the host driver has delegated shader core power management to
the MCU. The supplied value will be
recorded internally without any change. But the actual field
value will be subject to core power-off timer source frequency
scaling and maximum value limiting. The default source will be
SYSTEM_TIMESTAMP counter. But in case the platform is not able
to supply it, the GPU CYCLE_COUNTER source will be used as an
alternative.
If we set the value to zero then MCU-controlled shader/tiler
power management will be disabled.
What: /sys/class/misc/mali%u/device/csg_scheduling_period
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. The duration value is
in milliseconds and is used to configure the CSF scheduling
tick duration.
What: /sys/class/misc/mali%u/device/reset_timeout
Description:
This attribute is used to set the number of milliseconds to
wait for the soft stop to complete for the GPU jobs before
proceeding with the GPU reset.
What: /sys/class/misc/mali%u/device/soft_job_timeout
Description:
This attribute is available only with platform device that
supports a Job Manager based GPU. It is used to set the timeout
value for waiting for any soft event to complete.
What: /sys/class/misc/mali%u/device/scheduling/serialize_jobs
Description:
This attribute is available only with platform device that
supports a Job Manager based GPU.
Various options available under this are:
• none - for disabling serialization.
• intra-slot - Serialize atoms within a slot, only one
atom per job slot.
• inter-slot - Serialize atoms between slots, only one
job slot running at any time.
• full - a combination of both inter-slot and intra-slot,
so only one atom and one job slot running
at any time.
• full-reset - full serialization, and reset the GPU after
each atom completion.
These options are useful for debugging and investigating
failures and gpu hangs to narrow down atoms that could cause
troubles.
What: /sys/class/misc/mali%u/device/firmware_config/Compute iterator count/*
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. It is a read-only attribute
which indicates the maximum number of Compute iterators
supported by the GPU.
What: /sys/class/misc/mali%u/device/firmware_config/CSHWIF count/*
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. It is a read-only
attribute which indicates the maximum number of CSHWIFs
supported by the GPU.
What: /sys/class/misc/mali%u/device/firmware_config/Fragment iterator count/*
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. It is a read-only
attribute which indicates the maximum number of
Fragment iterators supported by the GPU.
What: /sys/class/misc/mali%u/device/firmware_config/Scoreboard set count/*
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. It is a read-only
attribute which indicates the maximum number of
Scoreboard sets supported by the GPU.
What: /sys/class/misc/mali%u/device/firmware_config/Tiler iterator count/*
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU. It is a read-only
attribute which indicates the maximum number of Tiler iterators
supported by the GPU.
What: /sys/class/misc/mali%u/device/firmware_config/Log verbosity/*
Description:
This attribute is available only with mali platform
device-driver that supports a CSF GPU.
Used to enable firmware logs. The valid range of logging levels
is indicated by the read-only 'min' and 'max' attributes.
The log level can be set using the read/write 'cur' attribute,
which accepts any valid log level within the range given by
min and max.
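A sketch of how a script might respect that range before writing 'cur' is given below. It assumes that 'min', 'max' and 'cur' hold plain decimal integers; the helper name set_fw_log_level is hypothetical. Note the quoting, since the directory name contains a space.

```shell
# Hypothetical helper: write a firmware log level to 'cur' only if it lies
# within the range advertised by the read-only 'min' and 'max' attributes.
# $1 = attribute directory, $2 = requested level.
set_fw_log_level() {
    dir=$1
    level=$2
    min=$(cat "$dir/min") || return 1
    max=$(cat "$dir/max") || return 1
    if [ "$level" -lt "$min" ] || [ "$level" -gt "$max" ]; then
        echo "level $level is outside the range $min..$max" >&2
        return 1
    fi
    echo "$level" > "$dir/cur"
}

# Assumed location on a system with a single Mali device:
# set_fw_log_level "/sys/class/misc/mali0/device/firmware_config/Log verbosity" 2
```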

@@ -0,0 +1,203 @@
/*
*
* (C) COPYRIGHT 2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU licence.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
What: /sys/bus/coresight/devices/mali-source-etm/enable_source
Description:
Attribute used to enable Coresight Source ETM.
What: /sys/bus/coresight/devices/mali-source-etm/is_enabled
Description:
Attribute used to check if Coresight Source ETM is enabled.
What: /sys/bus/coresight/devices/mali-source-etm/trcconfigr
Description:
Coresight Source ETM trace configuration to enable global
timestamping, and data value tracing.
What: /sys/bus/coresight/devices/mali-source-etm/trctraceidr
Description:
Coresight Source ETM trace ID.
What: /sys/bus/coresight/devices/mali-source-etm/trcvdarcctlr
Description:
Coresight Source ETM viewData include/exclude address
range comparators.
What: /sys/bus/coresight/devices/mali-source-etm/trcviiectlr
Description:
Coresight Source ETM viewInst include and exclude control.
What: /sys/bus/coresight/devices/mali-source-etm/trcstallctlr
Description:
Coresight Source ETM stall control register.
What: /sys/bus/coresight/devices/mali-source-itm/enable_source
Description:
Attribute used to enable Coresight Source ITM.
What: /sys/bus/coresight/devices/mali-source-itm/is_enabled
Description:
Attribute used to check if Coresight Source ITM is enabled.
What: /sys/bus/coresight/devices/mali-source-itm/dwt_ctrl
Description:
Coresight Source DWT configuration:
[0] = 1, enable cycle counter
[4:1] = 4, set PC sample rate of 256 cycles
[8:5] = 1, set initial post count value
[9] = 1, select position of post count tap on the cycle counter
[11:10] = 1, enable sync packets
[12] = 1, enable periodic PC sample packets
What: /sys/bus/coresight/devices/mali-source-itm/itm_tcr
Description:
Coresight Source ITM configuration:
[0] = 1, Enable ITM
[1] = 1, Enable Time stamp generation
[2] = 1, Enable sync packet transmission
[3] = 1, Enable HW event forwarding
[11:10] = 1, Generate TS request approx every 128 cycles
[22:16] = 1, Trace bus ID
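As a worked example, the field values listed for dwt_ctrl and itm_tcr can be combined into single register words by shifting each value to the lowest bit of its field. This is only a sketch of the arithmetic: the helper name field is hypothetical, and whether these exact words are the ones to write depends on the intended trace configuration.

```shell
# Hypothetical helper: place VALUE into a field whose lowest bit is LSB.
field() { echo $(( $1 << $2 )); }        # usage: field VALUE LSB

# dwt_ctrl fields, by lowest bit: bit 0 = 1, bit 1 = 4, bit 5 = 1,
# bit 9 = 1, bit 10 = 1, bit 12 = 1
dwt_ctrl=$(( $(field 1 0) | $(field 4 1) | $(field 1 5) | $(field 1 9) | $(field 1 10) | $(field 1 12) ))

# itm_tcr fields, by lowest bit: bits 0..3 = 1 each, bit 10 = 1, bit 16 = 1
itm_tcr=$(( $(field 1 0) | $(field 1 1) | $(field 1 2) | $(field 1 3) | $(field 1 10) | $(field 1 16) ))

printf 'dwt_ctrl=0x%x itm_tcr=0x%x\n' "$dwt_ctrl" "$itm_tcr"
# prints: dwt_ctrl=0x1629 itm_tcr=0x1040f
```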
What: /sys/bus/coresight/devices/mali-source-ela/reset_regs
Description:
Attribute used to reset registers to zero.
What: /sys/bus/coresight/devices/mali-source-ela/enable_source
Description:
Attribute used to enable Coresight Source ELA.
What: /sys/bus/coresight/devices/mali-source-ela/is_enabled
Description:
Attribute used to check if Coresight Source ELA is enabled.
What: /sys/bus/coresight/devices/mali-source-ela/regs/TIMECTRL
Description:
Coresight Source ELA TIMECTRL register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/TSSR
Description:
Coresight Source ELA TSSR register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/ATBCTRL
Description:
Coresight Source ELA ATBCTRL register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/PTACTION
Description:
Coresight Source ELA PTACTION register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/AUXCTRL
Description:
Coresight Source ELA AUXCTRL register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/CNTSEL
Description:
Coresight Source ELA CNTSEL register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/SIGSELn
Description:
Coresight Source ELA SIGSELn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/TRIGCTRLn
Description:
Coresight Source ELA TRIGCTRLn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/NEXTSTATEn
Description:
Coresight Source ELA NEXTSTATEn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/ACTIONn
Description:
Coresight Source ELA ACTIONn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/ALTNEXTSTATEn
Description:
Coresight Source ELA ALTNEXTSTATEn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/ALTACTIONn
Description:
Coresight Source ELA ALTACTIONn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/COMPCTRLn
Description:
Coresight Source ELA COMPCTRLn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/ALTCOMPCTRLn
Description:
Coresight Source ELA ALTCOMPCTRLn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/COUNTCOMPn
Description:
Coresight Source ELA COUNTCOMPn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/TWBSELn
Description:
Coresight Source ELA TWBSELn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/EXTMASKn
Description:
Coresight Source ELA EXTMASKn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/EXTCOMPn
Description:
Coresight Source ELA EXTCOMPn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/QUALMASKn
Description:
Coresight Source ELA QUALMASKn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/QUALCOMPn
Description:
Coresight Source ELA QUALCOMPn register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/SIGMASKn_0-7
Description:
Coresight Source ELA SIGMASKn_0-7 register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/SIGCOMPn_0-7
Description:
Coresight Source ELA SIGCOMPn_0-7 register set/get.
Refer to specification for more details.
What: /sys/bus/coresight/devices/mali-source-ela/regs/SIGSELn_0-7
Description:
Coresight Source ELA SIGSELn_0-7 register set/get.
Refer to specification for more details.

@@ -0,0 +1,173 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2022-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
DebugFS interface:
------------------
A new per-kbase-context debugfs file called csf_sync has been implemented.
It captures the current KCPU & GPU queue state of the not-yet-completed
operations and displays it through the debugfs file.
This file is at:
=======================================================
/sys/kernel/debug/mali0/ctx/<pid>_<context id>/csf_sync
=======================================================
Output Format:
----------------
The csf_sync file contains important data for the currently active queues.
This data is formatted into two segments, which are separated by a
pipe character: the common properties and the operation-specific properties.
Common Properties:
------------------
* Queue type: GPU or KCPU.
* kbase context id and the queue id.
* If the queue type is a GPU queue then the group handle is also noted,
in the middle of the other two IDs. The slot value is also dumped.
* Execution status, which can either be 'P' for pending or 'S' for started.
* Command type is then output which indicates the type of dependency
(i.e. wait or signal).
* Object address which is a pointer to the sync object that the
command operates on.
* The live value, which is the value of the synchronization object
at the time of dumping. This could help to determine why wait
operations might be blocked.
Operation-Specific Properties:
------------------------------
The operation-specific values for KCPU queue fence operations
are as follows: a unique timeline name, timeline context, and a fence
sequence number. The CQS WAIT and CQS SET are denoted in the sync dump
as their OPERATION counterparts, and therefore show the same
operation-specific values: the argument value to wait on or set to, and the
operation type, being (by definition) op:gt and op:set for CQS_WAIT and
CQS_SET respectively.
There are only two operation-specific values for operations in GPU queues
which are always shown: the argument value to wait on or set/add to,
and the operation type (set/add) or wait condition (e.g. LE, GT, GE).
Examples
--------
GPU Queue Example
------------------
The following output is of a GPU queue, from a process that has a KCTX ID of 52,
is in Queue Group (CSG) 0, and has Queue ID 0. It has started and is waiting on
the object at address 0x0000007f81ffc800. The live value is 0,
as is the arg value. However, the operation "op" is GT, indicating it's waiting
for the live value to surpass the arg value:
======================================================================================================================================
queue:GPU-52-0-0 exec:S cmd:SYNC_WAIT slot:4 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000
======================================================================================================================================
The following is an example of a GPU queue dump, where the SYNC SET operation
is blocked by the preceding SYNC WAIT operation. This shows two operations
from a GPU queue with a KCTX ID of 8, Queue Group (CSG) 0, and Queue ID 0.
The SYNC WAIT operation has started, while the SYNC SET is pending, blocked by
the SYNC WAIT. Both operations are on the same slot, 2, and have a live value
of 0. The SYNC WAIT
is waiting on the object at address 0x0000007f81ffc800, while the SYNC SET will
set the object at address 0x00000000a3bad4fb when it is unblocked.
The operation "op" is GT for the SYNC WAIT, indicating it's waiting for the
live value to surpass the arg value, while the operation and arg value for the
SYNC SET is "set" and "1" respectively:
======================================================================================================================================
queue:GPU-8-0-0 exec:S cmd:SYNC_WAIT slot:2 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000
queue:GPU-8-0-0 exec:P cmd:SYNC_SET slot:2 obj:0x00000000a3bad4fb live_value:0x0000000000000000 | op:set arg_value:0x0000000000000001
======================================================================================================================================
KCPU Queue Example
------------------
The following is an example of a KCPU queue, from a process that has
a KCTX ID of 0 and has Queue ID 1. It has started and is waiting on the
object at address 0x0000007fbf6f2ff8. The live value is currently 0 with
the "op" being GT indicating it is waiting on the live value to
surpass the arg value.
===============================================================================================================================
queue:KCPU-0-1 exec:S cmd:CQS_WAIT_OPERATION obj:0x0000007fbf6f2ff8 live_value:0x0000000000000000 | op:gt arg_value: 0x00000000
===============================================================================================================================
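The dump lines shown in the examples above are made of space-separated key:value tokens, so individual fields can be picked out with standard tools. The helper below is a minimal sketch based only on those sample lines; the name csf_sync_field is hypothetical, and the real output may contain tokens that this does not handle (for example, fence timeline names).

```shell
# Hypothetical helper: print the value of one "key:value" field from a
# csf_sync line. Usage: csf_sync_field "<line>" <key>
csf_sync_field() {
    # Normalise "key: value" to "key:value", drop the pipe separator,
    # put one token per line, then print the value of the requested key.
    printf '%s\n' "$1" |
        sed -e 's/: */:/g' -e 's/ | / /' |
        tr ' ' '\n' |
        sed -n "s/^$2://p"
}

line='queue:GPU-52-0-0 exec:S cmd:SYNC_WAIT slot:4 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000'
csf_sync_field "$line" exec    # prints: S
csf_sync_field "$line" op      # prints: gt
```

A script could use this to list, for instance, every started wait whose live value has not yet passed its arg value.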
CSF Sync State Dump For Fence Signal Timeouts
---------------------------------------------
Summary
-------
A timer has been added to the KCPU queues which is checked to ensure
the queues have not "timed out" between the enqueuing of a fence signal command
and its eventual execution. If this timeout happens then the CSF sync state
of all KCPU queues of the offending context is dumped. This feature is enabled
by default, but can be disabled/enabled later.
Explanation
------------
This new timer is created and destroyed alongside the creation and destruction
of each KCPU queue. It is started when a fence signal is enqueued, and cancelled
when the fence signal command has been processed. The timer times out after
10 seconds (at 100 MHz) if that fence signal command was never
processed. If this timeout occurs then the timer callback function identifies
the KCPU queue which the timer belongs to and invokes the CSF synchronisation
state dump mechanism, which dumps the sync state for the context of the queue
causing the timeout to dmesg.
Fence Timeouts Controls
-----------------------
Disable/Enable Feature
----------------------
This feature is enabled by default, but can be disabled/re-enabled via DebugFS
controls. The 'fence_signal_timeout_enable' debugfs entry is a global flag
which is written to in order to turn this feature on and off.
Example:
--------
when writing to fence_signal_timeout_enable entry:
echo 1 > /sys/kernel/debug/mali0/fence_signal_timeout_enable -> feature is enabled.
echo 0 > /sys/kernel/debug/mali0/fence_signal_timeout_enable -> feature is disabled.
It is also possible to read from this file to check whether the feature is
currently enabled, by checking the value returned by fence_signal_timeout_enable.
Example:
--------
when reading from fence_signal_timeout_enable entry, if:
cat /sys/kernel/debug/mali0/fence_signal_timeout_enable returns 1 -> feature is enabled.
cat /sys/kernel/debug/mali0/fence_signal_timeout_enable returns 0 -> feature is disabled.
Update Timer Duration
---------------------
The timeout duration can be accessed through the 'fence_signal_timeout_ms'
debugfs entry. This can be read from to retrieve the current timeout in
milliseconds.
Example:
--------
cat /sys/kernel/debug/mali0/fence_signal_timeout_ms
The 'fence_signal_timeout_ms' debugfs entry can also be written to, to update
the timeout in milliseconds.
Example:
--------
echo 10000 > /sys/kernel/debug/mali0/fence_signal_timeout_ms

@@ -0,0 +1,116 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
=====================================
ARM CoreSight Mali Source integration
=====================================
See Documentation/trace/coresight/coresight.rst for detailed information
about Coresight.
This documentation will cover Mali specific devicetree integration.
References to Sink ports are given as examples. Access to Sink is specific
to an implementation and would require dedicated kernel modules.
ARM Coresight Mali Source ITM
=============================
Required properties
-------------------
- compatible: Has to be "arm,coresight-mali-source-itm"
- gpu : phandle to a Mali GPU definition
- port:
- endpoint:
- remote-endpoint: phandle to a Coresight sink port
Example
-------
mali-source-itm {
compatible = "arm,coresight-mali-source-itm";
gpu = <&gpu>;
port {
mali_source_itm_out_port0: endpoint {
remote-endpoint = <&mali_sink_in_port0>;
};
};
};
ARM Coresight Mali Source ETM
=============================
Required properties
-------------------
- compatible: Has to be "arm,coresight-mali-source-etm"
- gpu : phandle to a Mali GPU definition
- port:
- endpoint:
- remote-endpoint: phandle to a Coresight sink port
Example
-------
mali-source-etm {
compatible = "arm,coresight-mali-source-etm";
gpu = <&gpu>;
port {
mali_source_etm_out_port0: endpoint {
remote-endpoint = <&mali_sink_in_port1>;
};
};
};
ARM Coresight Mali Source ELA
=============================
Required properties
-------------------
- compatible: Has to be "arm,coresight-mali-source-ela"
- gpu : phandle to a Mali GPU definition
- port:
- endpoint:
- remote-endpoint: phandle to a Coresight sink port
Example: Split JCN request/response channel
--------------------------------------------
This example applies to implementations with a total of 5 signal groups,
where JCN request and response are assigned to independent or shared
channels depending on the GPU model.
mali-source-ela {
compatible = "arm,coresight-mali-source-ela";
gpu = <&gpu>;
port {
mali_source_ela_out_port0: endpoint {
remote-endpoint = <&mali_sink_in_port2>;
};
};
};
SysFS Configuration
--------------------------------------------
The register values used by CoreSight for ELA can be configured using SysFS
interfaces. This implicitly includes configuring the ELA for independent or
shared JCN request and response channels.

View File

@@ -0,0 +1,244 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2013-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
* ARM Mali Midgard / Bifrost devices
Required properties:
- compatible : Should be mali<chip>, replacing digits with x from the back,
until malit<Major>xx, and it must end with one of: "arm,malit6xx" or
"arm,mali-midgard" or "arm,mali-bifrost"
- reg : Physical base address of the device and length of the register area.
- interrupts : Contains the three IRQ lines required by T-6xx devices
- interrupt-names : Contains the names of IRQ resources in the order they were
provided in the interrupts property. Must contain: "JOB", "MMU", "GPU".
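The "replace digits with x from the back" rule for the compatible property can be sketched in Python. This is an illustration only: the helper name compatible_strings is hypothetical and is not part of the driver or of any devicetree tooling, and the generic family string is passed in as an assumption by the caller.

```python
from typing import List


def compatible_strings(chip: str, family: str = "arm,mali-midgard") -> List[str]:
    """Build the compatible list for a chip name such as "t602".

    Hypothetical helper: trailing digits are replaced with "x" one at a
    time, stopping when only the major digit is left, and the generic
    family string is appended last.
    """
    prefix = chip.rstrip("0123456789")  # e.g. "t"
    digits = chip[len(prefix):]         # e.g. "602"
    entries = []
    for masked in range(len(digits)):
        kept = digits[: len(digits) - masked]
        entries.append("arm,mali" + prefix + kept + "x" * masked)
    entries.append(family)
    return entries


print(compatible_strings("t602"))
```

For "t602" this yields "arm,malit602", "arm,malit60x", "arm,malit6xx" and "arm,mali-midgard", which is the list used in the single-clock example of this binding.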
Optional:
- clocks : One or more pairs of phandle to clock and clock specifier
for the Mali device. The order is important: the first clock
shall correspond to the "clk_mali" source, while the second clock
(that is optional) shall correspond to the "shadercores" source.
- clock-names : Shall be set to: "clk_mali", "shadercores".
- mali-supply : Phandle to the top level regulator for the Mali device.
Refer to
Documentation/devicetree/bindings/regulator/regulator.txt for details.
- shadercores-supply : Phandle to shader cores regulator for the Mali device.
This is optional.
- operating-points-v2 : Refer to Documentation/devicetree/bindings/power/mali-opp.txt
for details.
- quirks-gpu : Used to write to the JM_CONFIG or CSF_CONFIG register.
Should be used with care. Options passed here are used to override
certain default behavior. Note: This will override 'idvs-group-size'
field in devicetree and module param 'corestack_driver_control',
therefore if 'quirks-gpu' is used then 'idvs-group-size' and
'corestack_driver_control' value should be incorporated into 'quirks-gpu'.
- quirks-sc : Used to write to the SHADER_CONFIG register.
Should be used with care. Options passed here are used to override
certain default behavior.
- quirks-tiler : Used to write to the TILER_CONFIG register.
Should be used with care. Options passed here are used to
disable or override certain default behavior.
- quirks-mmu : Used to write to the L2_CONFIG register.
Should be used with care. Options passed here are used to
disable or override certain default behavior.
- power-model : Sets the power model parameters. Defined power models include:
"mali-simple-power-model", "mali-g51-power-model", "mali-g52-power-model",
"mali-g52_r1-power-model", "mali-g71-power-model", "mali-g72-power-model",
"mali-g76-power-model", "mali-g77-power-model", "mali-tnax-power-model",
"mali-tbex-power-model" and "mali-tbax-power-model".
- mali-simple-power-model: this model derives the GPU power usage based
on the GPU voltage scaled by the system temperature. Note: it was
designed for the Juno platform, and may not be suitable for others.
- compatible: Should be "arm,mali-simple-power-model"
- dynamic-coefficient: Coefficient, in pW/(Hz V^2), which is
multiplied by v^2*f to calculate the dynamic power consumption.
- static-coefficient: Coefficient, in uW/V^3, which is
multiplied by v^3 to calculate the static power consumption.
- ts: An array containing coefficients for the temperature
scaling factor. This is used to scale the static power by a
factor of tsf/1000000,
where tsf = ts[3]*T^3 + ts[2]*T^2 + ts[1]*T + ts[0],
and T = temperature in degrees.
- thermal-zone: A string identifying the thermal zone used for
the GPU
- temp-poll-interval-ms: the interval at which the system
temperature is polled
- mali-g*-power-model(s): unless stated otherwise, these models derive
the GPU power usage based on performance counters, so they are more
accurate.
- compatible: Should be, as examples, "arm,mali-g51-power-model" /
"arm,mali-g72-power-model".
- scale: the dynamic power calculated by the power model is
multiplied by a factor of 'scale'. This value should be
chosen to match a particular implementation.
- min_sample_cycles: Fall back to the simple power model if the
number of GPU cycles for a given counter dump is less than
'min_sample_cycles'. The default value of this should suffice.
* Note: when IPA is used, two separate power models (simple and counter-based)
are used at different points so care should be taken to configure
both power models in the device tree (specifically dynamic-coefficient,
static-coefficient and scale) to best match the platform.
- power-policy : Sets the GPU power policy at probe time. Available options are
"coarse_demand" and "always_on". If not set, then "coarse_demand" is used.
- system-coherency : Sets the coherency protocol to be used for coherent
accesses made from the GPU.
If not set then no coherency is used.
- 0 : ACE-Lite
- 1 : ACE
- 31 : No coherency
- ipa-model : Sets the IPA model to be used for power management. GPU probe will fail if the
model is not found in the registered models list. If no model is specified here,
a gpu-id based model is picked if available, otherwise the default model is used.
- mali-simple-power-model: Default model used on mali
- idvs-group-size : Override the IDVS group size value. Tasks are sent to
cores in groups of N + 1, e.g. 0xF means 16 tasks.
Valid values are from 0 to 0x3F (inclusive).
- l2-size : Override L2 cache size on GPUs that support it
- l2-hash : Override L2 hash function on GPUs that support it
- l2-hash-values : Override L2 hash function using provided hash values, on GPUs that support it.
It is mutually exclusive with 'l2-hash'. Only one of the two may be
used on a supported GPU.
- arbiter-if : Phandle to the arbif platform device, used to provide KBASE with an interface
to the Arbiter. This is required when using arbitration; setting to a non-NULL
value will enable arbitration.
If arbitration is in use, then there should be no external GPU control.
When arbiter-if is in use then the following must not be present:
- power-model (no IPA allowed with arbitration)
- #cooling-cells
- operating-points-v2 (no dvfs in kbase with arbitration)
- system-coherency with a value of 1 (no full coherency with arbitration)
- int-id-override: list of <ID Setting[7:0]> tuples defining the IDs needed to be
set and the setting corresponding to the SYSC_ALLOC register.
- propagate-bits: Used to write to L2_CONFIG.PBHA_HWU. This bitset establishes which
PBHA bits are propagated on the AXI bus.
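The arithmetic described for mali-simple-power-model can be worked through in Python. This is a minimal sketch, assuming T is in degrees Celsius, the voltage is in volts and the frequency is in Hz. The function names are hypothetical, and the real driver may use integer arithmetic, clamping or different rounding. The coefficients are the ones from the single-clock example in this binding.

```python
def temperature_scaling_factor(ts, temp_c):
    """tsf = ts[3]*T^3 + ts[2]*T^2 + ts[1]*T + ts[0]."""
    return ts[3] * temp_c**3 + ts[2] * temp_c**2 + ts[1] * temp_c + ts[0]


def static_power_uw(static_coefficient, volts, ts, temp_c):
    """Static power in uW: coefficient (uW/V^3) * V^3, scaled by tsf/1000000."""
    tsf = temperature_scaling_factor(ts, temp_c)
    return static_coefficient * volts**3 * tsf / 1_000_000


def dynamic_power_pw(dynamic_coefficient, volts, freq_hz):
    """Dynamic power in pW: coefficient (pW/(Hz V^2)) * V^2 * f."""
    return dynamic_coefficient * volts**2 * freq_hz


# Values taken from the example node: ts = <20000 2000 (-20) 2>
ts = [20000, 2000, -20, 2]
print(temperature_scaling_factor(ts, 50))
print(static_power_uw(2427750, 1.0, ts, 50))
print(dynamic_power_pw(4687, 1.0, 533_000_000))
```

With these coefficients the scaling factor at 50 degrees is 320000, so the static term is multiplied by 0.32.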
Example for a Mali GPU with 1 clock and 1 regulator:
gpu@0xfc010000 {
compatible = "arm,malit602", "arm,malit60x", "arm,malit6xx", "arm,mali-midgard";
reg = <0xfc010000 0x4000>;
interrupts = <0 36 4>, <0 37 4>, <0 38 4>;
interrupt-names = "JOB", "MMU", "GPU";
clocks = <&pclk_mali>;
clock-names = "clk_mali";
mali-supply = <&vdd_mali>;
operating-points-v2 = <&gpu_opp_table>;
power_model@0 {
compatible = "arm,mali-simple-power-model";
static-coefficient = <2427750>;
dynamic-coefficient = <4687>;
ts = <20000 2000 (-20) 2>;
thermal-zone = "gpu";
};
power_model@1 {
compatible = "arm,mali-g71-power-model";
scale = <5>;
};
idvs-group-size = <0x7>;
l2-size = /bits/ 8 <0x10>;
l2-hash = /bits/ 8 <0x04>; /* or l2-hash-values = <0x12345678 0x8765 0xAB>; */
};
gpu_opp_table: opp_table0 {
compatible = "operating-points-v2";
opp@533000000 {
opp-hz = /bits/ 64 <533000000>;
opp-microvolt = <1250000>;
};
opp@450000000 {
opp-hz = /bits/ 64 <450000000>;
opp-microvolt = <1150000>;
};
opp@400000000 {
opp-hz = /bits/ 64 <400000000>;
opp-microvolt = <1125000>;
};
opp@350000000 {
opp-hz = /bits/ 64 <350000000>;
opp-microvolt = <1075000>;
};
opp@266000000 {
opp-hz = /bits/ 64 <266000000>;
opp-microvolt = <1025000>;
};
opp@160000000 {
opp-hz = /bits/ 64 <160000000>;
opp-microvolt = <925000>;
};
opp@100000000 {
opp-hz = /bits/ 64 <100000000>;
opp-microvolt = <912500>;
};
};
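The example node above sets idvs-group-size = <0x7>. The "groups of N + 1" rule and the 0 to 0x3F range can be expressed as a small check. This is a sketch only; the function name is hypothetical and is not taken from the driver.

```python
def idvs_tasks_per_group(group_size):
    """Return the number of tasks sent per group for an idvs-group-size value."""
    if not 0 <= group_size <= 0x3F:
        raise ValueError("idvs-group-size must be in the range 0..0x3F")
    # Tasks are sent to cores in groups of N + 1.
    return group_size + 1


print(idvs_tasks_per_group(0x7))
```

A value of 0x7 therefore corresponds to groups of 8 tasks, and the documented 0xF case to 16.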
Example for a Mali GPU with 2 clocks and 2 regulators:
gpu: gpu@6e000000 {
compatible = "arm,mali-midgard";
reg = <0x0 0x6e000000 0x0 0x200000>;
interrupts = <0 168 4>, <0 168 4>, <0 168 4>;
interrupt-names = "JOB", "MMU", "GPU";
clocks = <&clk_mali 0>, <&clk_mali 1>;
clock-names = "clk_mali", "shadercores";
mali-supply = <&supply0_3v3>;
shadercores-supply = <&supply1_3v3>;
system-coherency = <31>;
operating-points-v2 = <&gpu_opp_table>;
};
gpu_opp_table: opp_table0 {
compatible = "operating-points-v2", "operating-points-v2-mali";
opp@0 {
opp-hz = /bits/ 64 <50000000>;
opp-hz-real = /bits/ 64 <50000000>, /bits/ 64 <45000000>;
opp-microvolt = <820000>, <800000>;
opp-core-mask = /bits/ 64 <0xf>;
};
opp@1 {
opp-hz = /bits/ 64 <40000000>;
opp-hz-real = /bits/ 64 <40000000>, /bits/ 64 <35000000>;
opp-microvolt = <720000>, <700000>;
opp-core-mask = /bits/ 64 <0x7>;
};
opp@2 {
opp-hz = /bits/ 64 <30000000>;
opp-hz-real = /bits/ 64 <30000000>, /bits/ 64 <25000000>;
opp-microvolt = <620000>, <700000>;
opp-core-mask = /bits/ 64 <0x3>;
};
};
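The example above uses system-coherency = <31>. The documented values can be decoded with a small lookup. This is a sketch; the names are hypothetical, and treating a missing property as "No coherency" follows the statement in the property description that no coherency is used when the property is not set.

```python
# Values as listed for the system-coherency property.
SYSTEM_COHERENCY = {0: "ACE-Lite", 1: "ACE", 31: "No coherency"}


def describe_system_coherency(value=None):
    """Map a system-coherency value to its protocol name."""
    if value is None:
        # Property absent: no coherency is used.
        return "No coherency"
    if value not in SYSTEM_COHERENCY:
        raise ValueError("unsupported system-coherency value: %r" % (value,))
    return SYSTEM_COHERENCY[value]


print(describe_system_coherency(31))
```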
Example for a Mali GPU supporting PBHA configuration via DTB (default):
gpu@0xfc010000 {
...
pbha {
int-id-override = <2 0x32>, <9 0x05>, <16 0x32>;
propagate-bits = /bits/ 4 <0x03>;
};
...
};
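The arbiter-if restrictions listed among the optional properties can be checked mechanically. The sketch below assumes the GPU node has already been parsed into a plain Python dict of property names to values. That representation and the function name are hypothetical and do not correspond to any real devicetree library.

```python
def arbiter_conflicts(props):
    """Return the property names that must not be combined with arbiter-if.

    `props` is a dict of GPU node properties (hypothetical representation).
    An empty list means no conflict was found.
    """
    if "arbiter-if" not in props:
        return []
    forbidden = ("power-model", "#cooling-cells", "operating-points-v2")
    conflicts = [name for name in forbidden if name in props]
    # Only full coherency (value 1) is excluded with arbitration.
    if props.get("system-coherency") == 1:
        conflicts.append("system-coherency")
    return conflicts


node = {"arbiter-if": "&arbif", "system-coherency": 1, "#cooling-cells": 2}
print(arbiter_conflicts(node))
```

A node that does not set arbiter-if is never reported, since the restrictions only apply when arbitration is in use.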

View File

@@ -0,0 +1,48 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
* Arm memory group manager for Mali GPU device drivers
Required properties:
- compatible: Must be "arm,physical-memory-group-manager"
An example node:
gpu_physical_memory_group_manager: physical-memory-group-manager {
compatible = "arm,physical-memory-group-manager";
};
It must be referenced by the GPU as well, see physical-memory-group-manager:
gpu: gpu@0x6e000000 {
compatible = "arm,mali-midgard";
reg = <0x0 0x6e000000 0x0 0x200000>;
interrupts = <0 168 4>, <0 168 4>, <0 168 4>;
interrupt-names = "JOB", "MMU", "GPU";
clocks = <&scpi_dvfs 2>;
clock-names = "clk_mali";
system-coherency = <31>;
physical-memory-group-manager = <&gpu_physical_memory_group_manager>;
operating-points = <
/* KHz uV */
50000 820000
>;
};

View File

@@ -0,0 +1,48 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
* Arm priority control manager for Mali GPU device drivers
Required properties:
- compatible: Must be "arm,priority-control-manager"
An example node:
gpu_priority_control_manager: priority-control-manager {
compatible = "arm,priority-control-manager";
};
It must be referenced by the GPU as well, see priority-control-manager:
gpu: gpu@0x6e000000 {
compatible = "arm,mali-midgard";
reg = <0x0 0x6e000000 0x0 0x200000>;
interrupts = <0 168 4>, <0 168 4>, <0 168 4>;
interrupt-names = "JOB", "MMU", "GPU";
clocks = <&scpi_dvfs 2>;
clock-names = "clk_mali";
system-coherency = <31>;
priority-control-manager = <&gpu_priority_control_manager>;
operating-points = <
/* KHz uV */
50000 820000
>;
};

View File

@@ -0,0 +1,68 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
* Arm protected memory allocator for Mali GPU device drivers
Required properties:
- compatible: Must be "arm,protected-memory-allocator"
The protected memory allocator manages allocation of physical pages of a
reserved memory region of protected memory, therefore its device node shall
reference a reserved memory region.
In addition to that, the protected memory allocator shall be referenced
by the GPU.
A complete example configuration for the device tree:
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;
mali_protected: mali_protected@c0000000 {
compatible = "mali-reserved";
reg = <0x0 0xc0000000 0x0 0x1000000>;
};
};
gpu_protected_memory_allocator: protected-memory-allocator {
compatible = "arm,protected-memory-allocator";
memory-region = <&mali_protected>;
};
gpu_fpga: gpu@0x6e000000 {
compatible = "arm,mali-midgard";
reg = <0x0 0x6e000000 0x0 0x200000>;
interrupts = <0 168 4>, <0 168 4>, <0 168 4>;
interrupt-names = "JOB", "MMU", "GPU";
clocks = <&scpi_dvfs 2>;
clock-names = "clk_mali";
protected-memory-allocator = <&gpu_protected_memory_allocator>;
operating-points = <
/* KHz uV */
50000 820000
>;
};
The protected memory allocator is gpu_protected_memory_allocator.
It references the mali_protected reserved memory region and, in turn,
it is referenced by the GPU as protected-memory-allocator.
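The reg property of the mali_protected region can be decoded by hand to see how much protected memory the allocator manages. This is a minimal sketch, assuming 32-bit cells combined most-significant first, as implied by #address-cells = <2> and #size-cells = <2>, and a 4 KiB page size. The page size and the helper name are assumptions, not values taken from the driver.

```python
def decode_reg(cells, address_cells=2, size_cells=2):
    """Combine 32-bit devicetree cells into a (base, size) pair."""
    if len(cells) != address_cells + size_cells:
        raise ValueError("unexpected number of cells in reg")

    def join(parts):
        value = 0
        for cell in parts:
            value = (value << 32) | cell
        return value

    return join(cells[:address_cells]), join(cells[address_cells:])


PAGE_SIZE = 4096  # assumption: 4 KiB pages

# reg = <0x0 0xc0000000 0x0 0x1000000>
base, size = decode_reg([0x0, 0xC0000000, 0x0, 0x1000000])
print(hex(base), size, size // PAGE_SIZE)
```

Under these assumptions the region starts at 0xc0000000 and spans 16 MiB, or 4096 pages of 4 KiB.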

View File

@@ -0,0 +1,201 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2017, 2019-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
* ARM Mali Midgard OPP
* OPP Table Node
This describes the OPPs belonging to a device. This node can have following
properties:
Required properties:
- compatible: Allow OPPs to express their compatibility. It should be:
"operating-points-v2", "operating-points-v2-mali".
- OPP nodes: One or more OPP nodes describing voltage-current-frequency
combinations. Their name isn't significant but their phandle can be used to
reference an OPP.
* OPP Node
This defines voltage-current-frequency combinations along with other related
properties.
Required properties:
- opp-hz: Nominal frequency in Hz, expressed as a 64-bit big-endian integer.
This should be treated as a relative performance measurement, taking both GPU
frequency and core mask into account.
Optional properties:
- opp-hz-real: List of one or two real frequencies in Hz, expressed as 64-bit
big-endian integers. They shall correspond to the clocks declared under
the Mali device node, and follow the same order.
- opp-core-mask: Shader core mask. If neither this nor opp-core-count is present
then all shader cores will be used for this OPP.
- opp-core-count: Number of cores to use for this OPP. If this is present then
the driver will build a core mask using the available core mask provided by
the GPU hardware. An opp-core-count value of 0 is not permitted.
If neither this nor opp-core-mask is present then all shader cores will be
used for this OPP.
If both this and opp-core-mask are present then opp-core-mask is ignored.
- opp-microvolt: List of one or two voltages in micro Volts. They shall correspond
to the regulators declared under the Mali device node, and follow the order:
"toplevel", "shadercores".
A single regulator's voltage is specified with an array of size one or three.
Single entry is for target voltage and three entries are for <target min max>
voltages.
Entries for multiple regulators must be present in the same order as
regulators are specified in device's DT node.
- opp-microvolt-<name>: Named opp-microvolt property. This works the same way as
the above opp-microvolt property, but allows multiple voltage ranges to be
provided for the same OPP. At runtime, the platform can pick a <name> and
matching opp-microvolt-<name> property will be enabled for all OPPs. If the
platform doesn't pick a specific <name> or the <name> doesn't match with any
opp-microvolt-<name> properties, then opp-microvolt property shall be used, if
present.
- opp-microamp: The maximum current drawn by the device in microamperes
considering system specific parameters (such as transients, process, aging,
maximum operating temperature range etc.) as necessary. This may be used to
set the most efficient regulator operating mode.
Should only be set if opp-microvolt is set for the OPP.
Entries for multiple regulators must be present in the same order as
regulators are specified in device's DT node. If this property isn't required
for some regulators, then this should be marked as zero for them. If it isn't
required for any regulator, then this property need not be present.
- opp-microamp-<name>: Named opp-microamp property. Similar to
opp-microvolt-<name> property, but for microamp instead.
- clock-latency-ns: Specifies the maximum possible transition latency (in
nanoseconds) for switching to this OPP from any other OPP.
- turbo-mode: Marks the OPP to be used only for turbo modes. Turbo mode is
available on some platforms, where the device can run over its operating
frequency for a short duration of time limited by the device's power, current
and thermal limits.
- opp-suspend: Marks the OPP to be used during device suspend. Only one OPP in
the table should have this.
- opp-mali-errata-1485982: Marks the OPP to be selected for suspend clock.
This will be effective only if MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE is
enabled. It needs to be placed in any OPP that has proper suspend clock for
the HW workaround.
- opp-supported-hw: This enables us to select only a subset of OPPs from the
larger OPP table, based on what version of the hardware we are running on. We
still can't have multiple nodes with the same opp-hz value in OPP table.
It's a user-defined array containing a hierarchy of hardware version numbers,
supported by the OPP. For example: a platform with hierarchy of three levels
of versions (A, B and C), this field should be like <X Y Z>, where X
corresponds to Version hierarchy A, Y corresponds to version hierarchy B and Z
corresponds to version hierarchy C.
Each level of hierarchy is represented by a 32 bit value, and so there can be
only 32 different supported versions per hierarchy, i.e. 1 bit per version. A
value of 0xFFFFFFFF will enable the OPP for all versions for that hierarchy
level. And a value of 0x00000000 will disable the OPP completely, and so we
never want that to happen.
If 32 values aren't sufficient for a version hierarchy, then that version
hierarchy can be contained in multiple 32 bit values. i.e. <X Y Z1 Z2> in the
above example, Z1 & Z2 refer to the version hierarchy Z.
- status: Marks the node enabled/disabled.
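The precedence between opp-core-mask and opp-core-count described above can be sketched as follows. The text only says that the driver builds a mask from the available cores; picking the lowest-numbered present cores is an assumption made for this illustration, and the function name is hypothetical.

```python
def opp_core_mask(shader_present, core_mask=None, core_count=None):
    """Resolve the shader core mask used for one OPP.

    shader_present is the mask of cores reported by the hardware.
    """
    if core_count is not None:
        # opp-core-count wins over opp-core-mask when both are given.
        if core_count == 0:
            raise ValueError("an opp-core-count of 0 is not permitted")
        mask = 0
        remaining = shader_present
        for _ in range(core_count):
            if remaining == 0:
                raise ValueError("opp-core-count exceeds the cores present")
            lowest = remaining & -remaining  # isolate the lowest set bit
            mask |= lowest
            remaining &= ~lowest
        return mask
    if core_mask is not None:
        return core_mask
    # Neither property present: use all shader cores.
    return shader_present


print(hex(opp_core_mask(0xF, core_count=2)))
```

With four cores present (0xf), a count of 2 gives 0x3 under this assumption, and a mask given alongside a count is ignored.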
Example for a Juno with 1 clock and 1 regulator:
gpu_opp_table: opp_table0 {
compatible = "operating-points-v2", "operating-points-v2-mali";
opp@112500000 {
opp-hz = /bits/ 64 <112500000>;
opp-hz-real = /bits/ 64 <450000000>;
opp-microvolt = <820000>;
opp-core-mask = /bits/ 64 <0x1>;
opp-suspend;
opp-mali-errata-1485982;
};
opp@225000000 {
opp-hz = /bits/ 64 <225000000>;
opp-hz-real = /bits/ 64 <450000000>;
opp-microvolt = <820000>;
opp-core-count = <2>;
};
opp@450000000 {
opp-hz = /bits/ 64 <450000000>;
opp-hz-real = /bits/ 64 <450000000>;
opp-microvolt = <820000>;
opp-core-mask = /bits/ 64 <0xf>;
};
opp@487500000 {
opp-hz = /bits/ 64 <487500000>;
opp-microvolt = <825000>;
};
opp@525000000 {
opp-hz = /bits/ 64 <525000000>;
opp-microvolt = <850000>;
};
opp@562500000 {
opp-hz = /bits/ 64 <562500000>;
opp-microvolt = <875000>;
};
opp@600000000 {
opp-hz = /bits/ 64 <600000000>;
opp-microvolt = <900000>;
};
};
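The first OPPs in the example above use a nominal opp-hz that is lower than opp-hz-real, which matches the description of opp-hz as a relative performance measure. One reading that reproduces those numbers is to scale the real frequency by the fraction of active shader cores. That formula is inferred from the example values only, the four-core mask 0xf is taken from the example, and the function name is hypothetical.

```python
def nominal_opp_hz(real_hz, core_mask, shader_present):
    """Scale the real frequency by the share of shader cores in use."""
    total = bin(shader_present).count("1")
    if total == 0:
        raise ValueError("no shader cores present")
    active = bin(core_mask & shader_present).count("1")
    return real_hz * active // total


print(nominal_opp_hz(450_000_000, 0x1, 0xF))
print(nominal_opp_hz(450_000_000, 0x3, 0xF))
```

One core of four at 450 MHz gives 112500000 and two cores give 225000000, the opp-hz values of the first two OPPs in the example.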
Example for a Juno with 2 clocks and 2 regulators:
gpu_opp_table: opp_table0 {
compatible = "operating-points-v2", "operating-points-v2-mali";
opp@0 {
opp-hz = /bits/ 64 <50000000>;
opp-hz-real = /bits/ 64 <50000000>, /bits/ 64 <45000000>;
opp-microvolt = <820000>, <800000>;
opp-core-mask = /bits/ 64 <0xf>;
};
opp@1 {
opp-hz = /bits/ 64 <40000000>;
opp-hz-real = /bits/ 64 <40000000>, /bits/ 64 <35000000>;
opp-microvolt = <720000>, <700000>;
opp-core-mask = /bits/ 64 <0x7>;
};
opp@2 {
opp-hz = /bits/ 64 <30000000>;
opp-hz-real = /bits/ 64 <30000000>, /bits/ 64 <25000000>;
opp-microvolt = <620000>, <700000>;
opp-core-mask = /bits/ 64 <0x3>;
};
};
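The opp-supported-hw matching described in the OPP node properties can be illustrated with a bitmask test. This sketch assumes the platform describes the running hardware as one 32-bit mask per hierarchy level and that an OPP is usable only if every level shares at least one bit. The function name is hypothetical, and the case where one level spans several 32-bit words (<X Y Z1 Z2>) is not handled.

```python
def opp_is_supported(opp_supported_hw, platform_versions):
    """Return True if the OPP matches the platform at every hierarchy level."""
    if len(opp_supported_hw) != len(platform_versions):
        raise ValueError("hierarchy depth mismatch")
    return all(
        (opp_bits & platform_bits) != 0
        for opp_bits, platform_bits in zip(opp_supported_hw, platform_versions)
    )


# OPP enabled for every version at level A, version 0 at B, version 2 at C.
print(opp_is_supported([0xFFFFFFFF, 0x1, 0x4], [0x2, 0x1, 0x4]))
```

A level set to 0x00000000 can never match, which is why the text warns that such a value disables the OPP completely.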

View File

@@ -0,0 +1,42 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2013, 2020-2022 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
=====================
dma-buf-test-exporter
=====================
Overview
--------
The dma-buf-test-exporter is a simple exporter of dma_buf objects.
It has a private API to allocate and manipulate the buffers which are represented as dma_buf fds.
The private API allows:
* simple allocation of physically non-contiguous buffers
* simple allocation of physically contiguous buffers
* query kernel side API usage stats (number of attachments, number of mappings, mmaps)
* failure mode configuration (fail attach, mapping, mmap)
* kernel side memset of buffers
The buffers support all of the dma_buf API, including mmap.
It supports being compiled as a module both in-tree and out-of-tree.
See include/uapi/base/arm/dma_buf_test_exporter/dma-buf-test-exporter.h for the ioctl interface.
See Documentation/dma-buf-sharing.txt for details on dma_buf.

View File

@@ -2475,19 +2475,26 @@
};
};
mali: gpu@ffe40000 {
compatible = "amlogic,meson-g12a-mali", "arm,mali-bifrost";
reg = <0x0 0xffe40000 0x0 0x40000>;
interrupt-parent = <&gic>;
interrupts = <GIC_SPI 162 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 161 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 160 IRQ_TYPE_LEVEL_HIGH>;
interrupt-names = "job", "mmu", "gpu";
clocks = <&clkc CLKID_MALI>;
resets = <&reset RESET_DVALIN_CAPB3>, <&reset RESET_DVALIN>;
operating-points-v2 = <&gpu_opp_table>;
#cooling-cells = <2>;
};
mali: gpu@ffe40000 {
compatible = "arm,mali-midgard";
reg = <0x0 0xffe40000 0x0 0x40000>,
<0 0xFFD01000 0 0x01000>,
<0 0xFF800000 0 0x01000>,
<0 0xFF63c000 0 0x01000>,
<0 0xFFD01000 0 0x01000>;
interrupt-parent = <&gic>;
interrupts = <GIC_SPI 160 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 161 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 162 IRQ_TYPE_LEVEL_HIGH>;
interrupt-names = "GPU", "MMU", "JOB";
clocks = <&clkc CLKID_MALI>;
clock-names = "clk_mali";
resets = <&reset RESET_DVALIN_CAPB3>, <&reset RESET_DVALIN>;
operating-points-v2 = <&gpu_opp_table>;
#cooling-cells = <2>;
};
};
thermal-zones {

View File

@@ -139,7 +139,8 @@
};
&mali {
dma-coherent;
system-coherency = <0>;
power_policy = "always_on";
};
&pmu {

View File

@@ -27,12 +27,12 @@ CONFIG_THREAD_INFO_IN_TASK=y
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
# CONFIG_WERROR is not set
CONFIG_WERROR=y
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_BUILD_SALT=""
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="@DEVICENAME@"
CONFIG_DEFAULT_HOSTNAME="odroid"
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_SYSVIPC_COMPAT=y
@@ -205,7 +205,7 @@ CONFIG_INITRAMFS_PRESERVE_MTIME=y
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_LD_ORPHAN_WARN=y
CONFIG_LD_ORPHAN_WARN_LEVEL="warn"
CONFIG_LD_ORPHAN_WARN_LEVEL="error"
CONFIG_SYSCTL=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
@@ -6505,6 +6505,30 @@ CONFIG_PM_OPP=y
# CONFIG_PECI is not set
# CONFIG_HTE is not set
# CONFIG_CDX_BUS is not set
#
# ARM GPU Configuration
#
CONFIG_MALI_MIDGARD=m
CONFIG_MALI_PLATFORM_NAME="meson"
CONFIG_MALI_REAL_HW=y
#
# Platform specific options
#
# end of Platform specific options
# CONFIG_MALI_CSF_SUPPORT is not set
CONFIG_MALI_DEVFREQ=y
CONFIG_MALI_GATOR_SUPPORT=y
# CONFIG_MALI_MIDGARD_ENABLE_TRACE is not set
# CONFIG_MALI_ARBITER_SUPPORT is not set
# CONFIG_MALI_DMA_BUF_MAP_ON_DEMAND is not set
# CONFIG_MALI_DMA_BUF_LEGACY_COMPAT is not set
# CONFIG_MALI_EXPERT is not set
# CONFIG_MALI_ARBITRATION is not set
CONFIG_MALI_TRACE_POWER_GPU_WORK_PERIOD=y
# end of ARM GPU Configuration
# end of Device Drivers
#

View File

@@ -242,5 +242,6 @@ source "drivers/peci/Kconfig"
source "drivers/hte/Kconfig"
source "drivers/cdx/Kconfig"
source "drivers/gpu/arm/Kconfig"
endmenu

View File

@@ -0,0 +1,57 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
load(
"//build/kernel/kleaf:kernel.bzl",
"kernel_module",
)
filegroup(
name = "base_kconfig",
srcs = glob([
"**/*Kconfig",
]),
visibility = [
"//common:__pkg__",
"//common-modules/mali:__subpackages__",
],
)
_base_modules = []
kernel_module(
name = "base",
srcs = glob([
"**/*.c",
"**/*.h",
"**/*.S",
"**/*Kbuild",
"**/*Makefile",
]) + [
"//common:kernel_headers",
"//common-modules/mali:headers",
],
outs = _base_modules,
kernel_build = "//common:kernel_aarch64",
visibility = [
"//common:__pkg__",
"//common-modules/mali:__subpackages__",
],
)

35
drivers/base/arm/Kbuild Normal file
View File

@@ -0,0 +1,35 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2021-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
#
# ccflags
#
src:=$(if $(patsubst /%,,$(src)),$(srctree)/$(src),$(src))
ccflags-y += -I$(src)/../../../include
subdir-ccflags-y += $(ccflags-y)
#
# Kernel modules
#
obj-$(CONFIG_DMA_SHARED_BUFFER_TEST_EXPORTER) += dma_buf_test_exporter/
obj-$(CONFIG_MALI_MEMORY_GROUP_MANAGER) += memory_group_manager/
obj-$(CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR) += protected_memory_allocator/

64
drivers/base/arm/Kconfig Normal file
View File

@@ -0,0 +1,64 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2021-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
menuconfig MALI_BASE_MODULES
bool "Mali Base extra modules"
default n
help
Enable this option to build support for the Arm Mali base modules.
Those modules provide extra features or debug interfaces and
are optional for the use of the Mali GPU modules.
config DMA_SHARED_BUFFER_TEST_EXPORTER
bool "Build dma-buf framework test exporter module"
depends on MALI_BASE_MODULES && DMA_SHARED_BUFFER
default y
help
This option will build the dma-buf framework test exporter module.
Usable to help test importers.
Modules:
- dma-buf-test-exporter.ko
config MALI_MEMORY_GROUP_MANAGER
bool "Build Mali Memory Group Manager module"
depends on MALI_BASE_MODULES
default y
help
This option will build the memory group manager module.
This is an example implementation for allocation and release of pages
for memory pools managed by Mali GPU device drivers.
Modules:
- memory_group_manager.ko
config MALI_PROTECTED_MEMORY_ALLOCATOR
bool "Build Mali Protected Memory Allocator module"
depends on MALI_BASE_MODULES && MALI_CSF_SUPPORT
default y
help
This option will build the protected memory allocator module.
This is an example implementation for allocation and release of pages
of secure memory intended to be used by the firmware
of Mali GPU device drivers.
Modules:
- protected_memory_allocator.ko

156
drivers/base/arm/Makefile Normal file
View File

@@ -0,0 +1,156 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2021-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
#
# Paths
#
KERNEL_SRC ?= /lib/modules/$(shell uname -r)/build
KDIR ?= $(KERNEL_SRC)
M ?= $(shell pwd)
ifeq ($(KDIR),)
$(error Must specify KDIR to point to the kernel to target)
endif
CONFIGS :=
ifeq ($(MALI_KCONFIG_EXT_PREFIX),)
#
# Default configuration values
#
CONFIG_MALI_BASE_MODULES ?= n
ifeq ($(CONFIG_MALI_BASE_MODULES),y)
CONFIG_MALI_CSF_SUPPORT ?= n
ifneq ($(CONFIG_DMA_SHARED_BUFFER),n)
CONFIG_DMA_SHARED_BUFFER_TEST_EXPORTER ?= y
else
# Prevent misuse when CONFIG_DMA_SHARED_BUFFER=n
CONFIG_DMA_SHARED_BUFFER_TEST_EXPORTER = n
endif
CONFIG_MALI_MEMORY_GROUP_MANAGER ?= y
ifneq ($(CONFIG_MALI_CSF_SUPPORT), n)
CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR ?= y
endif
else
# Prevent misuse when CONFIG_MALI_BASE_MODULES=n
CONFIG_DMA_SHARED_BUFFER_TEST_EXPORTER = n
CONFIG_MALI_MEMORY_GROUP_MANAGER = n
CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR = n
endif
CONFIGS += \
CONFIG_MALI_BASE_MODULES \
CONFIG_MALI_CSF_SUPPORT \
CONFIG_DMA_SHARED_BUFFER_TEST_EXPORTER \
CONFIG_MALI_MEMORY_GROUP_MANAGER \
CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR
endif
#
# MAKE_ARGS to pass the custom CONFIGs on out-of-tree build
#
# Generate the list of CONFIGs and values.
# $(value config) is the name of the CONFIG option.
# $(value $(value config)) is its value (y, m).
# When the CONFIG is not set to y or m, it defaults to n.
MAKE_ARGS := $(foreach config,$(CONFIGS), \
$(if $(filter y m,$(value $(value config))), \
$(value config)=$(value $(value config)), \
$(value config)=n))
#
# EXTRA_CFLAGS to define the custom CONFIGs on out-of-tree build
#
# Generate the list of CONFIGs defines with values from CONFIGS.
# $(value config) is the name of the CONFIG option.
# When set to y or m, the CONFIG gets defined to 1.
EXTRA_CFLAGS := $(foreach config,$(CONFIGS), \
$(if $(filter y m,$(value $(value config))), \
-D$(value config)=1))
CFLAGS_MODULE += -Wall -Werror
ifeq ($(CONFIG_GCOV_KERNEL), y)
CFLAGS_MODULE += $(call cc-option, -ftest-coverage)
CFLAGS_MODULE += $(call cc-option, -fprofile-arcs)
EXTRA_CFLAGS += -DGCOV_PROFILE=1
endif
ifeq ($(CONFIG_MALI_KCOV),y)
CFLAGS_MODULE += $(call cc-option, -fsanitize-coverage=trace-cmp)
EXTRA_CFLAGS += -DKCOV=1
EXTRA_CFLAGS += -DKCOV_ENABLE_COMPARISONS=1
endif
# The following were added to align with W=1 in scripts/Makefile.extrawarn
# from the Linux source tree (v5.18.14)
CFLAGS_MODULE += -Wextra -Wunused -Wno-unused-parameter
CFLAGS_MODULE += -Wmissing-declarations
CFLAGS_MODULE += -Wmissing-format-attribute
CFLAGS_MODULE += -Wmissing-prototypes
CFLAGS_MODULE += -Wold-style-definition
# The -Wmissing-include-dirs cannot be enabled as the path to some of the
# included directories change depending on whether it is an in-tree or
# out-of-tree build.
CFLAGS_MODULE += $(call cc-option, -Wunused-but-set-variable)
CFLAGS_MODULE += $(call cc-option, -Wunused-const-variable)
CFLAGS_MODULE += $(call cc-option, -Wpacked-not-aligned)
CFLAGS_MODULE += $(call cc-option, -Wstringop-truncation)
# The following turn off the warnings enabled by -Wextra
CFLAGS_MODULE += -Wno-sign-compare
CFLAGS_MODULE += -Wno-shift-negative-value
# This flag is needed to avoid build errors on older kernels
CFLAGS_MODULE += $(call cc-option, -Wno-cast-function-type)
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN1
# The following were added to align with W=2 in scripts/Makefile.extrawarn
# from the Linux source tree (v5.18.14)
CFLAGS_MODULE += -Wdisabled-optimization
# The -Wshadow flag cannot be enabled unless upstream kernels are
# patched to fix redefinitions of certain built-in functions and
# global variables.
CFLAGS_MODULE += $(call cc-option, -Wlogical-op)
CFLAGS_MODULE += -Wmissing-field-initializers
# -Wtype-limits must be disabled due to build failures on kernel 5.x
CFLAGS_MODULE += -Wno-type-limits
CFLAGS_MODULE += $(call cc-option, -Wmaybe-uninitialized)
CFLAGS_MODULE += $(call cc-option, -Wunused-macros)
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN2
# This warning is disabled to avoid build failures in some kernel versions
CFLAGS_MODULE += -Wno-ignored-qualifiers
all:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) EXTRA_CFLAGS="$(EXTRA_CFLAGS)" KBUILD_EXTRA_SYMBOLS="$(EXTRA_SYMBOLS)" modules

modules_install:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) modules_install

clean:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) clean

drivers/base/arm/Mconfig

@@ -0,0 +1,64 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2021-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
menuconfig MALI_BASE_MODULES
	bool "Mali Base extra modules"
	default y if BACKEND_KERNEL
	help
	  Enable this option to build support for the Arm Mali base modules.
	  Those modules provide extra features or debug interfaces and
	  are optional for the use of the Mali GPU modules.

config DMA_SHARED_BUFFER_TEST_EXPORTER
	bool "Build dma-buf framework test exporter module"
	depends on MALI_BASE_MODULES
	default y
	help
	  This option will build the dma-buf framework test exporter module.
	  Usable to help test importers.

	  Modules:
	    - dma-buf-test-exporter.ko

config MALI_MEMORY_GROUP_MANAGER
	bool "Build Mali Memory Group Manager module"
	depends on MALI_BASE_MODULES
	default y
	help
	  This option will build the memory group manager module.
	  This is an example implementation for allocation and release of pages
	  for memory pools managed by Mali GPU device drivers.

	  Modules:
	    - memory_group_manager.ko

config MALI_PROTECTED_MEMORY_ALLOCATOR
	bool "Build Mali Protected Memory Allocator module"
	depends on MALI_BASE_MODULES && GPU_HAS_CSF
	default y
	help
	  This option will build the protected memory allocator module.
	  This is an example implementation for allocation and release of pages
	  of secure memory intended to be used by the firmware
	  of Mali GPU device drivers.

	  Modules:
	    - protected_memory_allocator.ko


@@ -0,0 +1,23 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012, 2020-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
ifeq ($(CONFIG_DMA_SHARED_BUFFER_TEST_EXPORTER), y)
obj-m += dma-buf-test-exporter.o
endif


@@ -0,0 +1,36 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2017, 2020-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
bob_kernel_module {
name: "dma-buf-test-exporter",
defaults: [
"kernel_defaults",
],
srcs: [
"Kbuild",
"dma-buf-test-exporter.c",
],
enabled: false,
dma_shared_buffer_test_exporter: {
kbuild_options: ["CONFIG_DMA_SHARED_BUFFER_TEST_EXPORTER=y"],
enabled: true,
},
}
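
Note on the exporter source that follows: `DMA_BUF_TE_ALLOC` and `DMA_BUF_TE_ALLOC_CONT` take their size in pages, reject a size of zero, and cap a single request at `DMA_BUF_TE_ALLOC_MAX_SIZE`, which is 8 GB expressed in pages. The sketch below restates that check as standalone C so the limit is easy to work out for a given page size. It is only an illustration: the `sketch_*` names are hypothetical, the page shift is passed in rather than taken from the kernel's `PAGE_SHIFT`, and the extra `SG_MAX_SINGLE_ALLOC` clamp applied under `NO_SG_CHAIN` is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the driver.
 * Mirrors DMA_BUF_TE_ALLOC_MAX_SIZE: 8 GB expressed in pages,
 * for a caller-supplied page shift (assumed, e.g. 12 for 4 KiB pages).
 */
static uint64_t sketch_max_pages(unsigned int page_shift)
{
	return (8ull << 30) >> page_shift;
}

/*
 * Hypothetical helper: true when a request of size_in_pages would pass
 * the two size checks made before any memory is allocated
 * (non-zero, and not above the per-call cap).
 */
static bool sketch_alloc_size_ok(uint64_t size_in_pages, unsigned int page_shift)
{
	return size_in_pages != 0 && size_in_pages <= sketch_max_pages(page_shift);
}
```

With 4 KiB pages (a shift of 12) the cap works out to 2097152 pages; with 64 KiB pages (a shift of 16) it is 131072 pages.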


@@ -0,0 +1,822 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <uapi/base/arm/dma_buf_test_exporter/dma-buf-test-exporter.h>
#include <linux/dma-buf.h>
#include <linux/miscdevice.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
#include <linux/version.h>
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/atomic.h>
#include <linux/mm.h>
#include <linux/highmem.h>
#include <linux/dma-mapping.h>
#if KERNEL_VERSION(5, 5, 0) <= LINUX_VERSION_CODE
#include <linux/dma-resv.h>
#endif
#include <linux/version_compat_defs.h>
#define DMA_BUF_TE_VER_MAJOR 1
#define DMA_BUF_TE_VER_MINOR 0
/* Maximum size allowed in a single DMA_BUF_TE_ALLOC call */
#define DMA_BUF_TE_ALLOC_MAX_SIZE ((8ull << 30) >> PAGE_SHIFT) /* 8 GB */
/* Since kernel version 5.0 CONFIG_ARCH_NO_SG_CHAIN replaced CONFIG_ARCH_HAS_SG_CHAIN */
#if KERNEL_VERSION(5, 0, 0) > LINUX_VERSION_CODE
#if (!defined(ARCH_HAS_SG_CHAIN) && !defined(CONFIG_ARCH_HAS_SG_CHAIN))
#define NO_SG_CHAIN
#endif
#elif defined(CONFIG_ARCH_NO_SG_CHAIN)
#define NO_SG_CHAIN
#endif
#ifndef CSTD_UNUSED
#define CSTD_UNUSED(x) ((void)(x))
#endif
struct dma_buf_te_alloc {
/* the real alloc */
size_t nr_pages;
struct page **pages;
/* the debug usage tracking */
int nr_attached_devices;
int nr_device_mappings;
int nr_cpu_mappings;
/* failure simulation */
int fail_attach;
int fail_map;
int fail_mmap;
bool contiguous;
dma_addr_t contig_dma_addr;
void *contig_cpu_addr;
/* @lock: Used internally to serialize list manipulation, attach/detach etc. */
struct mutex lock;
};
struct dma_buf_te_attachment {
struct sg_table *sg;
bool attachment_mapped;
};
static struct miscdevice te_device;
#if (KERNEL_VERSION(4, 19, 0) > LINUX_VERSION_CODE)
static int dma_buf_te_attach(struct dma_buf *buf, struct device *dev,
struct dma_buf_attachment *attachment)
#else
static int dma_buf_te_attach(struct dma_buf *buf, struct dma_buf_attachment *attachment)
#endif
{
struct dma_buf_te_alloc *alloc;
alloc = buf->priv;
if (alloc->fail_attach)
return -EFAULT;
attachment->priv = kzalloc(sizeof(struct dma_buf_te_attachment), GFP_KERNEL);
if (!attachment->priv)
return -ENOMEM;
mutex_lock(&alloc->lock);
alloc->nr_attached_devices++;
mutex_unlock(&alloc->lock);
return 0;
}
/**
* dma_buf_te_detach - The detach callback function to release &attachment
*
* @buf: buffer for the &attachment
* @attachment: attachment data to be released
*/
static void dma_buf_te_detach(struct dma_buf *buf, struct dma_buf_attachment *attachment)
{
struct dma_buf_te_alloc *alloc = buf->priv;
struct dma_buf_te_attachment *pa = attachment->priv;
mutex_lock(&alloc->lock);
WARN(pa->attachment_mapped,
"WARNING: dma-buf-test-exporter detected detach with open device mappings");
alloc->nr_attached_devices--;
mutex_unlock(&alloc->lock);
kfree(pa);
}
static struct sg_table *dma_buf_te_map(struct dma_buf_attachment *attachment,
enum dma_data_direction direction)
{
struct sg_table *sg;
struct scatterlist *iter;
struct dma_buf_te_alloc *alloc;
struct dma_buf_te_attachment *pa = attachment->priv;
size_t i;
int ret;
alloc = attachment->dmabuf->priv;
if (alloc->fail_map)
return ERR_PTR(-ENOMEM);
if (WARN(pa->attachment_mapped, "WARNING: Attempted to map already mapped attachment."))
return ERR_PTR(-EBUSY);
#ifdef NO_SG_CHAIN
/* if the ARCH can't chain we can't have allocs larger than a single sg can hold */
if (alloc->nr_pages > SG_MAX_SINGLE_ALLOC)
return ERR_PTR(-EINVAL);
#endif /* NO_SG_CHAIN */
sg = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
if (!sg)
return ERR_PTR(-ENOMEM);
/* from here we access the allocation object, so lock the dmabuf pointing to it */
mutex_lock(&alloc->lock);
if (alloc->contiguous)
ret = sg_alloc_table(sg, 1, GFP_KERNEL);
else
ret = sg_alloc_table(sg, alloc->nr_pages, GFP_KERNEL);
if (ret) {
mutex_unlock(&alloc->lock);
kfree(sg);
return ERR_PTR(ret);
}
if (alloc->contiguous) {
sg_dma_len(sg->sgl) = alloc->nr_pages * PAGE_SIZE;
sg_set_page(sg->sgl, pfn_to_page(PFN_DOWN(alloc->contig_dma_addr)),
alloc->nr_pages * PAGE_SIZE, 0);
sg_dma_address(sg->sgl) = alloc->contig_dma_addr;
} else {
for_each_sg(sg->sgl, iter, alloc->nr_pages, i)
sg_set_page(iter, alloc->pages[i], PAGE_SIZE, 0);
}
if (!dma_map_sg(attachment->dev, sg->sgl, (int)sg->nents, direction)) {
mutex_unlock(&alloc->lock);
sg_free_table(sg);
kfree(sg);
return ERR_PTR(-ENOMEM);
}
alloc->nr_device_mappings++;
pa->attachment_mapped = true;
pa->sg = sg;
mutex_unlock(&alloc->lock);
return sg;
}
static void dma_buf_te_unmap(struct dma_buf_attachment *attachment, struct sg_table *sg,
enum dma_data_direction direction)
{
struct dma_buf_te_alloc *alloc;
struct dma_buf_te_attachment *pa = attachment->priv;
alloc = attachment->dmabuf->priv;
mutex_lock(&alloc->lock);
WARN(!pa->attachment_mapped, "WARNING: Unmatched unmap of attachment.");
alloc->nr_device_mappings--;
pa->attachment_mapped = false;
pa->sg = NULL;
mutex_unlock(&alloc->lock);
dma_unmap_sg(attachment->dev, sg->sgl, (int)sg->nents, direction);
sg_free_table(sg);
kfree(sg);
}
static void dma_buf_te_release(struct dma_buf *buf)
{
size_t i;
struct dma_buf_te_alloc *alloc;
alloc = buf->priv;
/* no need for locking */
mutex_destroy(&alloc->lock);
if (alloc->contiguous) {
dma_free_attrs(te_device.this_device, alloc->nr_pages * PAGE_SIZE,
alloc->contig_cpu_addr, alloc->contig_dma_addr,
DMA_ATTR_WRITE_COMBINE);
} else {
for (i = 0; i < alloc->nr_pages; i++)
__free_page(alloc->pages[i]);
}
#if (KERNEL_VERSION(4, 12, 0) <= LINUX_VERSION_CODE)
kvfree(alloc->pages);
#else
kfree(alloc->pages);
#endif
kfree(alloc);
}
static int dma_buf_te_sync(struct dma_buf *dmabuf, enum dma_data_direction direction,
bool start_cpu_access)
{
struct dma_buf_attachment *attachment;
struct dma_buf_te_alloc *alloc = dmabuf->priv;
/* Use the kernel lock to prevent the concurrent update of dmabuf->attachments */
#if KERNEL_VERSION(5, 5, 0) <= LINUX_VERSION_CODE
dma_resv_lock(dmabuf->resv, NULL);
#else
mutex_lock(&dmabuf->lock);
#endif
/* Use the internal lock to block the concurrent attach/detach calls */
mutex_lock(&alloc->lock);
list_for_each_entry(attachment, &dmabuf->attachments, node) {
struct dma_buf_te_attachment *pa = attachment->priv;
struct sg_table *sg = pa->sg;
if (!sg) {
dev_dbg(te_device.this_device, "no mapping for device %s\n",
dev_name(attachment->dev));
continue;
}
if (start_cpu_access) {
dev_dbg(te_device.this_device, "sync cpu with device %s\n",
dev_name(attachment->dev));
dma_sync_sg_for_cpu(attachment->dev, sg->sgl, (int)sg->nents, direction);
} else {
dev_dbg(te_device.this_device, "sync device %s with cpu\n",
dev_name(attachment->dev));
dma_sync_sg_for_device(attachment->dev, sg->sgl, (int)sg->nents, direction);
}
}
mutex_unlock(&alloc->lock);
#if KERNEL_VERSION(5, 5, 0) <= LINUX_VERSION_CODE
dma_resv_unlock(dmabuf->resv);
#else
mutex_unlock(&dmabuf->lock);
#endif
return 0;
}
static int dma_buf_te_begin_cpu_access(struct dma_buf *dmabuf, enum dma_data_direction direction)
{
return dma_buf_te_sync(dmabuf, direction, true);
}
static int dma_buf_te_end_cpu_access(struct dma_buf *dmabuf, enum dma_data_direction direction)
{
return dma_buf_te_sync(dmabuf, direction, false);
}
static void dma_buf_te_mmap_open(struct vm_area_struct *vma)
{
struct dma_buf *dma_buf;
struct dma_buf_te_alloc *alloc;
dma_buf = vma->vm_private_data;
alloc = dma_buf->priv;
mutex_lock(&alloc->lock);
alloc->nr_cpu_mappings++;
mutex_unlock(&alloc->lock);
}
static void dma_buf_te_mmap_close(struct vm_area_struct *vma)
{
struct dma_buf *dma_buf;
struct dma_buf_te_alloc *alloc;
dma_buf = vma->vm_private_data;
alloc = dma_buf->priv;
mutex_lock(&alloc->lock);
BUG_ON(alloc->nr_cpu_mappings <= 0);
alloc->nr_cpu_mappings--;
mutex_unlock(&alloc->lock);
}
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
static int dma_buf_te_mmap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
#elif KERNEL_VERSION(5, 1, 0) > LINUX_VERSION_CODE
static int dma_buf_te_mmap_fault(struct vm_fault *vmf)
#else
static vm_fault_t dma_buf_te_mmap_fault(struct vm_fault *vmf)
#endif
{
struct dma_buf_te_alloc *alloc;
struct dma_buf *dmabuf;
struct page *pageptr;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
dmabuf = vma->vm_private_data;
#else
dmabuf = vmf->vma->vm_private_data;
#endif
alloc = dmabuf->priv;
if (vmf->pgoff >= alloc->nr_pages)
return VM_FAULT_SIGBUS;
pageptr = alloc->pages[vmf->pgoff];
BUG_ON(!pageptr);
get_page(pageptr);
vmf->page = pageptr;
return 0;
}
static const struct vm_operations_struct dma_buf_te_vm_ops = { .open = dma_buf_te_mmap_open,
.close = dma_buf_te_mmap_close,
.fault = dma_buf_te_mmap_fault };
static int dma_buf_te_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
{
struct dma_buf_te_alloc *alloc;
alloc = dmabuf->priv;
if (alloc->fail_mmap)
return -ENOMEM;
vm_flags_set(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP);
vma->vm_ops = &dma_buf_te_vm_ops;
vma->vm_private_data = dmabuf;
/* we fault in the pages on access */
/* call open to do the ref-counting */
dma_buf_te_vm_ops.open(vma);
return 0;
}
#if KERNEL_VERSION(4, 19, 0) > LINUX_VERSION_CODE
static void *dma_buf_te_kmap_atomic(struct dma_buf *buf, unsigned long page_num)
{
/* IGNORE */
return NULL;
}
#endif
static void *dma_buf_te_kmap(struct dma_buf *buf, unsigned long page_num)
{
struct dma_buf_te_alloc *alloc;
alloc = buf->priv;
if (page_num >= alloc->nr_pages)
return NULL;
return kbase_kmap(alloc->pages[page_num]);
}
static void dma_buf_te_kunmap(struct dma_buf *buf, unsigned long page_num, void *addr)
{
struct dma_buf_te_alloc *alloc;
alloc = buf->priv;
if (page_num >= alloc->nr_pages)
return;
kbase_kunmap(alloc->pages[page_num], addr);
}
static struct dma_buf_ops dma_buf_te_ops = {
/* real handlers */
.attach = dma_buf_te_attach,
.detach = dma_buf_te_detach,
.map_dma_buf = dma_buf_te_map,
.unmap_dma_buf = dma_buf_te_unmap,
.release = dma_buf_te_release,
.mmap = dma_buf_te_mmap,
.begin_cpu_access = dma_buf_te_begin_cpu_access,
.end_cpu_access = dma_buf_te_end_cpu_access,
#if KERNEL_VERSION(4, 12, 0) > LINUX_VERSION_CODE
.kmap = dma_buf_te_kmap,
.kunmap = dma_buf_te_kunmap,
/* nop handlers for mandatory functions we ignore */
.kmap_atomic = dma_buf_te_kmap_atomic
#else
#if KERNEL_VERSION(5, 6, 0) > LINUX_VERSION_CODE
.map = dma_buf_te_kmap,
.unmap = dma_buf_te_kunmap,
#endif
#if KERNEL_VERSION(4, 19, 0) > LINUX_VERSION_CODE
/* nop handlers for mandatory functions we ignore */
.map_atomic = dma_buf_te_kmap_atomic
#endif
#endif
};
static int do_dma_buf_te_ioctl_version(struct dma_buf_te_ioctl_version __user *buf)
{
struct dma_buf_te_ioctl_version v;
if (copy_from_user(&v, buf, sizeof(v)))
return -EFAULT;
if (v.op != DMA_BUF_TE_ENQ)
return -EFAULT;
v.op = DMA_BUF_TE_ACK;
v.major = DMA_BUF_TE_VER_MAJOR;
v.minor = DMA_BUF_TE_VER_MINOR;
if (copy_to_user(buf, &v, sizeof(v)))
return -EFAULT;
else
return 0;
}
static int do_dma_buf_te_ioctl_alloc(struct dma_buf_te_ioctl_alloc __user *buf, bool contiguous)
{
struct dma_buf_te_ioctl_alloc alloc_req;
struct dma_buf_te_alloc *alloc;
struct dma_buf *dma_buf;
size_t i = 0;
size_t max_nr_pages = DMA_BUF_TE_ALLOC_MAX_SIZE;
int fd;
if (copy_from_user(&alloc_req, buf, sizeof(alloc_req))) {
dev_err(te_device.this_device, "%s: couldn't get user data", __func__);
goto no_input;
}
if (!alloc_req.size) {
dev_err(te_device.this_device, "%s: no size specified", __func__);
goto invalid_size;
}
#ifdef NO_SG_CHAIN
/* Whilst it is possible to allocate a larger buffer, we won't be able to
* map it during actual usage (mmap() still succeeds). We fail here so
* userspace code can deal with it early rather than hitting a driver
* failure later on.
*/
if (max_nr_pages > SG_MAX_SINGLE_ALLOC)
max_nr_pages = SG_MAX_SINGLE_ALLOC;
#endif /* NO_SG_CHAIN */
if (alloc_req.size > max_nr_pages) {
dev_err(te_device.this_device,
"%s: buffer size of %llu pages exceeded the mapping limit of %zu pages",
__func__, alloc_req.size, max_nr_pages);
goto invalid_size;
}
alloc = kzalloc(sizeof(struct dma_buf_te_alloc), GFP_KERNEL);
if (alloc == NULL) {
dev_err(te_device.this_device, "%s: couldn't alloc object", __func__);
goto no_alloc_object;
}
alloc->nr_pages = alloc_req.size;
alloc->contiguous = contiguous;
#if (KERNEL_VERSION(4, 12, 0) <= LINUX_VERSION_CODE)
alloc->pages = kvzalloc(sizeof(struct page *) * alloc->nr_pages, GFP_KERNEL);
#else
alloc->pages = kzalloc(sizeof(struct page *) * alloc->nr_pages, GFP_KERNEL);
#endif
if (!alloc->pages) {
dev_err(te_device.this_device, "%s: couldn't alloc %zu page structures", __func__,
alloc->nr_pages);
goto free_alloc_object;
}
if (contiguous) {
dma_addr_t dma_aux;
alloc->contig_cpu_addr = dma_alloc_attrs(
te_device.this_device, alloc->nr_pages * PAGE_SIZE, &alloc->contig_dma_addr,
GFP_KERNEL | __GFP_ZERO, DMA_ATTR_WRITE_COMBINE);
if (!alloc->contig_cpu_addr) {
dev_err(te_device.this_device,
"%s: couldn't alloc contiguous buffer %zu pages", __func__,
alloc->nr_pages);
goto free_page_struct;
}
dma_aux = alloc->contig_dma_addr;
for (i = 0; i < alloc->nr_pages; i++) {
alloc->pages[i] = pfn_to_page(PFN_DOWN(dma_aux));
dma_aux += PAGE_SIZE;
}
} else {
for (i = 0; i < alloc->nr_pages; i++) {
alloc->pages[i] = alloc_page(GFP_KERNEL | __GFP_ZERO);
if (alloc->pages[i] == NULL) {
dev_err(te_device.this_device, "%s: couldn't alloc page", __func__);
goto no_page;
}
}
}
mutex_init(&alloc->lock);
/* alloc ready, let's export it */
{
struct dma_buf_export_info export_info = {
.exp_name = "dma_buf_te",
.owner = THIS_MODULE,
.ops = &dma_buf_te_ops,
.size = alloc->nr_pages << PAGE_SHIFT,
.flags = O_CLOEXEC | O_RDWR,
.priv = alloc,
};
dma_buf = dma_buf_export(&export_info);
}
if (IS_ERR_OR_NULL(dma_buf)) {
dev_err(te_device.this_device, "%s: couldn't export dma_buf", __func__);
goto no_export;
}
/* get fd for buf */
fd = dma_buf_fd(dma_buf, O_CLOEXEC);
if (fd < 0) {
dev_err(te_device.this_device, "%s: couldn't get fd from dma_buf", __func__);
goto no_fd;
}
return fd;
no_fd:
dma_buf_put(dma_buf);
no_export:
/* i is still valid */
mutex_destroy(&alloc->lock);
no_page:
if (contiguous) {
dma_free_attrs(te_device.this_device, alloc->nr_pages * PAGE_SIZE,
alloc->contig_cpu_addr, alloc->contig_dma_addr,
DMA_ATTR_WRITE_COMBINE);
} else {
while (i-- > 0)
__free_page(alloc->pages[i]);
}
free_page_struct:
#if (KERNEL_VERSION(4, 12, 0) <= LINUX_VERSION_CODE)
kvfree(alloc->pages);
#else
kfree(alloc->pages);
#endif
free_alloc_object:
kfree(alloc);
no_alloc_object:
invalid_size:
no_input:
return -EFAULT;
}
static int do_dma_buf_te_ioctl_status(struct dma_buf_te_ioctl_status __user *arg)
{
struct dma_buf_te_ioctl_status status;
struct dma_buf *dmabuf;
struct dma_buf_te_alloc *alloc;
int res = -EINVAL;
if (copy_from_user(&status, arg, sizeof(status)))
return -EFAULT;
dmabuf = dma_buf_get(status.fd);
if (IS_ERR_OR_NULL(dmabuf))
return -EINVAL;
/* verify it's one of ours */
if (dmabuf->ops != &dma_buf_te_ops)
goto err_have_dmabuf;
/* ours, get the current status */
alloc = dmabuf->priv;
/* lock while reading status to take a snapshot */
mutex_lock(&alloc->lock);
status.attached_devices = alloc->nr_attached_devices;
status.device_mappings = alloc->nr_device_mappings;
status.cpu_mappings = alloc->nr_cpu_mappings;
mutex_unlock(&alloc->lock);
if (copy_to_user(arg, &status, sizeof(status)))
goto err_have_dmabuf;
/* All OK */
res = 0;
err_have_dmabuf:
dma_buf_put(dmabuf);
return res;
}
static int do_dma_buf_te_ioctl_set_failing(struct dma_buf_te_ioctl_set_failing __user *arg)
{
struct dma_buf *dmabuf;
struct dma_buf_te_ioctl_set_failing f;
struct dma_buf_te_alloc *alloc;
int res = -EINVAL;
if (copy_from_user(&f, arg, sizeof(f)))
return -EFAULT;
dmabuf = dma_buf_get(f.fd);
if (IS_ERR_OR_NULL(dmabuf))
return -EINVAL;
/* verify it's one of ours */
if (dmabuf->ops != &dma_buf_te_ops)
goto err_have_dmabuf;
/* ours, set the fail modes */
alloc = dmabuf->priv;
/* lock to set the fail modes atomically */
mutex_lock(&alloc->lock);
alloc->fail_attach = f.fail_attach;
alloc->fail_map = f.fail_map;
alloc->fail_mmap = f.fail_mmap;
mutex_unlock(&alloc->lock);
/* success */
res = 0;
err_have_dmabuf:
dma_buf_put(dmabuf);
return res;
}
static int dma_te_buf_fill(struct dma_buf *dma_buf, int value)
{
struct dma_buf_attachment *attachment;
struct sg_table *sgt;
struct scatterlist *sg;
unsigned int count;
int ret = 0;
size_t i;
attachment = dma_buf_attach(dma_buf, te_device.this_device);
if (IS_ERR_OR_NULL(attachment))
return -EBUSY;
#if (KERNEL_VERSION(6, 1, 55) <= LINUX_VERSION_CODE)
sgt = dma_buf_map_attachment_unlocked(attachment, DMA_BIDIRECTIONAL);
#else
sgt = dma_buf_map_attachment(attachment, DMA_BIDIRECTIONAL);
#endif
if (IS_ERR_OR_NULL(sgt)) {
/* PTR_ERR(NULL) is 0, so report a NULL table as an error explicitly */
ret = sgt ? PTR_ERR(sgt) : -ENOMEM;
goto no_import;
}
ret = dma_buf_begin_cpu_access(dma_buf, DMA_BIDIRECTIONAL);
if (ret)
goto no_cpu_access;
for_each_sg(sgt->sgl, sg, sgt->nents, count) {
for (i = 0; i < sg_dma_len(sg); i = i + PAGE_SIZE) {
void *addr = NULL;
#if KERNEL_VERSION(5, 6, 0) <= LINUX_VERSION_CODE
addr = dma_buf_te_kmap(dma_buf, i >> PAGE_SHIFT);
#else
addr = dma_buf_kmap(dma_buf, i >> PAGE_SHIFT);
#endif
if (!addr) {
ret = -EPERM;
goto no_kmap;
}
memset(addr, value, PAGE_SIZE);
#if KERNEL_VERSION(5, 6, 0) <= LINUX_VERSION_CODE
dma_buf_te_kunmap(dma_buf, i >> PAGE_SHIFT, addr);
#else
dma_buf_kunmap(dma_buf, i >> PAGE_SHIFT, addr);
#endif
}
}
no_kmap:
dma_buf_end_cpu_access(dma_buf, DMA_BIDIRECTIONAL);
no_cpu_access:
#if (KERNEL_VERSION(6, 1, 55) <= LINUX_VERSION_CODE)
dma_buf_unmap_attachment_unlocked(attachment, sgt, DMA_BIDIRECTIONAL);
#else
dma_buf_unmap_attachment(attachment, sgt, DMA_BIDIRECTIONAL);
#endif
no_import:
dma_buf_detach(dma_buf, attachment);
return ret;
}
static int do_dma_buf_te_ioctl_fill(struct dma_buf_te_ioctl_fill __user *arg)
{
struct dma_buf *dmabuf;
struct dma_buf_te_ioctl_fill f;
int ret;
if (copy_from_user(&f, arg, sizeof(f)))
return -EFAULT;
dmabuf = dma_buf_get(f.fd);
if (IS_ERR_OR_NULL(dmabuf))
return -EINVAL;
ret = dma_te_buf_fill(dmabuf, f.value);
dma_buf_put(dmabuf);
return ret;
}
static long dma_buf_te_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
CSTD_UNUSED(file);
switch (cmd) {
case DMA_BUF_TE_VERSION:
return do_dma_buf_te_ioctl_version((struct dma_buf_te_ioctl_version __user *)arg);
case DMA_BUF_TE_ALLOC:
return do_dma_buf_te_ioctl_alloc((struct dma_buf_te_ioctl_alloc __user *)arg,
false);
case DMA_BUF_TE_ALLOC_CONT:
return do_dma_buf_te_ioctl_alloc((struct dma_buf_te_ioctl_alloc __user *)arg, true);
case DMA_BUF_TE_QUERY:
return do_dma_buf_te_ioctl_status((struct dma_buf_te_ioctl_status __user *)arg);
case DMA_BUF_TE_SET_FAILING:
return do_dma_buf_te_ioctl_set_failing(
(struct dma_buf_te_ioctl_set_failing __user *)arg);
case DMA_BUF_TE_FILL:
return do_dma_buf_te_ioctl_fill((struct dma_buf_te_ioctl_fill __user *)arg);
default:
return -ENOTTY;
}
}
static const struct file_operations dma_buf_te_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = dma_buf_te_ioctl,
.compat_ioctl = dma_buf_te_ioctl,
};
static int __init dma_buf_te_init(void)
{
int res;
te_device.minor = MISC_DYNAMIC_MINOR;
te_device.name = "dma_buf_te";
te_device.fops = &dma_buf_te_fops;
res = misc_register(&te_device);
if (res) {
pr_warn("Misc device registration of 'dma_buf_te' failed\n");
return res;
}
te_device.this_device->coherent_dma_mask = DMA_BIT_MASK(32);
dev_info(te_device.this_device, "dma_buf_te ready\n");
return 0;
}
static void __exit dma_buf_te_exit(void)
{
misc_deregister(&te_device);
}
module_init(dma_buf_te_init);
module_exit(dma_buf_te_exit);
MODULE_LICENSE("GPL");
MODULE_INFO(import_ns, "DMA_BUF");
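
The error path of `do_dma_buf_te_ioctl_alloc()` above relies on the loop index still holding the number of pages allocated so far, so that `while (i-- > 0)` frees exactly those pages and nothing else. A minimal standalone sketch of the same unwind pattern follows, using `malloc()` in place of `alloc_page()`. The name `sketch_alloc_all` and its `fail_at` parameter are hypothetical and exist only to simulate a failure part-way through; this is not code from the driver.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: fill slots[0..count) with allocations.
 * fail_at simulates an allocation failure at that index
 * (pass a value >= count for no simulated failure).
 * Returns 0 on success, -1 after undoing any partial work.
 */
static int sketch_alloc_all(void **slots, size_t count, size_t fail_at)
{
	size_t i;

	for (i = 0; i < count; i++) {
		slots[i] = (i == fail_at) ? NULL : malloc(16);
		if (!slots[i])
			goto unwind;
	}
	return 0;

unwind:
	/* i is the index that failed, so only slots [0, i) are live */
	while (i-- > 0) {
		free(slots[i]);
		slots[i] = NULL;
	}
	return -1;
}
```

The post-decrement in the loop condition is what makes this safe with an unsigned index: the test runs on the old value, so the loop stops at zero without wrapping before the comparison.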


@@ -0,0 +1,23 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
ifeq ($(CONFIG_MALI_MEMORY_GROUP_MANAGER), y)
obj-m := memory_group_manager.o
endif


@@ -0,0 +1,36 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
bob_kernel_module {
name: "memory_group_manager",
defaults: [
"kernel_defaults",
],
srcs: [
"Kbuild",
"memory_group_manager.c",
],
enabled: false,
mali_memory_group_manager: {
kbuild_options: ["CONFIG_MALI_MEMORY_GROUP_MANAGER=y"],
enabled: true,
},
}


@@ -0,0 +1,477 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <linux/fs.h>
#include <linux/of.h>
#include <linux/slab.h>
#include <linux/platform_device.h>
#include <linux/version.h>
#include <linux/module.h>
#if IS_ENABLED(CONFIG_DEBUG_FS)
#include <linux/debugfs.h>
#include <linux/version_compat_defs.h>
#endif
#include <linux/mm.h>
#include <linux/memory_group_manager.h>
#ifndef CSTD_UNUSED
#define CSTD_UNUSED(x) ((void)(x))
#endif
#if (KERNEL_VERSION(4, 20, 0) > LINUX_VERSION_CODE)
static inline vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot)
{
int err = vm_insert_pfn_prot(vma, addr, pfn, pgprot);
if (unlikely(err == -ENOMEM))
return VM_FAULT_OOM;
if (unlikely(err < 0 && err != -EBUSY))
return VM_FAULT_SIGBUS;
return VM_FAULT_NOPAGE;
}
#endif
#define PTE_PBHA_SHIFT (59)
#define PTE_PBHA_MASK ((uint64_t)0xf << PTE_PBHA_SHIFT)
#define PTE_RES_BIT_MULTI_AS_SHIFT (63)
#define IMPORTED_MEMORY_ID (MEMORY_GROUP_MANAGER_NR_GROUPS - 1)
/**
* struct mgm_group - Structure to keep track of the number of allocated
* pages per group
*
* @size: The number of allocated small pages of PAGE_SIZE bytes
* @lp_size: The number of allocated large (2MB) pages
* @insert_pfn: The number of calls to map pages for CPU access.
* @update_gpu_pte: The number of calls to update GPU page table entries.
*
* This structure allows page allocation information to be displayed via
* debugfs. Display is organized per group with small and large sized pages.
*/
struct mgm_group {
atomic_t size;
atomic_t lp_size;
atomic_t insert_pfn;
atomic_t update_gpu_pte;
};
/**
* struct mgm_groups - Structure for groups of memory group manager
*
* @groups: To keep track of the number of allocated pages of all groups
* @dev: device attached
* @mgm_debugfs_root: debugfs root directory of memory group manager
*
* This structure holds the per-group counters, together with the device and
* the debugfs root directory used to report them.
*/
struct mgm_groups {
struct mgm_group groups[MEMORY_GROUP_MANAGER_NR_GROUPS];
struct device *dev;
#if IS_ENABLED(CONFIG_DEBUG_FS)
struct dentry *mgm_debugfs_root;
#endif
};
#if IS_ENABLED(CONFIG_DEBUG_FS)
static int mgm_size_get(void *data, u64 *val)
{
struct mgm_group *group = data;
*val = (u64)atomic_read(&group->size);
return 0;
}
static int mgm_lp_size_get(void *data, u64 *val)
{
struct mgm_group *group = data;
*val = (u64)atomic_read(&group->lp_size);
return 0;
}
static int mgm_insert_pfn_get(void *data, u64 *val)
{
struct mgm_group *group = data;
*val = (u64)atomic_read(&group->insert_pfn);
return 0;
}
static int mgm_update_gpu_pte_get(void *data, u64 *val)
{
struct mgm_group *group = data;
*val = (u64)atomic_read(&group->update_gpu_pte);
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_mgm_size, mgm_size_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_mgm_lp_size, mgm_lp_size_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_mgm_insert_pfn, mgm_insert_pfn_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_mgm_update_gpu_pte, mgm_update_gpu_pte_get, NULL, "%llu\n");
static void mgm_term_debugfs(struct mgm_groups *data)
{
debugfs_remove_recursive(data->mgm_debugfs_root);
}
#define MGM_DEBUGFS_GROUP_NAME_MAX 10
static int mgm_initialize_debugfs(struct mgm_groups *mgm_data)
{
int i;
struct dentry *e, *g;
char debugfs_group_name[MGM_DEBUGFS_GROUP_NAME_MAX];
/*
* Create root directory of memory-group-manager
*/
mgm_data->mgm_debugfs_root = debugfs_create_dir("physical-memory-group-manager", NULL);
if (IS_ERR_OR_NULL(mgm_data->mgm_debugfs_root)) {
dev_err(mgm_data->dev, "fail to create debugfs root directory\n");
return -ENODEV;
}
/*
* Create debugfs files per group
*/
for (i = 0; i < MEMORY_GROUP_MANAGER_NR_GROUPS; i++) {
scnprintf(debugfs_group_name, MGM_DEBUGFS_GROUP_NAME_MAX, "group_%d", i);
g = debugfs_create_dir(debugfs_group_name, mgm_data->mgm_debugfs_root);
if (IS_ERR_OR_NULL(g)) {
dev_err(mgm_data->dev, "fail to create group[%d]\n", i);
goto remove_debugfs;
}
e = debugfs_create_file("size", 0444, g, &mgm_data->groups[i], &fops_mgm_size);
if (IS_ERR_OR_NULL(e)) {
dev_err(mgm_data->dev, "fail to create size[%d]\n", i);
goto remove_debugfs;
}
e = debugfs_create_file("lp_size", 0444, g, &mgm_data->groups[i],
&fops_mgm_lp_size);
if (IS_ERR_OR_NULL(e)) {
dev_err(mgm_data->dev, "fail to create lp_size[%d]\n", i);
goto remove_debugfs;
}
e = debugfs_create_file("insert_pfn", 0444, g, &mgm_data->groups[i],
&fops_mgm_insert_pfn);
if (IS_ERR_OR_NULL(e)) {
dev_err(mgm_data->dev, "fail to create insert_pfn[%d]\n", i);
goto remove_debugfs;
}
e = debugfs_create_file("update_gpu_pte", 0444, g, &mgm_data->groups[i],
&fops_mgm_update_gpu_pte);
if (IS_ERR_OR_NULL(e)) {
dev_err(mgm_data->dev, "fail to create update_gpu_pte[%d]\n", i);
goto remove_debugfs;
}
}
return 0;
remove_debugfs:
mgm_term_debugfs(mgm_data);
return -ENODEV;
}
#else
static void mgm_term_debugfs(struct mgm_groups *data)
{
}
static int mgm_initialize_debugfs(struct mgm_groups *mgm_data)
{
return 0;
}
#endif /* CONFIG_DEBUG_FS */
#define ORDER_SMALL_PAGE 0
#define ORDER_LARGE_PAGE (__builtin_ffs(SZ_2M / PAGE_SIZE) - 1)
static void update_size(struct memory_group_manager_device *mgm_dev, unsigned int group_id,
unsigned int order, bool alloc)
{
struct mgm_groups *data = mgm_dev->data;
switch (order) {
case ORDER_SMALL_PAGE:
if (alloc)
atomic_inc(&data->groups[group_id].size);
else {
WARN_ON(atomic_read(&data->groups[group_id].size) == 0);
atomic_dec(&data->groups[group_id].size);
}
break;
case ORDER_LARGE_PAGE:
if (alloc)
atomic_inc(&data->groups[group_id].lp_size);
else {
WARN_ON(atomic_read(&data->groups[group_id].lp_size) == 0);
atomic_dec(&data->groups[group_id].lp_size);
}
break;
default:
dev_err(data->dev, "Unknown order(%u)\n", order);
break;
}
}
static struct page *example_mgm_alloc_page(struct memory_group_manager_device *mgm_dev,
unsigned int group_id, gfp_t gfp_mask,
unsigned int order)
{
struct mgm_groups *const data = mgm_dev->data;
struct page *p;
dev_dbg(data->dev, "%s(mgm_dev=%pK, group_id=%u, gfp_mask=0x%x, order=%u)\n", __func__,
(void *)mgm_dev, group_id, gfp_mask, order);
if (WARN_ON(group_id >= MEMORY_GROUP_MANAGER_NR_GROUPS))
return NULL;
p = alloc_pages(gfp_mask, order);
if (p) {
update_size(mgm_dev, group_id, order, true);
} else {
dev_err(data->dev, "alloc_pages failed\n");
}
return p;
}
static void example_mgm_free_page(struct memory_group_manager_device *mgm_dev,
unsigned int group_id, struct page *page, unsigned int order)
{
struct mgm_groups *const data = mgm_dev->data;
dev_dbg(data->dev, "%s(mgm_dev=%pK, group_id=%u, page=%pK, order=%u)\n", __func__,
(void *)mgm_dev, group_id, (void *)page, order);
if (WARN_ON(group_id >= MEMORY_GROUP_MANAGER_NR_GROUPS))
return;
__free_pages(page, order);
update_size(mgm_dev, group_id, order, false);
}
static int example_mgm_get_import_memory_id(struct memory_group_manager_device *mgm_dev,
struct memory_group_manager_import_data *import_data)
{
struct mgm_groups *const data = mgm_dev->data;
/* Only dereference import_data once it is known to be non-NULL */
if (!WARN_ON(!import_data)) {
dev_dbg(data->dev, "%s(mgm_dev=%pK, import_data=%pK (type=%d))\n", __func__,
(void *)mgm_dev, (void *)import_data, (int)import_data->type);
WARN_ON(!import_data->u.dma_buf);
WARN_ON(import_data->type != MEMORY_GROUP_MANAGER_IMPORT_TYPE_DMA_BUF);
}
return IMPORTED_MEMORY_ID;
}
static u64 example_mgm_update_gpu_pte(struct memory_group_manager_device *const mgm_dev,
unsigned int const group_id, int const mmu_level, u64 pte)
{
struct mgm_groups *const data = mgm_dev->data;
dev_dbg(data->dev, "%s(mgm_dev=%pK, group_id=%u, mmu_level=%d, pte=0x%llx)\n", __func__,
(void *)mgm_dev, group_id, mmu_level, pte);
if (WARN_ON(group_id >= MEMORY_GROUP_MANAGER_NR_GROUPS))
return pte;
pte |= ((u64)group_id << PTE_PBHA_SHIFT) & PTE_PBHA_MASK;
/* Address could be translated into a different bus address here */
pte |= ((u64)1 << PTE_RES_BIT_MULTI_AS_SHIFT);
atomic_inc(&data->groups[group_id].update_gpu_pte);
return pte;
}
static u64 example_mgm_pte_to_original_pte(struct memory_group_manager_device *const mgm_dev,
unsigned int const group_id, int const mmu_level,
u64 pte)
{
CSTD_UNUSED(mgm_dev);
CSTD_UNUSED(group_id);
CSTD_UNUSED(mmu_level);
/* Undo the group ID modification */
pte &= ~PTE_PBHA_MASK;
/* Undo the bit set */
pte &= ~((u64)1 << PTE_RES_BIT_MULTI_AS_SHIFT);
return pte;
}
static vm_fault_t example_mgm_vmf_insert_pfn_prot(struct memory_group_manager_device *const mgm_dev,
unsigned int const group_id,
struct vm_area_struct *const vma,
unsigned long const addr, unsigned long const pfn,
pgprot_t const prot)
{
struct mgm_groups *const data = mgm_dev->data;
vm_fault_t fault;
dev_dbg(data->dev,
"%s(mgm_dev=%pK, group_id=%u, vma=%pK, addr=0x%lx, pfn=0x%lx, prot=0x%llx)\n",
__func__, (void *)mgm_dev, group_id, (void *)vma, addr, pfn,
(unsigned long long)pgprot_val(prot));
if (WARN_ON(group_id >= MEMORY_GROUP_MANAGER_NR_GROUPS))
return VM_FAULT_SIGBUS;
fault = vmf_insert_pfn_prot(vma, addr, pfn, prot);
if (fault == VM_FAULT_NOPAGE)
atomic_inc(&data->groups[group_id].insert_pfn);
else
dev_err(data->dev, "vmf_insert_pfn_prot failed\n");
return fault;
}
static int mgm_initialize_data(struct mgm_groups *mgm_data)
{
int i;
for (i = 0; i < MEMORY_GROUP_MANAGER_NR_GROUPS; i++) {
atomic_set(&mgm_data->groups[i].size, 0);
atomic_set(&mgm_data->groups[i].lp_size, 0);
atomic_set(&mgm_data->groups[i].insert_pfn, 0);
atomic_set(&mgm_data->groups[i].update_gpu_pte, 0);
}
return mgm_initialize_debugfs(mgm_data);
}
static void mgm_term_data(struct mgm_groups *data)
{
int i;
for (i = 0; i < MEMORY_GROUP_MANAGER_NR_GROUPS; i++) {
if (atomic_read(&data->groups[i].size) != 0)
dev_warn(data->dev, "%d 0-order pages in group(%d) leaked\n",
atomic_read(&data->groups[i].size), i);
if (atomic_read(&data->groups[i].lp_size) != 0)
dev_warn(data->dev, "%d order-%d pages in group(%d) leaked\n",
atomic_read(&data->groups[i].lp_size), ORDER_LARGE_PAGE, i);
}
mgm_term_debugfs(data);
}
static int memory_group_manager_probe(struct platform_device *pdev)
{
struct memory_group_manager_device *mgm_dev;
struct mgm_groups *mgm_data;
mgm_dev = kzalloc(sizeof(*mgm_dev), GFP_KERNEL);
if (!mgm_dev)
return -ENOMEM;
mgm_dev->owner = THIS_MODULE;
mgm_dev->ops.mgm_alloc_page = example_mgm_alloc_page;
mgm_dev->ops.mgm_free_page = example_mgm_free_page;
mgm_dev->ops.mgm_get_import_memory_id = example_mgm_get_import_memory_id;
mgm_dev->ops.mgm_vmf_insert_pfn_prot = example_mgm_vmf_insert_pfn_prot;
mgm_dev->ops.mgm_update_gpu_pte = example_mgm_update_gpu_pte;
mgm_dev->ops.mgm_pte_to_original_pte = example_mgm_pte_to_original_pte;
mgm_data = kzalloc(sizeof(*mgm_data), GFP_KERNEL);
if (!mgm_data) {
kfree(mgm_dev);
return -ENOMEM;
}
mgm_dev->data = mgm_data;
mgm_data->dev = &pdev->dev;
if (mgm_initialize_data(mgm_data)) {
kfree(mgm_data);
kfree(mgm_dev);
return -ENOENT;
}
platform_set_drvdata(pdev, mgm_dev);
dev_info(&pdev->dev, "Memory group manager probed successfully\n");
return 0;
}
static int memory_group_manager_remove(struct platform_device *pdev)
{
struct memory_group_manager_device *mgm_dev = platform_get_drvdata(pdev);
struct mgm_groups *mgm_data = mgm_dev->data;
mgm_term_data(mgm_data);
kfree(mgm_data);
kfree(mgm_dev);
dev_info(&pdev->dev, "Memory group manager removed successfully\n");
return 0;
}
static const struct of_device_id memory_group_manager_dt_ids[] = {
{ .compatible = "arm,physical-memory-group-manager" },
{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, memory_group_manager_dt_ids);
static struct platform_driver
memory_group_manager_driver = { .probe = memory_group_manager_probe,
.remove = memory_group_manager_remove,
.driver = {
.name = "physical-memory-group-manager",
.of_match_table =
of_match_ptr(memory_group_manager_dt_ids),
/*
* Prevent the mgm_dev from being unbound and freed, as others
* may hold pointers to it and would get confused, or crash, if
* it suddenly disappeared.
*/
.suppress_bind_attrs = true,
} };
module_platform_driver(memory_group_manager_driver);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("ARM Ltd.");
MODULE_VERSION("1.0");
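
The PTE handling in example_mgm_update_gpu_pte() and example_mgm_pte_to_original_pte() above can be sketched in userspace as follows. This is a minimal illustration, not part of the driver: the helper names encode_pte() and decode_pte() and the NR_GROUPS value of 16 are assumptions made for the sketch; only the shift and mask values are taken from the source. Unlike the driver, which returns the PTE unchanged for an out-of-range group, the sketch asserts on it.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the driver source above. */
#define PTE_PBHA_SHIFT 59
#define PTE_PBHA_MASK ((uint64_t)0xf << PTE_PBHA_SHIFT)
#define PTE_RES_BIT_MULTI_AS_SHIFT 63

/* Assumption for this sketch; the driver uses MEMORY_GROUP_MANAGER_NR_GROUPS. */
#define NR_GROUPS 16u

/* Hypothetical helper: tag a PTE with a group ID (bits 59..62) and set bit 63. */
static uint64_t encode_pte(uint64_t pte, unsigned int group_id)
{
	assert(group_id < NR_GROUPS);
	pte |= ((uint64_t)group_id << PTE_PBHA_SHIFT) & PTE_PBHA_MASK;
	pte |= (uint64_t)1 << PTE_RES_BIT_MULTI_AS_SHIFT;
	return pte;
}

/* Hypothetical helper: clear the group ID field and bit 63 again. */
static uint64_t decode_pte(uint64_t pte)
{
	pte &= ~PTE_PBHA_MASK;
	pte &= ~((uint64_t)1 << PTE_RES_BIT_MULTI_AS_SHIFT);
	return pte;
}
```

Note that the round trip restores the input only when bits 59 to 63 were clear to begin with, because decoding clears them unconditionally.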


@@ -0,0 +1,23 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
ifeq ($(CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR), y)
obj-m := protected_memory_allocator.o
endif


@@ -0,0 +1,36 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
bob_kernel_module {
name: "protected_memory_allocator",
defaults: [
"kernel_defaults",
],
srcs: [
"Kbuild",
"protected_memory_allocator.c",
],
enabled: false,
mali_protected_memory_allocator: {
kbuild_options: ["CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR=y"],
enabled: true,
},
}


@@ -0,0 +1,543 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <linux/version.h>
#include <linux/of.h>
#include <linux/of_reserved_mem.h>
#include <linux/platform_device.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/mm.h>
#include <linux/io.h>
#include <linux/protected_memory_allocator.h>
/* Size of a bitfield element in bytes */
#define BITFIELD_ELEM_SIZE sizeof(u64)
/* We can track whether or not 64 pages are currently allocated in a u64 */
#define PAGES_PER_BITFIELD_ELEM (BITFIELD_ELEM_SIZE * BITS_PER_BYTE)
/* Order 6 (ie, 64) corresponds to the number of pages held in a bitfield */
#define ORDER_OF_PAGES_PER_BITFIELD_ELEM 6
/**
* struct simple_pma_device - Simple implementation of a protected memory
* allocator device
* @pma_dev: Protected memory allocator device pointer
* @dev: Device pointer
* @allocated_pages_bitfield_arr: Status of all the physical memory pages within the
* protected memory region, one bit per page
* @rmem_base: Base physical address of the reserved memory region
* @rmem_size: Size of the reserved memory region, in pages
* @num_free_pages: Number of free pages in the memory region
* @rmem_lock: Lock to serialize the allocation and freeing of
* physical pages from the protected memory region
*/
struct simple_pma_device {
struct protected_memory_allocator_device pma_dev;
struct device *dev;
u64 *allocated_pages_bitfield_arr;
phys_addr_t rmem_base;
size_t rmem_size;
size_t num_free_pages;
spinlock_t rmem_lock;
};
/**
* ALLOC_PAGES_BITFIELD_ARR_SIZE() - Number of elements in array
* 'allocated_pages_bitfield_arr'
* If the number of pages required does not divide exactly by
* PAGES_PER_BITFIELD_ELEM, adds an extra element for the remainder.
* @num_pages: number of pages
*/
#define ALLOC_PAGES_BITFIELD_ARR_SIZE(num_pages) \
((PAGES_PER_BITFIELD_ELEM * (0 != ((num_pages) % PAGES_PER_BITFIELD_ELEM)) + (num_pages)) / \
PAGES_PER_BITFIELD_ELEM)
/**
* small_granularity_alloc() - Allocate 1-32 power-of-two pages.
* @epma_dev: protected memory allocator device structure.
* @alloc_bitfield_idx: index of the relevant bitfield.
* @start_bit: starting bitfield index.
* @order: bitshift for number of pages. Order of 0 to 5 equals 1 to 32 pages.
* @pma: protected_memory_allocation struct.
*
* Allocate 2^order pages, where
* 0 <= order <= ORDER_OF_PAGES_PER_BITFIELD_ELEM - 1, i.e. up to 32 pages. The routine
* fills in a pma structure and sets the appropriate bits in the allocated-pages
* bitfield array, but assumes the caller has already determined that these bits
* are clear.
*
* This routine always works within only a single allocated-pages bitfield element.
* It can be thought of as the 'small-granularity' allocator.
*/
static void small_granularity_alloc(struct simple_pma_device *const epma_dev,
size_t alloc_bitfield_idx, size_t start_bit, size_t order,
struct protected_memory_allocation *pma)
{
size_t i;
size_t page_idx;
u64 *bitfield;
size_t alloc_pages_bitfield_size;
if (WARN_ON(!epma_dev) || WARN_ON(!pma))
return;
WARN(epma_dev->rmem_size == 0, "%s: rmem_size is 0", __func__);
alloc_pages_bitfield_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
WARN(alloc_bitfield_idx >= alloc_pages_bitfield_size, "%s: idx>bf_size: %zu %zu", __func__,
alloc_bitfield_idx, alloc_pages_bitfield_size);
WARN((start_bit + (1ULL << order)) > PAGES_PER_BITFIELD_ELEM,
"%s: start=%zu order=%zu ppbe=%zu", __func__, start_bit, order,
PAGES_PER_BITFIELD_ELEM);
bitfield = &epma_dev->allocated_pages_bitfield_arr[alloc_bitfield_idx];
for (i = 0; i < (1ULL << order); i++) {
/* Check the pages represented by this bit are actually free */
WARN(*bitfield & (1ULL << (start_bit + i)),
"in %s: page not free: %zu %zu %.16llx %zu\n", __func__, i, order, *bitfield,
alloc_pages_bitfield_size);
/* Mark the pages as now allocated */
*bitfield |= (1ULL << (start_bit + i));
}
/* Compute the page index */
page_idx = (alloc_bitfield_idx * PAGES_PER_BITFIELD_ELEM) + start_bit;
/* Fill-in the allocation struct for the caller */
pma->pa = epma_dev->rmem_base + (page_idx << PAGE_SHIFT);
pma->order = order;
}
/**
* large_granularity_alloc() - Allocate pages at multiples of 64 pages.
* @epma_dev: protected memory allocator device structure.
* @start_alloc_bitfield_idx: index of the starting bitfield.
* @order: bitshift for number of pages. Order of 6+ equals 64+ pages.
* @pma: protected_memory_allocation struct.
*
* Allocate 2^order pages, where
* order >= ORDER_OF_PAGES_PER_BITFIELD_ELEM, i.e. 64 pages or more. The routine fills in
* a pma structure and sets the appropriate bits in the allocated-pages bitfield array,
* but assumes the caller has already determined that these bits are clear.
*
* Unlike small_granularity_alloc, this routine can work with multiple 64-page groups,
* ie multiple elements from the allocated-pages bitfield array. However, it always
* works with complete sets of these 64-page groups. It can therefore be thought of
* as the 'large-granularity' allocator.
*/
static void large_granularity_alloc(struct simple_pma_device *const epma_dev,
size_t start_alloc_bitfield_idx, size_t order,
struct protected_memory_allocation *pma)
{
size_t i;
size_t num_pages_to_alloc = (size_t)1 << order;
size_t num_bitfield_elements_needed = num_pages_to_alloc / PAGES_PER_BITFIELD_ELEM;
size_t start_page_idx = start_alloc_bitfield_idx * PAGES_PER_BITFIELD_ELEM;
if (WARN_ON(!epma_dev) || WARN_ON(!pma))
return;
/*
* Are there enough bitfield array elements (groups of 64 pages)
* between the start element and the end of the bitfield array
* to fulfill the request?
*/
WARN((start_alloc_bitfield_idx + num_bitfield_elements_needed) >
ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size),
"%s: start=%zu order=%zu ms=%zu", __func__, start_alloc_bitfield_idx, order,
epma_dev->rmem_size);
for (i = 0; i < num_bitfield_elements_needed; i++) {
u64 *bitfield =
&epma_dev->allocated_pages_bitfield_arr[start_alloc_bitfield_idx + i];
/* We expect all pages that relate to this bitfield element to be free */
WARN((*bitfield != 0), "in %s: pages not free: i=%zu o=%zu bf=%.16llx\n", __func__,
i, order, *bitfield);
/* Mark all the pages for this element as not free */
*bitfield = ~0ULL;
}
/* Fill-in the allocation struct for the caller */
pma->pa = epma_dev->rmem_base + (start_page_idx << PAGE_SHIFT);
pma->order = order;
}
static struct protected_memory_allocation *
simple_pma_alloc_page(struct protected_memory_allocator_device *pma_dev, unsigned int order)
{
struct simple_pma_device *const epma_dev =
container_of(pma_dev, struct simple_pma_device, pma_dev);
struct protected_memory_allocation *pma;
size_t num_pages_to_alloc;
u64 *bitfields = epma_dev->allocated_pages_bitfield_arr;
size_t i;
size_t bit;
size_t count;
dev_dbg(epma_dev->dev, "%s(pma_dev=%px, order=%u)\n", __func__, (void *)pma_dev, order);
/* This is an example function that follows an extremely simple logic
* and is very likely to fail to allocate memory if put under stress.
*
* The simple_pma_device maintains an array of u64s, with one bit used
* to track the status of each page.
*
* In order to create a memory allocation, the allocator looks for an
* adjacent group of cleared bits. This does leave the algorithm open
* to fragmentation issues, but is deemed sufficient for now.
* If successful, the allocator shall mark all the pages as allocated
* and increment the offset accordingly.
*
* Allocations of 64 pages or more (order 6) can be allocated only with
* 64-page alignment, in order to keep the algorithm as simple as
* possible. ie, starting from bit 0 of any 64-bit page-allocation
* bitfield. For this, the large-granularity allocator is utilised.
*
* Allocations of lower-order can only be allocated entirely within the
* same group of 64 pages, with the small-granularity allocator (ie
* always from the same 64-bit page-allocation bitfield) - again, to
* keep things as simple as possible, but flexible to meet
* current needs.
*/
num_pages_to_alloc = (size_t)1 << order;
pma = devm_kzalloc(epma_dev->dev, sizeof(*pma), GFP_KERNEL);
if (!pma) {
dev_err(epma_dev->dev, "Failed to alloc pma struct\n");
return NULL;
}
spin_lock(&epma_dev->rmem_lock);
if (epma_dev->num_free_pages < num_pages_to_alloc) {
dev_err(epma_dev->dev, "not enough free pages\n");
devm_kfree(epma_dev->dev, pma);
spin_unlock(&epma_dev->rmem_lock);
return NULL;
}
/*
* For order 0-5 (ie, 1 to 32 pages) we always allocate within the same set of 64 pages
* Currently, most allocations will be very small (1 page), so the more likely path
* here is order < ORDER_OF_PAGES_PER_BITFIELD_ELEM.
*/
if (likely(order < ORDER_OF_PAGES_PER_BITFIELD_ELEM)) {
size_t alloc_pages_bitmap_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
for (i = 0; i < alloc_pages_bitmap_size; i++) {
count = 0;
for (bit = 0; bit < PAGES_PER_BITFIELD_ELEM; bit++) {
if (0 == (bitfields[i] & (1ULL << bit))) {
if ((count + 1) >= num_pages_to_alloc) {
/*
* We've found enough free, consecutive pages with which to
* make an allocation
*/
small_granularity_alloc(epma_dev, i, bit - count,
order, pma);
epma_dev->num_free_pages -= num_pages_to_alloc;
spin_unlock(&epma_dev->rmem_lock);
return pma;
}
/* So far so good, but we need more clear bits yet */
count++;
} else {
/*
* We found an allocated page, so nothing we've seen so far can be used.
* Keep looking.
*/
count = 0;
}
}
}
} else {
/*
* For allocations of order ORDER_OF_PAGES_PER_BITFIELD_ELEM and above (>= 64 pages), we know
* we'll only get allocations for whole groups of 64 pages, which hugely simplifies the task.
*/
size_t alloc_pages_bitmap_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
/* How many 64-bit bitfield elements will be needed for the allocation? */
size_t num_bitfield_elements_needed = num_pages_to_alloc / PAGES_PER_BITFIELD_ELEM;
count = 0;
for (i = 0; i < alloc_pages_bitmap_size; i++) {
/* Are all the pages free for the i'th u64 bitfield element? */
if (bitfields[i] == 0) {
count += PAGES_PER_BITFIELD_ELEM;
if (count >= (1ULL << order)) {
size_t start_idx = (i + 1) - num_bitfield_elements_needed;
large_granularity_alloc(epma_dev, start_idx, order, pma);
epma_dev->num_free_pages -= 1ULL << order;
spin_unlock(&epma_dev->rmem_lock);
return pma;
}
} else {
count = 0;
}
}
}
spin_unlock(&epma_dev->rmem_lock);
devm_kfree(epma_dev->dev, pma);
dev_err(epma_dev->dev,
"not enough contiguous pages (need %zu), total free pages left %zu\n",
num_pages_to_alloc, epma_dev->num_free_pages);
return NULL;
}
static phys_addr_t simple_pma_get_phys_addr(struct protected_memory_allocator_device *pma_dev,
struct protected_memory_allocation *pma)
{
struct simple_pma_device *const epma_dev =
container_of(pma_dev, struct simple_pma_device, pma_dev);
dev_dbg(epma_dev->dev, "%s(pma_dev=%px, pma=%px, pa=%pK)\n", __func__, (void *)pma_dev,
(void *)pma, (void *)pma->pa);
return pma->pa;
}
static void simple_pma_free_page(struct protected_memory_allocator_device *pma_dev,
struct protected_memory_allocation *pma)
{
struct simple_pma_device *const epma_dev =
container_of(pma_dev, struct simple_pma_device, pma_dev);
size_t num_pages_in_allocation;
size_t offset;
size_t i;
size_t bitfield_idx;
size_t bitfield_start_bit;
size_t page_num;
u64 *bitfield;
size_t alloc_pages_bitmap_size;
size_t num_bitfield_elems_used_by_alloc;
if (WARN_ON(pma == NULL))
return;
dev_dbg(epma_dev->dev, "%s(pma_dev=%px, pma=%px, pa=%pK)\n", __func__, (void *)pma_dev,
(void *)pma, (void *)pma->pa);
WARN_ON(pma->pa < epma_dev->rmem_base);
/* This is an example function that follows an extremely simple logic
* and is vulnerable to abuse.
*/
offset = (pma->pa - epma_dev->rmem_base);
num_pages_in_allocation = (size_t)1 << pma->order;
/* The number of bitfield elements used by the allocation */
num_bitfield_elems_used_by_alloc = num_pages_in_allocation / PAGES_PER_BITFIELD_ELEM;
/* The page number of the first page of the allocation, relative to rmem_base */
page_num = offset >> PAGE_SHIFT;
/* Which u64 bitfield refers to this page? */
bitfield_idx = page_num / PAGES_PER_BITFIELD_ELEM;
alloc_pages_bitmap_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
/* Is the allocation within expected bounds? */
WARN_ON((bitfield_idx + num_bitfield_elems_used_by_alloc) >= alloc_pages_bitmap_size);
spin_lock(&epma_dev->rmem_lock);
if (pma->order < ORDER_OF_PAGES_PER_BITFIELD_ELEM) {
bitfield = &epma_dev->allocated_pages_bitfield_arr[bitfield_idx];
/* Which bit within that u64 bitfield is the lsb covering this allocation? */
bitfield_start_bit = page_num % PAGES_PER_BITFIELD_ELEM;
/* Clear the bits for the pages we're now freeing */
*bitfield &= ~(((1ULL << num_pages_in_allocation) - 1) << bitfield_start_bit);
} else {
WARN(page_num % PAGES_PER_BITFIELD_ELEM,
"%s: Expecting allocs of order >= %d to be %zu-page aligned\n", __func__,
ORDER_OF_PAGES_PER_BITFIELD_ELEM, PAGES_PER_BITFIELD_ELEM);
for (i = 0; i < num_bitfield_elems_used_by_alloc; i++) {
bitfield = &epma_dev->allocated_pages_bitfield_arr[bitfield_idx + i];
/* We expect all bits to be set (all pages allocated) */
WARN((*bitfield != ~0ULL),
"%s: alloc being freed is not fully allocated: of=%zu np=%zu bf=%.16llx\n",
__func__, offset, num_pages_in_allocation, *bitfield);
/*
* Now clear all the bits in the bitfield element to mark all the pages
* it refers to as free.
*/
*bitfield = 0ULL;
}
}
epma_dev->num_free_pages += num_pages_in_allocation;
spin_unlock(&epma_dev->rmem_lock);
devm_kfree(epma_dev->dev, pma);
}
static int protected_memory_allocator_probe(struct platform_device *pdev)
{
struct simple_pma_device *epma_dev;
struct device_node *np;
phys_addr_t rmem_base;
size_t rmem_size;
size_t alloc_bitmap_pages_arr_size;
#if (KERNEL_VERSION(4, 15, 0) <= LINUX_VERSION_CODE)
struct reserved_mem *rmem;
#endif
np = pdev->dev.of_node;
if (!np) {
dev_err(&pdev->dev, "device node pointer not set\n");
return -ENODEV;
}
np = of_parse_phandle(np, "memory-region", 0);
if (!np) {
dev_err(&pdev->dev, "memory-region node not set\n");
return -ENODEV;
}
#if (KERNEL_VERSION(4, 15, 0) <= LINUX_VERSION_CODE)
rmem = of_reserved_mem_lookup(np);
if (rmem) {
rmem_base = rmem->base;
rmem_size = rmem->size >> PAGE_SHIFT;
} else
#endif
{
of_node_put(np);
dev_err(&pdev->dev, "could not read reserved memory-region\n");
return -ENODEV;
}
of_node_put(np);
epma_dev = devm_kzalloc(&pdev->dev, sizeof(*epma_dev), GFP_KERNEL);
if (!epma_dev)
return -ENOMEM;
epma_dev->pma_dev.ops.pma_alloc_page = simple_pma_alloc_page;
epma_dev->pma_dev.ops.pma_get_phys_addr = simple_pma_get_phys_addr;
epma_dev->pma_dev.ops.pma_free_page = simple_pma_free_page;
epma_dev->pma_dev.owner = THIS_MODULE;
epma_dev->dev = &pdev->dev;
epma_dev->rmem_base = rmem_base;
epma_dev->rmem_size = rmem_size;
epma_dev->num_free_pages = rmem_size;
spin_lock_init(&epma_dev->rmem_lock);
alloc_bitmap_pages_arr_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
epma_dev->allocated_pages_bitfield_arr = devm_kzalloc(
&pdev->dev, alloc_bitmap_pages_arr_size * BITFIELD_ELEM_SIZE, GFP_KERNEL);
if (!epma_dev->allocated_pages_bitfield_arr) {
dev_err(&pdev->dev, "failed to allocate resources\n");
devm_kfree(&pdev->dev, epma_dev);
return -ENOMEM;
}
if (epma_dev->rmem_size % PAGES_PER_BITFIELD_ELEM) {
size_t extra_pages =
alloc_bitmap_pages_arr_size * PAGES_PER_BITFIELD_ELEM - epma_dev->rmem_size;
size_t last_bitfield_index = alloc_bitmap_pages_arr_size - 1;
/* Mark the extra pages (that lie outside the reserved range) as
* always in use.
*/
epma_dev->allocated_pages_bitfield_arr[last_bitfield_index] =
((1ULL << extra_pages) - 1) << (PAGES_PER_BITFIELD_ELEM - extra_pages);
}
platform_set_drvdata(pdev, &epma_dev->pma_dev);
dev_info(&pdev->dev, "Protected memory allocator probed successfully\n");
dev_info(&pdev->dev, "Protected memory region: base=%pK num pages=%zu\n", (void *)rmem_base,
rmem_size);
return 0;
}
static int protected_memory_allocator_remove(struct platform_device *pdev)
{
struct protected_memory_allocator_device *pma_dev = platform_get_drvdata(pdev);
struct simple_pma_device *epma_dev;
struct device *dev;
if (!pma_dev)
return -EINVAL;
epma_dev = container_of(pma_dev, struct simple_pma_device, pma_dev);
dev = epma_dev->dev;
if (epma_dev->num_free_pages < epma_dev->rmem_size) {
dev_warn(&pdev->dev, "Leaking %zu pages of protected memory\n",
epma_dev->rmem_size - epma_dev->num_free_pages);
}
platform_set_drvdata(pdev, NULL);
devm_kfree(dev, epma_dev->allocated_pages_bitfield_arr);
devm_kfree(dev, epma_dev);
dev_info(&pdev->dev, "Protected memory allocator removed successfully\n");
return 0;
}
static const struct of_device_id protected_memory_allocator_dt_ids[] = {
{ .compatible = "arm,protected-memory-allocator" },
{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, protected_memory_allocator_dt_ids);
static struct platform_driver
protected_memory_allocator_driver = { .probe = protected_memory_allocator_probe,
.remove = protected_memory_allocator_remove,
.driver = {
.name = "simple_protected_memory_allocator",
.of_match_table = of_match_ptr(
protected_memory_allocator_dt_ids),
} };
module_platform_driver(protected_memory_allocator_driver);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("ARM Ltd.");
MODULE_VERSION("1.0");
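
The bitfield bookkeeping used by the allocator above can be sketched in userspace as follows. This is a minimal illustration under stated assumptions, not the driver's code: the helper names bitfield_arr_size(), tail_mask() and find_free_run() are hypothetical, and the sketch uses a plain ceiling division where the driver uses ALLOC_PAGES_BITFIELD_ARR_SIZE(). It shows how many 64-bit elements a region needs, which bits of the last element are pre-set at probe time, and how a first-fit search finds a run of free pages inside one element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGES_PER_ELEM 64u /* one bit per page in a 64-bit element */

/* Hypothetical helper: number of 64-bit elements needed to cover num_pages. */
static size_t bitfield_arr_size(size_t num_pages)
{
	return (num_pages + PAGES_PER_ELEM - 1) / PAGES_PER_ELEM;
}

/*
 * Hypothetical helper: initial value of the last element, with the bits that
 * lie beyond num_pages pre-set so those pages can never be handed out.
 */
static uint64_t tail_mask(size_t num_pages)
{
	size_t used = num_pages % PAGES_PER_ELEM;

	if (used == 0)
		return 0; /* the last element is fully backed by real pages */
	return ~(uint64_t)0 << used;
}

/*
 * Hypothetical helper: first-fit search for a run of num_pages clear bits
 * inside one element. Returns the start bit, or -1 if no run fits.
 */
static int find_free_run(uint64_t bitfield, size_t num_pages)
{
	size_t bit, count = 0;

	assert(num_pages >= 1 && num_pages <= PAGES_PER_ELEM);
	for (bit = 0; bit < PAGES_PER_ELEM; bit++) {
		if (bitfield & ((uint64_t)1 << bit)) {
			count = 0; /* allocated page: restart the run */
			continue;
		}
		if (++count == num_pages)
			return (int)(bit + 1 - num_pages);
	}
	return -1;
}
```

For a 130-page region, for example, three elements are needed and the last one starts with every bit except bits 0 and 1 set, which matches the mask the probe function computes from extra_pages.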


@@ -5,3 +5,4 @@
obj-y += host1x/ drm/ vga/
obj-$(CONFIG_IMX_IPUV3_CORE) += ipu-v3/
obj-$(CONFIG_TRACE_GPU_MEM) += trace/
obj-$(CONFIG_MALI_MIDGARD) += arm/


@@ -0,0 +1,60 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
load(
"//build/kernel/kleaf:kernel.bzl",
"kernel_module",
)
filegroup(
name = "gpu_kconfig",
srcs = glob([
"**/*Kconfig",
]),
visibility = [
"//common:__pkg__",
"//common-modules/mali:__subpackages__",
],
)
_gpu_modules = []
kernel_module(
name = "gpu",
srcs = glob([
"**/*.c",
"**/*.h",
"**/*.S",
"**/*Kbuild",
"**/*Makefile",
]) + [
"//common:kernel_headers",
"//common-modules/mali:headers",
],
outs = _gpu_modules,
kernel_build = "//common:kernel_aarch64",
visibility = [
"//common:__pkg__",
"//common-modules/mali:__subpackages__",
],
deps = [
"//common-modules/mali/drivers/base/arm:base",
],
)

21
drivers/gpu/arm/Kbuild Normal file

@@ -0,0 +1,21 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012, 2020-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
obj-$(CONFIG_MALI_MIDGARD) += midgard/

23
drivers/gpu/arm/Kconfig Normal file

@@ -0,0 +1,23 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
menu "ARM GPU Configuration"
source "$(MALI_KCONFIG_EXT_PREFIX)drivers/gpu/arm/midgard/Kconfig"
endmenu

23
drivers/gpu/arm/Makefile Normal file

@@ -0,0 +1,23 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2021-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
THIS_DIR := $(dir $(lastword $(MAKEFILE_LIST)))
include $(THIS_DIR)midgard/Makefile


@@ -0,0 +1,248 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
# Make $(src) an absolute path, if it is not one already, by prefixing
# $(srctree). This prevents build issues caused by a wrong relative path.
src:=$(if $(patsubst /%,,$(src)),$(srctree)/$(src),$(src))
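# Illustrative sketch only (the paths below are hypothetical, not taken from
# this tree). Assuming the usual GNU Make semantics of $(patsubst) and $(if):
#   src = drivers/gpu/arm/midgard  ->  $(patsubst /%,,$(src)) is non-empty,
#       so src becomes $(srctree)/drivers/gpu/arm/midgard
#   src = /work/mali/midgard       ->  $(patsubst /%,,$(src)) is empty,
#       so src is left unchanged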
#
# Prevent misuse when Kernel configurations are not present by default
# in out-of-tree builds
#
ifneq ($(CONFIG_ANDROID),n)
ifeq ($(CONFIG_GPU_TRACEPOINTS),n)
$(error CONFIG_GPU_TRACEPOINTS must be set in Kernel configuration)
endif
endif
ifeq ($(CONFIG_DMA_SHARED_BUFFER),n)
$(error CONFIG_DMA_SHARED_BUFFER must be set in Kernel configuration)
endif
ifeq ($(CONFIG_PM_DEVFREQ),n)
$(error CONFIG_PM_DEVFREQ must be set in Kernel configuration)
endif
ifeq ($(CONFIG_DEVFREQ_THERMAL),n)
$(error CONFIG_DEVFREQ_THERMAL must be set in Kernel configuration)
endif
ifeq ($(CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND),n)
$(error CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND must be set in Kernel configuration)
endif
ifeq ($(CONFIG_FW_LOADER), n)
$(error CONFIG_FW_LOADER must be set in Kernel configuration)
endif
ifeq ($(CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS), y)
ifneq ($(CONFIG_DEBUG_FS), y)
$(error CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS depends on CONFIG_DEBUG_FS to be set in Kernel configuration)
endif
endif
ifeq ($(CONFIG_MALI_FENCE_DEBUG), y)
ifneq ($(CONFIG_SYNC_FILE), y)
$(error CONFIG_MALI_FENCE_DEBUG depends on CONFIG_SYNC_FILE to be set in Kernel configuration)
endif
endif
#
# Configurations
#
# Driver version string which is returned to userspace via an ioctl
MALI_RELEASE_NAME ?= '"r47p0-01eac0"'
# Set up defaults if not defined by build system
ifeq ($(CONFIG_MALI_DEBUG), y)
MALI_UNIT_TEST = 1
MALI_CUSTOMER_RELEASE ?= 0
else
MALI_UNIT_TEST ?= 0
MALI_CUSTOMER_RELEASE ?= 1
endif
MALI_COVERAGE ?= 0
# Kconfig passes in the name with quotes for in-tree builds - remove them.
MALI_PLATFORM_DIR := $(shell echo $(CONFIG_MALI_PLATFORM_NAME))
ifeq ($(CONFIG_MALI_CSF_SUPPORT),y)
MALI_JIT_PRESSURE_LIMIT_BASE = 0
MALI_USE_CSF = 1
else
MALI_JIT_PRESSURE_LIMIT_BASE ?= 1
MALI_USE_CSF ?= 0
endif
ifneq ($(CONFIG_MALI_KUTF), n)
MALI_KERNEL_TEST_API ?= 1
else
MALI_KERNEL_TEST_API ?= 0
endif
# Experimental features (corresponding -D definition should be appended to
# ccflags-y below, e.g. for MALI_EXPERIMENTAL_FEATURE,
# -DMALI_EXPERIMENTAL_FEATURE=$(MALI_EXPERIMENTAL_FEATURE) should be appended)
#
# Experimental features must default to disabled, e.g.:
# MALI_EXPERIMENTAL_FEATURE ?= 0
MALI_INCREMENTAL_RENDERING_JM ?= 0
#
# ccflags
#
ccflags-y = \
-DMALI_CUSTOMER_RELEASE=$(MALI_CUSTOMER_RELEASE) \
-DMALI_USE_CSF=$(MALI_USE_CSF) \
-DMALI_KERNEL_TEST_API=$(MALI_KERNEL_TEST_API) \
-DMALI_UNIT_TEST=$(MALI_UNIT_TEST) \
-DMALI_COVERAGE=$(MALI_COVERAGE) \
-DMALI_RELEASE_NAME=$(MALI_RELEASE_NAME) \
-DMALI_JIT_PRESSURE_LIMIT_BASE=$(MALI_JIT_PRESSURE_LIMIT_BASE) \
-DMALI_INCREMENTAL_RENDERING_JM=$(MALI_INCREMENTAL_RENDERING_JM) \
-DMALI_PLATFORM_DIR=$(MALI_PLATFORM_DIR)
ifeq ($(KBUILD_EXTMOD),)
# in-tree
ccflags-y +=-DMALI_KBASE_PLATFORM_PATH=../../$(src)/platform/$(CONFIG_MALI_PLATFORM_NAME)
else
# out-of-tree
ccflags-y +=-DMALI_KBASE_PLATFORM_PATH=$(src)/platform/$(CONFIG_MALI_PLATFORM_NAME)
endif
ccflags-y += \
-I$(srctree)/include/linux \
-I$(srctree)/drivers/staging/android \
-I$(src) \
-I$(src)/platform/$(MALI_PLATFORM_DIR) \
-I$(src)/../../../base \
-I$(src)/../../../../include
subdir-ccflags-y += $(ccflags-y)
#
# Kernel Modules
#
obj-$(CONFIG_MALI_MIDGARD) += mali_kbase.o
obj-$(CONFIG_MALI_ARBITRATION) += ../arbitration/
obj-$(CONFIG_MALI_KUTF) += tests/
mali_kbase-y := \
mali_kbase_cache_policy.o \
mali_kbase_ccswe.o \
mali_kbase_mem.o \
mali_kbase_reg_track.o \
mali_kbase_mem_migrate.o \
mali_kbase_mem_pool_group.o \
mali_kbase_native_mgm.o \
mali_kbase_ctx_sched.o \
mali_kbase_gpuprops.o \
mali_kbase_pm.o \
mali_kbase_config.o \
mali_kbase_kinstr_prfcnt.o \
mali_kbase_softjobs.o \
mali_kbase_hw.o \
mali_kbase_debug.o \
mali_kbase_gpu_memory_debugfs.o \
mali_kbase_mem_linux.o \
mali_kbase_core_linux.o \
mali_kbase_mem_profile_debugfs.o \
mali_kbase_disjoint_events.o \
mali_kbase_debug_mem_view.o \
mali_kbase_debug_mem_zones.o \
mali_kbase_debug_mem_allocs.o \
mali_kbase_smc.o \
mali_kbase_mem_pool.o \
mali_kbase_mem_pool_debugfs.o \
mali_kbase_debugfs_helper.o \
mali_kbase_as_fault_debugfs.o \
mali_kbase_regs_history_debugfs.o \
mali_kbase_dvfs_debugfs.o \
mali_power_gpu_frequency_trace.o \
mali_kbase_trace_gpu_mem.o \
mali_kbase_pbha.o
mali_kbase-$(CONFIG_DEBUG_FS) += mali_kbase_pbha_debugfs.o
mali_kbase-$(CONFIG_MALI_CINSTR_GWT) += mali_kbase_gwt.o
mali_kbase-$(CONFIG_SYNC_FILE) += \
mali_kbase_fence_ops.o \
mali_kbase_sync_file.o \
mali_kbase_sync_common.o
mali_kbase-$(CONFIG_MALI_TRACE_POWER_GPU_WORK_PERIOD) += \
mali_power_gpu_work_period_trace.o \
mali_kbase_gpu_metrics.o
ifneq ($(CONFIG_MALI_CSF_SUPPORT),y)
mali_kbase-y += \
mali_kbase_jm.o \
mali_kbase_dummy_job_wa.o \
mali_kbase_debug_job_fault.o \
mali_kbase_event.o \
mali_kbase_jd.o \
mali_kbase_jd_debugfs.o \
mali_kbase_js.o \
mali_kbase_js_ctx_attr.o \
mali_kbase_kinstr_jm.o
mali_kbase-$(CONFIG_SYNC_FILE) += \
mali_kbase_fence_ops.o \
mali_kbase_fence.o
endif
INCLUDE_SUBDIR = \
$(src)/context/Kbuild \
$(src)/debug/Kbuild \
$(src)/device/Kbuild \
$(src)/backend/gpu/Kbuild \
$(src)/mmu/Kbuild \
$(src)/tl/Kbuild \
$(src)/hwcnt/Kbuild \
$(src)/gpu/Kbuild \
$(src)/hw_access/Kbuild \
$(src)/thirdparty/Kbuild \
$(src)/platform/$(MALI_PLATFORM_DIR)/Kbuild
ifeq ($(CONFIG_MALI_CSF_SUPPORT),y)
INCLUDE_SUBDIR += $(src)/csf/Kbuild
endif
ifeq ($(CONFIG_MALI_ARBITER_SUPPORT),y)
INCLUDE_SUBDIR += $(src)/arbiter/Kbuild
endif
ifeq ($(CONFIG_MALI_DEVFREQ),y)
ifeq ($(CONFIG_DEVFREQ_THERMAL),y)
INCLUDE_SUBDIR += $(src)/ipa/Kbuild
endif
endif
ifeq ($(KBUILD_EXTMOD),)
# in-tree
-include $(INCLUDE_SUBDIR)
else
# out-of-tree
include $(INCLUDE_SUBDIR)
endif


@@ -0,0 +1,402 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
menuconfig MALI_MIDGARD
tristate "Mali Midgard series support"
select GPU_TRACEPOINTS if ANDROID
select DMA_SHARED_BUFFER
select PM_DEVFREQ
select DEVFREQ_THERMAL
select FW_LOADER
default n
help
Enable this option to build support for an ARM Mali Midgard GPU.
To compile this driver as a module, choose M here:
this will generate a single module, called mali_kbase.
if MALI_MIDGARD
config MALI_PLATFORM_NAME
depends on MALI_MIDGARD
string "Platform name"
default "devicetree"
help
Enter the name of the desired platform configuration directory to
include in the build. 'platform/$(MALI_PLATFORM_NAME)/Kbuild' must
exist.
choice
prompt "Mali HW backend"
depends on MALI_MIDGARD
default MALI_REAL_HW
config MALI_REAL_HW
bool "Enable build of Mali kernel driver for real HW"
depends on MALI_MIDGARD
help
This is the default HW backend.
config MALI_NO_MALI
bool "Enable build of Mali kernel driver for No Mali"
depends on MALI_MIDGARD && MALI_EXPERT
help
This can be used to test the driver in a simulated environment
whereby the hardware is not physically present. If the hardware is physically
present it will not be used. This can be used to test the majority of the
driver without needing actual hardware or for software benchmarking.
All calls to the simulated hardware will complete immediately as if the hardware
completed the task.
config MALI_NO_MALI_DEFAULT_GPU
string "Default GPU for No Mali"
depends on MALI_NO_MALI
default "tMIx"
help
This option sets the default GPU to identify as for No Mali builds.
endchoice
menu "Platform specific options"
source "$(MALI_KCONFIG_EXT_PREFIX)drivers/gpu/arm/midgard/platform/Kconfig"
endmenu
config MALI_CSF_SUPPORT
bool "Enable Mali CSF based GPU support"
depends on MALI_MIDGARD=m
default n
help
Enables support for CSF based GPUs.
config MALI_DEVFREQ
bool "Enable devfreq support for Mali"
depends on MALI_MIDGARD && PM_DEVFREQ
select DEVFREQ_GOV_SIMPLE_ONDEMAND
default y
help
Support devfreq for Mali.
Using the devfreq framework and, by default, the simple on-demand
governor, the frequency of Mali will be dynamically selected from the
available OPPs.
config MALI_MIDGARD_DVFS
bool "Enable legacy DVFS"
depends on MALI_MIDGARD && !MALI_DEVFREQ
default n
help
Choose this option to enable legacy DVFS in the Mali Midgard DDK.
config MALI_GATOR_SUPPORT
bool "Enable Streamline tracing support"
depends on MALI_MIDGARD
default y
help
Enables kbase tracing used by the Arm Streamline Performance Analyzer.
The tracepoints are used to derive GPU activity charts in Streamline.
config MALI_MIDGARD_ENABLE_TRACE
bool "Enable kbase tracing"
depends on MALI_MIDGARD
default y if MALI_DEBUG
default n
help
Enables tracing in kbase. The trace log is available through
the "mali_trace" debugfs file when CONFIG_DEBUG_FS is enabled.
config MALI_ARBITER_SUPPORT
bool "Enable arbiter support for Mali"
depends on MALI_MIDGARD
default n
help
Enable support for the arbiter interface in the driver.
This allows an external arbiter to manage driver access
to GPU hardware in a virtualized environment.
If unsure, say N.
config MALI_DMA_BUF_MAP_ON_DEMAND
bool "Enable map imported dma-bufs on demand"
depends on MALI_MIDGARD
default n
help
This option will cause kbase to set up the GPU mapping of imported
dma-buf when needed to run atoms. This is the legacy behavior.
This is intended for testing and the option will get removed in the
future.
config MALI_DMA_BUF_LEGACY_COMPAT
bool "Enable legacy compatibility cache flush on dma-buf map"
depends on MALI_MIDGARD && !MALI_DMA_BUF_MAP_ON_DEMAND
default n
help
This option enables compatibility with legacy dma-buf mapping
behavior when the dma-buf is mapped on import, by adding cache
maintenance where MALI_DMA_BUF_MAP_ON_DEMAND would do the mapping,
including a cache flush.
This option might work-around issues related to missing cache
flushes in other drivers. This only has an effect for clients using
UK 11.18 or older. For later UK versions it is not possible.
config MALI_CORESIGHT
depends on MALI_MIDGARD && MALI_CSF_SUPPORT && !MALI_NO_MALI
bool "Enable Kbase CoreSight tracing support"
default n
menuconfig MALI_EXPERT
depends on MALI_MIDGARD
bool "Enable Expert Settings"
default n
help
Enabling this option and modifying the default settings may produce
a driver with performance or other limitations.
if MALI_EXPERT
config LARGE_PAGE_SUPPORT
bool "Support for 2MB page allocations"
depends on MALI_MIDGARD && MALI_EXPERT
default y
help
Rather than allocating all GPU memory page-by-page, allow the system
to decide whether to attempt to allocate 2MB pages from the kernel.
This reduces TLB pressure.
Note that this option only enables the support for the module parameter
and does not necessarily mean that 2MB pages will be used automatically.
This depends on GPU support.
If in doubt, say Y.
config PAGE_MIGRATION_SUPPORT
bool "Enable support for page migration"
depends on MALI_MIDGARD && MALI_EXPERT
default n if ANDROID
default y
help
Compile in support for page migration.
If set to disabled ('n') then page migration cannot
be enabled at all, and related symbols are not compiled in.
If not set, page migration is compiled in by default, and
if not explicitly enabled or disabled with the insmod parameter,
page migration becomes automatically enabled with large pages.
If in doubt, say Y. To strip out page migration symbols and support,
say N.
config MALI_CORESTACK
bool "Enable support of GPU core stack power control"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
Enabling this feature on supported GPUs will let the driver power
the GPU core stack on and off independently, without involving the Power
Domain Controller. This should only be enabled on platforms where
integration of the PDC with the Mali GPU is known to be problematic.
This feature is currently only supported on t-Six and t-HEx GPUs.
If unsure, say N.
comment "Platform options"
depends on MALI_MIDGARD && MALI_EXPERT
config MALI_ERROR_INJECT
bool "Enable No Mali error injection"
depends on MALI_MIDGARD && MALI_EXPERT && MALI_NO_MALI
default n
help
Enables insertion of errors to test module failure and recovery mechanisms.
comment "Debug options"
depends on MALI_MIDGARD && MALI_EXPERT
config MALI_DEBUG
bool "Enable debug build"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
Select this option for increased checking and reporting of errors.
config MALI_FENCE_DEBUG
bool "Enable debug sync fence usage"
depends on MALI_MIDGARD && MALI_EXPERT && SYNC_FILE
default y if MALI_DEBUG
help
Select this option to enable additional checking and reporting on the
use of sync fences in the Mali driver.
This will add a 3s timeout to all sync fence waits in the Mali
driver, so that when work for Mali has been waiting on a sync fence
for a long time a debug message will be printed, detailing what fence
is causing the block, and which dependent Mali atoms are blocked as a
result of this.
The timeout can be changed at runtime through the js_soft_timeout
device attribute, where the timeout is specified in milliseconds.
config MALI_SYSTEM_TRACE
bool "Enable system event tracing support"
depends on MALI_MIDGARD && MALI_EXPERT
default y if MALI_DEBUG
default n
help
Choose this option to enable system trace events for each
kbase event. This is typically used for debugging but has
minimal overhead when not in use. Enable only if you know what
you are doing.
comment "Instrumentation options"
depends on MALI_MIDGARD && MALI_EXPERT
choice
prompt "Select Performance counters set"
default MALI_PRFCNT_SET_PRIMARY
depends on MALI_MIDGARD && MALI_EXPERT
config MALI_PRFCNT_SET_PRIMARY
bool "Primary"
depends on MALI_MIDGARD && MALI_EXPERT
help
Select this option to use primary set of performance counters.
config MALI_PRFCNT_SET_SECONDARY
bool "Secondary"
depends on MALI_MIDGARD && MALI_EXPERT
help
Select this option to use secondary set of performance counters. Kernel
features that depend on an access to the primary set of counters may
become unavailable. Enabling this option will prevent power management
from working optimally and may cause instrumentation tools to return
bogus results.
If unsure, use MALI_PRFCNT_SET_PRIMARY.
config MALI_PRFCNT_SET_TERTIARY
bool "Tertiary"
depends on MALI_MIDGARD && MALI_EXPERT
help
Select this option to use tertiary set of performance counters. Kernel
features that depend on an access to the primary set of counters may
become unavailable. Enabling this option will prevent power management
from working optimally and may cause instrumentation tools to return
bogus results.
If unsure, use MALI_PRFCNT_SET_PRIMARY.
endchoice
config MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
bool "Enable runtime selection of performance counters set via debugfs"
depends on MALI_MIDGARD && MALI_EXPERT && DEBUG_FS
default n
help
Select this option to make the secondary set of performance counters
available at runtime via debugfs. Kernel features that depend on an
access to the primary set of counters may become unavailable.
If no runtime debugfs option is set, the build time counter set
choice will be used.
This feature is unsupported and unstable, and may break at any time.
Enabling this option will prevent power management from working
optimally and may cause instrumentation tools to return bogus results.
No validation is done on the debugfs input. Invalid input could cause
performance counter errors. Valid inputs are the values accepted by
the SET_SELECT bits of the PRFCNT_CONFIG register as defined in the
architecture specification.
If unsure, say N.
config MALI_JOB_DUMP
bool "Enable system level support needed for job dumping"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
Choose this option to enable system level support needed for
job dumping. This is typically used for instrumentation but has
minimal overhead when not in use. Enable only if you know what
you are doing.
comment "Workarounds"
depends on MALI_MIDGARD && MALI_EXPERT
config MALI_PWRSOFT_765
bool "Enable workaround for PWRSOFT-765"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
PWRSOFT-765 fixes devfreq cooling device issues. The fix was merged
in kernel v4.10; if it has been backported to an older kernel, this
option must be selected manually.
If using kernel >= v4.10 then say N, otherwise if devfreq cooling
changes have been backported say Y to avoid compilation errors.
config MALI_HW_ERRATA_1485982_NOT_AFFECTED
bool "Disable workaround for BASE_HW_ISSUE_GPU2017_1336"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
This option disables the default workaround for GPU2017-1336. The
workaround keeps the L2 cache powered up except for powerdown and reset.
The workaround introduces a limitation that will prevent the running of
protected mode content on fully coherent platforms, as the switch to IO
coherency mode requires the L2 to be turned off.
config MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE
bool "Use alternative workaround for BASE_HW_ISSUE_GPU2017_1336"
depends on MALI_MIDGARD && MALI_EXPERT && !MALI_HW_ERRATA_1485982_NOT_AFFECTED
default n
help
This option uses an alternative workaround for GPU2017-1336: lowering
the GPU clock to a platform-specific, known-good frequency before
powering down the L2 cache. The clock can be specified in the device
tree using the opp-mali-errata-1485982 property. Otherwise the
slowest clock will be selected.
endif
config MALI_ARBITRATION
tristate "Enable Virtualization reference code"
depends on MALI_MIDGARD
default n
help
Enables the build of several reference modules used in the reference
virtualization setup for Mali.
If unsure, say N.
config MALI_TRACE_POWER_GPU_WORK_PERIOD
bool "Enable per-application GPU metrics tracepoints"
depends on MALI_MIDGARD
default y
help
This option enables per-application GPU metrics tracepoints.
If unsure, say N.
source "$(MALI_KCONFIG_EXT_PREFIX)drivers/gpu/arm/midgard/tests/Kconfig"
endif


@@ -0,0 +1,300 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2010-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
KERNEL_SRC ?= /lib/modules/$(shell uname -r)/build
KDIR ?= $(KERNEL_SRC)
M ?= $(shell pwd)
ifeq ($(KDIR),)
$(error Must specify KDIR to point to the kernel to target)
endif
#
# Default configuration values
#
# Dependency resolution is done through statements as Kconfig
# is not supported for out-of-tree builds.
#
CONFIGS :=
ifeq ($(MALI_KCONFIG_EXT_PREFIX),)
CONFIG_MALI_MIDGARD ?= m
ifeq ($(CONFIG_MALI_MIDGARD),m)
CONFIG_MALI_PLATFORM_NAME ?= "devicetree"
CONFIG_MALI_TRACE_POWER_GPU_WORK_PERIOD ?= y
CONFIG_MALI_GATOR_SUPPORT ?= y
CONFIG_MALI_ARBITRATION ?= n
CONFIG_MALI_PARTITION_MANAGER ?= n
CONFIG_MALI_64BIT_HW_ACCESS ?= n
ifneq ($(CONFIG_MALI_NO_MALI),y)
# Prevent misuse when CONFIG_MALI_NO_MALI=y
CONFIG_MALI_REAL_HW ?= y
CONFIG_MALI_CORESIGHT = n
endif
ifeq ($(CONFIG_MALI_MIDGARD_DVFS),y)
# Prevent misuse when CONFIG_MALI_MIDGARD_DVFS=y
CONFIG_MALI_DEVFREQ ?= n
else
CONFIG_MALI_DEVFREQ ?= y
endif
ifeq ($(CONFIG_MALI_DMA_BUF_MAP_ON_DEMAND), y)
# Prevent misuse when CONFIG_MALI_DMA_BUF_MAP_ON_DEMAND=y
CONFIG_MALI_DMA_BUF_LEGACY_COMPAT = n
endif
ifeq ($(CONFIG_MALI_CSF_SUPPORT), y)
CONFIG_MALI_CORESIGHT ?= n
endif
#
# Expert/Debug/Test released configurations
#
ifeq ($(CONFIG_MALI_EXPERT), y)
ifeq ($(CONFIG_MALI_NO_MALI), y)
CONFIG_MALI_REAL_HW = n
CONFIG_MALI_NO_MALI_DEFAULT_GPU ?= "tMIx"
else
# Prevent misuse when CONFIG_MALI_NO_MALI=n
CONFIG_MALI_REAL_HW = y
CONFIG_MALI_ERROR_INJECT = n
endif
ifeq ($(CONFIG_MALI_HW_ERRATA_1485982_NOT_AFFECTED), y)
# Prevent misuse when CONFIG_MALI_HW_ERRATA_1485982_NOT_AFFECTED=y
CONFIG_MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE = n
endif
ifeq ($(CONFIG_MALI_DEBUG), y)
CONFIG_MALI_MIDGARD_ENABLE_TRACE ?= y
CONFIG_MALI_SYSTEM_TRACE ?= y
ifeq ($(CONFIG_SYNC_FILE), y)
CONFIG_MALI_FENCE_DEBUG ?= y
else
CONFIG_MALI_FENCE_DEBUG = n
endif
else
# Prevent misuse when CONFIG_MALI_DEBUG=n
CONFIG_MALI_MIDGARD_ENABLE_TRACE = n
CONFIG_MALI_SYSTEM_TRACE = n
CONFIG_MALI_FENCE_DEBUG = n
endif
else
# Prevent misuse when CONFIG_MALI_EXPERT=n
CONFIG_MALI_CORESTACK = n
CONFIG_LARGE_PAGE_SUPPORT = y
CONFIG_MALI_PWRSOFT_765 = n
CONFIG_MALI_JOB_DUMP = n
CONFIG_MALI_NO_MALI = n
CONFIG_MALI_REAL_HW = y
CONFIG_MALI_ERROR_INJECT = n
CONFIG_MALI_HW_ERRATA_1485982_NOT_AFFECTED = n
CONFIG_MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE = n
CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS = n
CONFIG_MALI_DEBUG = n
CONFIG_MALI_MIDGARD_ENABLE_TRACE = n
CONFIG_MALI_SYSTEM_TRACE = n
CONFIG_MALI_FENCE_DEBUG = n
endif
ifeq ($(CONFIG_MALI_DEBUG), y)
CONFIG_MALI_KUTF ?= y
ifeq ($(CONFIG_MALI_KUTF), y)
CONFIG_MALI_KUTF_IRQ_TEST ?= y
CONFIG_MALI_KUTF_CLK_RATE_TRACE ?= y
CONFIG_MALI_KUTF_MGM_INTEGRATION_TEST ?= y
ifeq ($(CONFIG_MALI_DEVFREQ), y)
ifeq ($(CONFIG_MALI_NO_MALI), y)
CONFIG_MALI_KUTF_IPA_UNIT_TEST ?= y
endif
endif
else
# Prevent misuse when CONFIG_MALI_KUTF=n
CONFIG_MALI_KUTF_IRQ_TEST = n
CONFIG_MALI_KUTF_CLK_RATE_TRACE = n
CONFIG_MALI_KUTF_MGM_INTEGRATION_TEST = n
endif
else
# Prevent misuse when CONFIG_MALI_DEBUG=n
CONFIG_MALI_KUTF = n
CONFIG_MALI_KUTF_IRQ_TEST = n
CONFIG_MALI_KUTF_CLK_RATE_TRACE = n
CONFIG_MALI_KUTF_MGM_INTEGRATION_TEST = n
endif
else
# Prevent misuse when CONFIG_MALI_MIDGARD=n
CONFIG_MALI_ARBITRATION = n
CONFIG_MALI_KUTF = n
CONFIG_MALI_KUTF_IRQ_TEST = n
CONFIG_MALI_KUTF_CLK_RATE_TRACE = n
CONFIG_MALI_KUTF_MGM_INTEGRATION_TEST = n
endif
# All Mali CONFIG should be listed here
CONFIGS += \
CONFIG_MALI_MIDGARD \
CONFIG_MALI_CSF_SUPPORT \
CONFIG_MALI_GATOR_SUPPORT \
CONFIG_MALI_ARBITER_SUPPORT \
CONFIG_MALI_ARBITRATION \
CONFIG_MALI_PARTITION_MANAGER \
CONFIG_MALI_REAL_HW \
CONFIG_MALI_DEVFREQ \
CONFIG_MALI_MIDGARD_DVFS \
CONFIG_MALI_DMA_BUF_MAP_ON_DEMAND \
CONFIG_MALI_DMA_BUF_LEGACY_COMPAT \
CONFIG_MALI_EXPERT \
CONFIG_MALI_CORESTACK \
CONFIG_LARGE_PAGE_SUPPORT \
CONFIG_MALI_PWRSOFT_765 \
CONFIG_MALI_JOB_DUMP \
CONFIG_MALI_NO_MALI \
CONFIG_MALI_ERROR_INJECT \
CONFIG_MALI_HW_ERRATA_1485982_NOT_AFFECTED \
CONFIG_MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE \
CONFIG_MALI_PRFCNT_SET_PRIMARY \
CONFIG_MALI_PRFCNT_SET_SECONDARY \
CONFIG_MALI_PRFCNT_SET_TERTIARY \
CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS \
CONFIG_MALI_DEBUG \
CONFIG_MALI_MIDGARD_ENABLE_TRACE \
CONFIG_MALI_SYSTEM_TRACE \
CONFIG_MALI_FENCE_DEBUG \
CONFIG_MALI_KUTF \
CONFIG_MALI_KUTF_IRQ_TEST \
CONFIG_MALI_KUTF_CLK_RATE_TRACE \
CONFIG_MALI_KUTF_MGM_INTEGRATION_TEST \
CONFIG_MALI_XEN \
CONFIG_MALI_CORESIGHT \
CONFIG_MALI_TRACE_POWER_GPU_WORK_PERIOD
endif
THIS_DIR := $(dir $(lastword $(MAKEFILE_LIST)))
-include $(THIS_DIR)/../arbitration/Makefile
# MAKE_ARGS to pass the custom CONFIGs on out-of-tree build
#
# Generate the list of CONFIGs and values.
# $(value config) is the name of the CONFIG option.
# $(value $(value config)) is its value (y, m).
# When the CONFIG is not set to y or m, it defaults to n.
MAKE_ARGS := $(foreach config,$(CONFIGS), \
$(if $(filter y m,$(value $(value config))), \
$(value config)=$(value $(value config)), \
$(value config)=n))
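# Illustrative sketch only (hypothetical option names, not part of the
# original Makefile): assuming CONFIGS := CONFIG_A CONFIG_B CONFIG_C with
# CONFIG_A = y, CONFIG_B = m and CONFIG_C unset, the foreach above should
# yield, modulo whitespace:
#   CONFIG_A=y CONFIG_B=m CONFIG_C=n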
ifeq ($(MALI_KCONFIG_EXT_PREFIX),)
MAKE_ARGS += CONFIG_MALI_PLATFORM_NAME=$(CONFIG_MALI_PLATFORM_NAME)
endif
#
# EXTRA_CFLAGS to define the custom CONFIGs on out-of-tree build
#
# Generate the list of CONFIGs defines with values from CONFIGS.
# $(value config) is the name of the CONFIG option.
# When set to y or m, the CONFIG gets defined to 1.
EXTRA_CFLAGS := $(foreach config,$(CONFIGS), \
$(if $(filter y m,$(value $(value config))), \
-D$(value config)=1))
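# Illustrative sketch only (hypothetical option names, not part of the
# original Makefile): assuming CONFIGS := CONFIG_A CONFIG_B with
# CONFIG_A = y and CONFIG_B unset, the foreach above should yield:
#   -DCONFIG_A=1
# CONFIG_B contributes no flag, so C code sees it as undefined rather than
# defined to 0.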
ifeq ($(MALI_KCONFIG_EXT_PREFIX),)
EXTRA_CFLAGS += -DCONFIG_MALI_PLATFORM_NAME='\"$(CONFIG_MALI_PLATFORM_NAME)\"'
EXTRA_CFLAGS += -DCONFIG_MALI_NO_MALI_DEFAULT_GPU='\"$(CONFIG_MALI_NO_MALI_DEFAULT_GPU)\"'
endif
#
# KBUILD_EXTRA_SYMBOLS to prevent warnings about unknown functions
#
BASE_SYMBOLS =
EXTRA_SYMBOLS += \
$(BASE_SYMBOLS)
CFLAGS_MODULE += -Wall -Werror
# The following were added to align with W=1 in scripts/Makefile.extrawarn
# from the Linux source tree (v5.18.14)
CFLAGS_MODULE += -Wextra -Wunused -Wno-unused-parameter
CFLAGS_MODULE += -Wmissing-declarations
CFLAGS_MODULE += -Wmissing-format-attribute
CFLAGS_MODULE += -Wmissing-prototypes
CFLAGS_MODULE += -Wold-style-definition
# The -Wmissing-include-dirs cannot be enabled as the path to some of the
# included directories change depending on whether it is an in-tree or
# out-of-tree build.
CFLAGS_MODULE += $(call cc-option, -Wunused-but-set-variable)
CFLAGS_MODULE += $(call cc-option, -Wunused-const-variable)
CFLAGS_MODULE += $(call cc-option, -Wpacked-not-aligned)
CFLAGS_MODULE += $(call cc-option, -Wstringop-truncation)
# The following turn off the warnings enabled by -Wextra
CFLAGS_MODULE += -Wno-sign-compare
CFLAGS_MODULE += -Wno-shift-negative-value
# This flag is needed to avoid build errors on older kernels
CFLAGS_MODULE += $(call cc-option, -Wno-cast-function-type)
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN1
# The following were added to align with W=2 in scripts/Makefile.extrawarn
# from the Linux source tree (v5.18.14)
CFLAGS_MODULE += -Wdisabled-optimization
# The -Wshadow flag cannot be enabled unless upstream kernels are
# patched to fix redefinitions of certain built-in functions and
# global variables.
CFLAGS_MODULE += $(call cc-option, -Wlogical-op)
CFLAGS_MODULE += -Wmissing-field-initializers
# -Wtype-limits must be disabled due to build failures on kernel 5.x
CFLAGS_MODULE += -Wno-type-limits
CFLAGS_MODULE += $(call cc-option, -Wmaybe-uninitialized)
CFLAGS_MODULE += $(call cc-option, -Wunused-macros)
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN2
# This warning is disabled to avoid build failures in some kernel versions
CFLAGS_MODULE += -Wno-ignored-qualifiers
ifeq ($(CONFIG_GCOV_KERNEL),y)
CFLAGS_MODULE += $(call cc-option, -ftest-coverage)
CFLAGS_MODULE += $(call cc-option, -fprofile-arcs)
EXTRA_CFLAGS += -DGCOV_PROFILE=1
endif
ifeq ($(CONFIG_MALI_KCOV),y)
CFLAGS_MODULE += $(call cc-option, -fsanitize-coverage=trace-cmp)
EXTRA_CFLAGS += -DKCOV=1
EXTRA_CFLAGS += -DKCOV_ENABLE_COMPARISONS=1
endif
all:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) EXTRA_CFLAGS="$(EXTRA_CFLAGS)" KBUILD_EXTRA_SYMBOLS="$(EXTRA_SYMBOLS)" modules

modules_install:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) modules_install

clean:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) clean

@@ -0,0 +1,396 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
menuconfig MALI_MIDGARD
bool "Mali Midgard series support"
default y
help
Enable this option to build support for an ARM Mali Midgard GPU.
To compile this driver as a module, choose M here:
this will generate a single module, called mali_kbase.
config MALI_PLATFORM_NAME
depends on MALI_MIDGARD
string "Platform name"
default "hisilicon" if PLATFORM_HIKEY960
default "hisilicon" if PLATFORM_HIKEY970
default "devicetree"
help
Enter the name of the desired platform configuration directory to
include in the build. 'platform/$(MALI_PLATFORM_NAME)/Kbuild' must
exist.
When PLATFORM_CUSTOM is set, this needs to be set manually to
pick up the desired platform files.
choice
prompt "Mali HW backend"
depends on MALI_MIDGARD
default MALI_NO_MALI if NO_MALI
default MALI_REAL_HW
config MALI_REAL_HW
bool "Enable build of Mali kernel driver for real HW"
depends on MALI_MIDGARD
help
This is the default HW backend.
config MALI_NO_MALI
bool "Enable build of Mali kernel driver for No Mali"
depends on MALI_MIDGARD && MALI_EXPERT
help
This can be used to test the driver in a simulated environment
whereby the hardware is not physically present. If the hardware is physically
present it will not be used. This can be used to test the majority of the
driver without needing actual hardware or for software benchmarking.
All calls to the simulated hardware will complete immediately as if the hardware
completed the task.
endchoice
config MALI_CSF_SUPPORT
bool "Enable Mali CSF based GPU support"
depends on MALI_MIDGARD
default y if GPU_HAS_CSF
help
Enables support for CSF based GPUs.
config MALI_DEVFREQ
bool "Enable devfreq support for Mali"
depends on MALI_MIDGARD
default y
help
Support devfreq for Mali.
Using the devfreq framework and, by default, the simple on-demand
governor, the frequency of Mali will be dynamically selected from the
available OPPs.
config MALI_MIDGARD_DVFS
bool "Enable legacy DVFS"
depends on MALI_MIDGARD && !MALI_DEVFREQ
default n
help
Choose this option to enable legacy DVFS in the Mali Midgard DDK.
config MALI_GATOR_SUPPORT
bool "Enable Streamline tracing support"
depends on MALI_MIDGARD && !BACKEND_USER
default y
help
Enables kbase tracing used by the Arm Streamline Performance Analyzer.
The tracepoints are used to derive GPU activity charts in Streamline.
config MALI_MIDGARD_ENABLE_TRACE
bool "Enable kbase tracing"
depends on MALI_MIDGARD
default y if MALI_DEBUG
default n
help
Enables tracing in kbase. The trace log is available through
the "mali_trace" debugfs file when CONFIG_DEBUG_FS is enabled.
config MALI_ARBITER_SUPPORT
bool "Enable arbiter support for Mali"
depends on MALI_MIDGARD
default n
help
Enable support for the arbiter interface in the driver.
This allows an external arbiter to manage driver access
to GPU hardware in a virtualized environment.
If unsure, say N.
config DMA_BUF_SYNC_IOCTL_SUPPORTED
bool "Enable Kernel DMA buffers support DMA_BUF_IOCTL_SYNC"
depends on MALI_MIDGARD && BACKEND_KERNEL
default y
config MALI_DMA_BUF_MAP_ON_DEMAND
bool "Enable map imported dma-bufs on demand"
depends on MALI_MIDGARD
default n
default y if !DMA_BUF_SYNC_IOCTL_SUPPORTED
help
This option will cause kbase to set up the GPU mapping of imported
dma-buf when needed to run atoms. This is the legacy behavior.
This is intended for testing and the option will get removed in the
future.
config MALI_DMA_BUF_LEGACY_COMPAT
bool "Enable legacy compatibility cache flush on dma-buf map"
depends on MALI_MIDGARD && !MALI_DMA_BUF_MAP_ON_DEMAND
default n
help
This option enables compatibility with legacy dma-buf mapping
behavior when the dma-buf is mapped on import, by adding cache
maintenance where MALI_DMA_BUF_MAP_ON_DEMAND would do the mapping,
including a cache flush.
This option might work around issues related to missing cache
flushes in other drivers. This only has an effect for clients using
UK 11.18 or older. For later UK versions it is not possible.
config MALI_CORESIGHT
depends on MALI_MIDGARD && MALI_CSF_SUPPORT && !NO_MALI
select CSFFW_DEBUG_FW_AS_RW
bool "Enable Kbase CoreSight tracing support"
default n
menuconfig MALI_EXPERT
depends on MALI_MIDGARD
bool "Enable Expert Settings"
default y
help
Enabling this option and modifying the default settings may produce
a driver with performance or other limitations.
config MALI_CORESTACK
bool "Enable support of GPU core stack power control"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
Enabling this feature on supported GPUs will let the driver power
the GPU core stack on and off independently, without involving the Power
Domain Controller. This should only be enabled on platforms where
integration of the PDC with the Mali GPU is known to be problematic.
This feature is currently only supported on t-Six and t-HEx GPUs.
If unsure, say N.
config PAGE_MIGRATION_SUPPORT
bool "Compile with page migration support"
depends on BACKEND_KERNEL
default y
default n if ANDROID
help
Compile in support for page migration.
If set to disabled ('n') then page migration cannot
be enabled at all. If set to enabled, then page migration
support is explicitly compiled in. This has no effect when
PAGE_MIGRATION_OVERRIDE is disabled.
config LARGE_PAGE_SUPPORT
bool "Support for 2MB page allocations"
depends on BACKEND_KERNEL
default y
help
Rather than allocating all GPU memory page-by-page, allow the system
to decide whether to attempt to allocate 2MB pages from the kernel.
This reduces TLB pressure and helps to prevent memory fragmentation.
Note that this option only enables the support for the module parameter
and does not necessarily mean that 2MB pages will be used automatically.
This depends on GPU support.
If in doubt, say Y.
choice
prompt "Error injection level"
depends on MALI_MIDGARD && MALI_EXPERT
default MALI_ERROR_INJECT_NONE
help
Enables insertion of errors to test module failure and recovery mechanisms.
config MALI_ERROR_INJECT_NONE
bool "disabled"
depends on MALI_MIDGARD && MALI_EXPERT
help
Error injection is disabled.
config MALI_ERROR_INJECT_TRACK_LIST
bool "error track list"
depends on MALI_MIDGARD && MALI_EXPERT && NO_MALI
help
Errors to inject are pre-configured by the user.
config MALI_ERROR_INJECT_RANDOM
bool "random error injection"
depends on MALI_MIDGARD && MALI_EXPERT && NO_MALI
help
Injected errors are random, rather than user-driven.
endchoice
config MALI_ERROR_INJECT_ON
string
depends on MALI_MIDGARD && MALI_EXPERT
default "0" if MALI_ERROR_INJECT_NONE
default "1" if MALI_ERROR_INJECT_TRACK_LIST
default "2" if MALI_ERROR_INJECT_RANDOM
config MALI_ERROR_INJECT
bool
depends on MALI_MIDGARD && MALI_EXPERT
default y if !MALI_ERROR_INJECT_NONE
config MALI_DEBUG
bool "Enable debug build"
depends on MALI_MIDGARD && MALI_EXPERT
default y if DEBUG
default n
help
Select this option for increased checking and reporting of errors.
config MALI_GCOV_KERNEL
bool "Enable branch coverage via gcov"
depends on MALI_MIDGARD && MALI_DEBUG
default n
help
Choose this option to enable building kbase with branch
coverage information. When built against a supporting kernel,
the coverage information will be available via debugfs.
config MALI_KCOV
bool "Enable kcov coverage to support fuzzers"
depends on MALI_MIDGARD && MALI_DEBUG
default n
help
Choose this option to enable building with fuzzing-oriented
coverage, to improve the random test cases that are generated.
config MALI_FENCE_DEBUG
bool "Enable debug sync fence usage"
depends on MALI_MIDGARD && MALI_EXPERT
default y if MALI_DEBUG
help
Select this option to enable additional checking and reporting on the
use of sync fences in the Mali driver.
This will add a 3s timeout to all sync fence waits in the Mali
driver, so that when work for Mali has been waiting on a sync fence
for a long time a debug message will be printed, detailing what fence
is causing the block, and which dependent Mali atoms are blocked as a
result of this.
The timeout can be changed at runtime through the js_soft_timeout
device attribute, where the timeout is specified in milliseconds.
config MALI_SYSTEM_TRACE
bool "Enable system event tracing support"
depends on MALI_MIDGARD && MALI_EXPERT
default y if MALI_DEBUG
default n
help
Choose this option to enable system trace events for each
kbase event. This is typically used for debugging but has
minimal overhead when not in use. Enable only if you know what
you are doing.
# Instrumentation options.
# config MALI_PRFCNT_SET_PRIMARY exists in the Kernel Kconfig but is configured using CINSTR_PRIMARY_HWC in Mconfig.
# config MALI_PRFCNT_SET_SECONDARY exists in the Kernel Kconfig but is configured using CINSTR_SECONDARY_HWC in Mconfig.
# config MALI_PRFCNT_SET_TERTIARY exists in the Kernel Kconfig but is configured using CINSTR_TERTIARY_HWC in Mconfig.
# config MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS exists in the Kernel Kconfig but is configured using CINSTR_HWC_SET_SELECT_VIA_DEBUG_FS in Mconfig.
config MALI_JOB_DUMP
bool "Enable system level support needed for job dumping"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
Choose this option to enable system level support needed for
job dumping. This is typically used for instrumentation but has
minimal overhead when not in use. Enable only if you know what
you are doing.
config MALI_PWRSOFT_765
bool "Enable workaround for PWRSOFT-765"
depends on MALI_MIDGARD && MALI_EXPERT
default n
help
PWRSOFT-765 fixes devfreq cooling device issues. The fix was merged
in kernel v4.10; however, if it has been backported into an older
kernel then this option must be manually selected.
If using kernel >= v4.10 then say N, otherwise if devfreq cooling
changes have been backported say Y to avoid compilation errors.
config MALI_HW_ERRATA_1485982_NOT_AFFECTED
bool "Disable workaround for BASE_HW_ISSUE_GPU2017_1336"
depends on MALI_MIDGARD && MALI_EXPERT
default n
default y if PLATFORM_JUNO
help
This option disables the default workaround for GPU2017-1336. The
workaround keeps the L2 cache powered up except for powerdown and reset.
The workaround introduces a limitation that will prevent the running of
protected mode content on fully coherent platforms, as the switch to IO
coherency mode requires the L2 to be turned off.
config MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE
bool "Use alternative workaround for BASE_HW_ISSUE_GPU2017_1336"
depends on MALI_MIDGARD && MALI_EXPERT && !MALI_HW_ERRATA_1485982_NOT_AFFECTED
default n
help
This option uses an alternative workaround for GPU2017-1336: lowering
the GPU clock to a platform-specific, known-good frequency before
powering down the L2 cache. The clock can be specified in the device
tree using the property opp-mali-errata-1485982. Otherwise the
slowest clock will be selected.
config MALI_TRACE_POWER_GPU_WORK_PERIOD
bool "Enable per-application GPU metrics tracepoints"
depends on MALI_MIDGARD
default y
help
This option enables per-application GPU metrics tracepoints.
If unsure, say N.
choice
prompt "CSF Firmware trace mode"
depends on MALI_MIDGARD
default MALI_FW_TRACE_MODE_MANUAL
help
CSF Firmware log operating mode.
config MALI_FW_TRACE_MODE_MANUAL
bool "manual mode"
depends on MALI_MIDGARD
help
The firmware log can be read manually by userspace (and it will
also be dumped automatically into dmesg on GPU reset).
config MALI_FW_TRACE_MODE_AUTO_PRINT
bool "automatic printing mode"
depends on MALI_MIDGARD
help
The firmware log will be periodically emptied into dmesg; manual
reading through debugfs is disabled.
config MALI_FW_TRACE_MODE_AUTO_DISCARD
bool "automatic discarding mode"
depends on MALI_MIDGARD
help
The firmware log will be periodically discarded; the remaining log can
be read manually by userspace (and it will also be dumped
automatically into dmesg on GPU reset).
endchoice
source "kernel/drivers/gpu/arm/arbitration/Mconfig"
source "kernel/drivers/gpu/arm/midgard/tests/Mconfig"

@@ -0,0 +1,23 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
mali_kbase-y += \
arbiter/mali_kbase_arbif.o \
arbiter/mali_kbase_arbiter_pm.o

@@ -0,0 +1,353 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali arbiter interface APIs to share GPU between Virtual Machines
*/
#include <mali_kbase.h>
#include "mali_kbase_arbif.h"
#include <tl/mali_kbase_tracepoints.h>
#include <linux/of.h>
#include <linux/of_platform.h>
#include "linux/mali_arbiter_interface.h"
/* Arbiter interface version against which this module was implemented */
#define MALI_REQUIRED_KBASE_ARBITER_INTERFACE_VERSION 5
#if MALI_REQUIRED_KBASE_ARBITER_INTERFACE_VERSION != MALI_ARBITER_INTERFACE_VERSION
#error "Unsupported Mali Arbiter interface version."
#endif
/**
* on_max_config() - Stores the max config info reported by the arbiter
* @dev: arbiter interface device handle
* @max_l2_slices: maximum number of L2 slices reported by the arbiter
* @max_core_mask: maximum shader core mask reported by the arbiter
*
* Callback function to pass the max config info from the arbiter to kbase.
* The update is ignored if either value is zero.
*/
static void on_max_config(struct device *dev, uint32_t max_l2_slices, uint32_t max_core_mask)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
if (!max_l2_slices || !max_core_mask) {
dev_dbg(dev, "%s(): max_config ignored as one of the fields is zero", __func__);
return;
}
/* set the max config info in the kbase device */
kbase_arbiter_set_max_config(kbdev, max_l2_slices, max_core_mask);
}
/**
* on_update_freq() - Updates GPU clock frequency
* @dev: arbiter interface device handle
* @freq: GPU clock frequency value reported from arbiter
*
* Callback function to update the GPU clock frequency with
* a new value from the arbiter.
*/
static void on_update_freq(struct device *dev, uint32_t freq)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
kbase_arbiter_pm_update_gpu_freq(&kbdev->arb.arb_freq, freq);
}
/**
* on_gpu_stop() - sends KBASE_VM_GPU_STOP_EVT event on VM stop
* @dev: arbiter interface device handle
*
* Callback function to signal a GPU STOP event from the arbiter interface.
*/
static void on_gpu_stop(struct device *dev)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
KBASE_TLSTREAM_TL_ARBITER_STOP_REQUESTED(kbdev, kbdev);
kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_STOP_EVT);
}
/**
* on_gpu_granted() - sends KBASE_VM_GPU_GRANTED_EVT event on GPU granted
* @dev: arbiter interface device handle
*
* Callback function to signal a GPU GRANT event from the arbiter interface.
*/
static void on_gpu_granted(struct device *dev)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
KBASE_TLSTREAM_TL_ARBITER_GRANTED(kbdev, kbdev);
kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_GRANTED_EVT);
}
/**
* on_gpu_lost() - sends KBASE_VM_GPU_LOST_EVT event on GPU lost
* @dev: arbiter interface device handle
*
* Callback function to signal a GPU LOST event from the arbiter interface.
*/
static void on_gpu_lost(struct device *dev)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_LOST_EVT);
}
/**
* kbase_arbif_init() - Kbase Arbiter interface initialisation.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Initialise Kbase Arbiter interface and assign callback functions.
*
* Return:
* * 0 - the interface was initialized or was not specified
* * in the device tree.
* * -EFAULT - the interface was specified but failed to initialize.
* * -EPROBE_DEFER - module dependencies are not yet available.
*/
int kbase_arbif_init(struct kbase_device *kbdev)
{
#if IS_ENABLED(CONFIG_OF)
struct arbiter_if_arb_vm_ops ops;
struct arbiter_if_dev *arb_if;
struct device_node *arbiter_if_node;
struct platform_device *pdev;
int err;
dev_dbg(kbdev->dev, "%s\n", __func__);
arbiter_if_node = of_parse_phandle(kbdev->dev->of_node, "arbiter-if", 0);
if (!arbiter_if_node)
arbiter_if_node = of_parse_phandle(kbdev->dev->of_node, "arbiter_if", 0);
if (!arbiter_if_node) {
dev_dbg(kbdev->dev, "No arbiter_if in Device Tree\n");
/* no arbiter interface defined in device tree */
kbdev->arb.arb_dev = NULL;
kbdev->arb.arb_if = NULL;
return 0;
}
pdev = of_find_device_by_node(arbiter_if_node);
if (!pdev) {
dev_err(kbdev->dev, "Failed to find arbiter_if device\n");
return -EPROBE_DEFER;
}
if (!pdev->dev.driver || !try_module_get(pdev->dev.driver->owner)) {
dev_err(kbdev->dev, "arbiter_if driver not available\n");
put_device(&pdev->dev);
return -EPROBE_DEFER;
}
kbdev->arb.arb_dev = &pdev->dev;
arb_if = platform_get_drvdata(pdev);
if (!arb_if) {
dev_err(kbdev->dev, "arbiter_if driver not ready\n");
module_put(pdev->dev.driver->owner);
put_device(&pdev->dev);
return -EPROBE_DEFER;
}
kbdev->arb.arb_if = arb_if;
ops.arb_vm_gpu_stop = on_gpu_stop;
ops.arb_vm_gpu_granted = on_gpu_granted;
ops.arb_vm_gpu_lost = on_gpu_lost;
ops.arb_vm_max_config = on_max_config;
ops.arb_vm_update_freq = on_update_freq;
kbdev->arb.arb_freq.arb_freq = 0;
kbdev->arb.arb_freq.freq_updated = false;
mutex_init(&kbdev->arb.arb_freq.arb_freq_lock);
/* register kbase arbiter_if callbacks */
if (arb_if->vm_ops.vm_arb_register_dev) {
err = arb_if->vm_ops.vm_arb_register_dev(arb_if, kbdev->dev, &ops);
if (err) {
dev_err(&pdev->dev, "Failed to register with arbiter. (err = %d)\n", err);
module_put(pdev->dev.driver->owner);
put_device(&pdev->dev);
if (err != -EPROBE_DEFER)
err = -EFAULT;
return err;
}
}
#else /* CONFIG_OF */
dev_dbg(kbdev->dev, "No arbiter without Device Tree support\n");
kbdev->arb.arb_dev = NULL;
kbdev->arb.arb_if = NULL;
#endif
return 0;
}
/**
* kbase_arbif_destroy() - De-init Kbase arbiter interface
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* De-initialise Kbase arbiter interface
*/
void kbase_arbif_destroy(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_unregister_dev) {
dev_dbg(kbdev->dev, "%s\n", __func__);
arb_if->vm_ops.vm_arb_unregister_dev(kbdev->arb.arb_if);
}
kbdev->arb.arb_if = NULL;
if (kbdev->arb.arb_dev) {
module_put(kbdev->arb.arb_dev->driver->owner);
put_device(kbdev->arb.arb_dev);
}
kbdev->arb.arb_dev = NULL;
}
/**
* kbase_arbif_get_max_config() - Request max config info
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a request through the arbiter interface asking the arbiter for the
* max config info.
*/
void kbase_arbif_get_max_config(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_get_max_config) {
dev_dbg(kbdev->dev, "%s\n", __func__);
arb_if->vm_ops.vm_arb_get_max_config(arb_if);
}
}
/**
* kbase_arbif_gpu_request() - Request GPU from the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a message through the arbiter interface to request the GPU for this VM.
*/
void kbase_arbif_gpu_request(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_request) {
dev_dbg(kbdev->dev, "%s\n", __func__);
KBASE_TLSTREAM_TL_ARBITER_REQUESTED(kbdev, kbdev);
arb_if->vm_ops.vm_arb_gpu_request(arb_if);
}
}
/**
* kbase_arbif_gpu_stopped() - send GPU stopped message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @gpu_required: GPU request flag
*
* Sends a message to the arbiter to notify that the GPU has stopped.
*/
void kbase_arbif_gpu_stopped(struct kbase_device *kbdev, u8 gpu_required)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_stopped) {
dev_dbg(kbdev->dev, "%s\n", __func__);
KBASE_TLSTREAM_TL_ARBITER_STOPPED(kbdev, kbdev);
if (gpu_required)
KBASE_TLSTREAM_TL_ARBITER_REQUESTED(kbdev, kbdev);
arb_if->vm_ops.vm_arb_gpu_stopped(arb_if, gpu_required);
}
}
/**
* kbase_arbif_gpu_active() - Sends a GPU_ACTIVE message to the Arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Informs the arbiter that the VM is active.
*/
void kbase_arbif_gpu_active(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_active) {
dev_dbg(kbdev->dev, "%s\n", __func__);
arb_if->vm_ops.vm_arb_gpu_active(arb_if);
}
}
/**
* kbase_arbif_gpu_idle() - Inform the arbiter that the VM has gone idle
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Informs the arbiter that the VM is idle.
*/
void kbase_arbif_gpu_idle(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_idle) {
dev_dbg(kbdev->dev, "vm_arb_gpu_idle\n");
arb_if->vm_ops.vm_arb_gpu_idle(arb_if);
}
}
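
The request wrappers above share one pattern: each checks that the arbiter interface exists and that the specific `vm_ops` callback is populated before calling it, and silently does nothing otherwise. The following is a minimal user-space sketch of that guard, not driver code. The names `demo_arb_if`, `demo_vm_ops`, `demo_backend_idle` and `demo_gpu_idle`, and the message counter, are assumptions made for illustration.

```c
#include <stddef.h>

struct demo_arb_if;

/* Hypothetical stand-in for the vm_ops table; any entry may be NULL. */
struct demo_vm_ops {
	void (*vm_arb_gpu_idle)(struct demo_arb_if *arb_if);
};

/* Hypothetical stand-in for struct arbiter_if_dev. */
struct demo_arb_if {
	struct demo_vm_ops vm_ops;
	int idle_msgs; /* number of idle messages delivered to the backend */
};

/* Example backend callback: just counts the messages it receives. */
static void demo_backend_idle(struct demo_arb_if *arb_if)
{
	arb_if->idle_msgs++;
}

/*
 * Guarded dispatch: forward the message only when both the interface and
 * the callback are present. Returns 1 if the message was sent, 0 if it
 * was skipped (the driver wrappers return void; the return value here
 * exists only to make the behaviour observable).
 */
static int demo_gpu_idle(struct demo_arb_if *arb_if)
{
	if (arb_if && arb_if->vm_ops.vm_arb_gpu_idle) {
		arb_if->vm_ops.vm_arb_gpu_idle(arb_if);
		return 1;
	}
	return 0;
}
```

With this shape, a device with no arbiter (a NULL interface) and an arbiter that does not implement a given message are both handled without special cases at the call sites.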

@@ -0,0 +1,121 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali arbiter interface APIs to share GPU between Virtual Machines
*/
#ifndef _MALI_KBASE_ARBIF_H_
#define _MALI_KBASE_ARBIF_H_
/**
* enum kbase_arbif_evt - Internal Arbiter event.
*
* @KBASE_VM_GPU_INITIALIZED_EVT: KBase has finished initializing
* and can be stopped
* @KBASE_VM_GPU_STOP_EVT: Stop message received from Arbiter
* @KBASE_VM_GPU_GRANTED_EVT: Grant message received from Arbiter
* @KBASE_VM_GPU_LOST_EVT: Lost message received from Arbiter
* @KBASE_VM_GPU_IDLE_EVENT: KBase has transitioned into an inactive state.
* @KBASE_VM_REF_EVENT: KBase has transitioned into an active state.
* @KBASE_VM_OS_SUSPEND_EVENT: KBase is suspending
* @KBASE_VM_OS_RESUME_EVENT: KBase is resuming
*/
enum kbase_arbif_evt {
KBASE_VM_GPU_INITIALIZED_EVT = 1,
KBASE_VM_GPU_STOP_EVT,
KBASE_VM_GPU_GRANTED_EVT,
KBASE_VM_GPU_LOST_EVT,
KBASE_VM_GPU_IDLE_EVENT,
KBASE_VM_REF_EVENT,
KBASE_VM_OS_SUSPEND_EVENT,
KBASE_VM_OS_RESUME_EVENT,
};
/**
* kbase_arbif_init() - Initialize the arbiter interface functionality.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Initializes the arbiter interface and also determines
* whether Arbiter functionality is required.
*
* Return:
* * 0 - the interface was initialized or was not specified
* * in the device tree.
* * -EFAULT - the interface was specified but failed to initialize.
* * -EPROBE_DEFER - module dependencies are not yet available.
*/
int kbase_arbif_init(struct kbase_device *kbdev);
/**
* kbase_arbif_destroy() - Cleans up the arbiter interface functionality.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Cleans up the arbiter interface functionality and releases the reference
* held on the arbif module.
*/
void kbase_arbif_destroy(struct kbase_device *kbdev);
/**
* kbase_arbif_get_max_config() - Request max config info
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a request through the arbiter interface asking the arbiter for the
* max config info.
*/
void kbase_arbif_get_max_config(struct kbase_device *kbdev);
/**
* kbase_arbif_gpu_request() - Send GPU request message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a message to Arbiter to request GPU access.
*/
void kbase_arbif_gpu_request(struct kbase_device *kbdev);
/**
* kbase_arbif_gpu_stopped() - Send GPU stopped message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @gpu_required: true if GPU access is still required
* (Arbiter will automatically send another grant message)
*
* Sends a message to Arbiter to notify that the GPU has stopped.
* @note Once this call has been made, KBase must not attempt to access the GPU
* until the #KBASE_VM_GPU_GRANTED_EVT event has been received.
*/
void kbase_arbif_gpu_stopped(struct kbase_device *kbdev, u8 gpu_required);
/**
* kbase_arbif_gpu_active() - Send a GPU active message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a message to Arbiter to report that KBase has gone active.
*/
void kbase_arbif_gpu_active(struct kbase_device *kbdev);
/**
* kbase_arbif_gpu_idle() - Send a GPU idle message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a message to Arbiter to report that KBase has gone idle.
*/
void kbase_arbif_gpu_idle(struct kbase_device *kbdev);
#endif /* _MALI_KBASE_ARBIF_H_ */
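
When logging or tracing the events declared in this header, a name lookup can make messages easier to read. The sketch below is a hedged user-space illustration: the enum is copied from the declaration above, while the helper `demo_arbif_evt_name` and the "[UnknownEvent]" fallback string are assumptions, not part of this header.

```c
/* Copied from the enum kbase_arbif_evt declaration in this header. */
enum kbase_arbif_evt {
	KBASE_VM_GPU_INITIALIZED_EVT = 1,
	KBASE_VM_GPU_STOP_EVT,
	KBASE_VM_GPU_GRANTED_EVT,
	KBASE_VM_GPU_LOST_EVT,
	KBASE_VM_GPU_IDLE_EVENT,
	KBASE_VM_REF_EVENT,
	KBASE_VM_OS_SUSPEND_EVENT,
	KBASE_VM_OS_RESUME_EVENT,
};

/* Hypothetical helper: map an event to its name for log messages. */
static const char *demo_arbif_evt_name(enum kbase_arbif_evt evt)
{
	switch (evt) {
	case KBASE_VM_GPU_INITIALIZED_EVT:
		return "KBASE_VM_GPU_INITIALIZED_EVT";
	case KBASE_VM_GPU_STOP_EVT:
		return "KBASE_VM_GPU_STOP_EVT";
	case KBASE_VM_GPU_GRANTED_EVT:
		return "KBASE_VM_GPU_GRANTED_EVT";
	case KBASE_VM_GPU_LOST_EVT:
		return "KBASE_VM_GPU_LOST_EVT";
	case KBASE_VM_GPU_IDLE_EVENT:
		return "KBASE_VM_GPU_IDLE_EVENT";
	case KBASE_VM_REF_EVENT:
		return "KBASE_VM_REF_EVENT";
	case KBASE_VM_OS_SUSPEND_EVENT:
		return "KBASE_VM_OS_SUSPEND_EVENT";
	case KBASE_VM_OS_RESUME_EVENT:
		return "KBASE_VM_OS_RESUME_EVENT";
	}
	/* Values outside the enum (including 0) fall through to here. */
	return "[UnknownEvent]";
}
```

Because the enum starts at 1, a zero-initialized event value is never a valid event and is reported by the fallback.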

@@ -0,0 +1,76 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali structure definitions to support the arbitration feature
*/
#ifndef _MALI_KBASE_ARBITER_DEFS_H_
#define _MALI_KBASE_ARBITER_DEFS_H_
#include "mali_kbase_arbiter_pm.h"
/**
* struct kbase_arbiter_vm_state - Struct representing the state and containing the
* data of pm work
* @kbdev: Pointer to kbase device structure (must be a valid pointer)
* @vm_state_lock: The lock protecting the VM state when arbiter is used.
* This lock must also be held whenever the VM state is being
* transitioned
* @vm_state_wait: Wait queue set when GPU is granted
* @vm_state: Current state of VM
* @vm_arb_wq: Work queue for resuming or stopping work on the GPU for use
* with the Arbiter
* @vm_suspend_work: Work item for vm_arb_wq to stop current work on GPU
* @vm_resume_work: Work item for vm_arb_wq to resume current work on GPU
* @vm_arb_starting: Work queue resume in progress
* @vm_arb_stopping: Work queue suspend in progress
* @interrupts_installed: Flag set when interrupts are installed
* @vm_request_timer: Timer to monitor GPU request
*/
struct kbase_arbiter_vm_state {
struct kbase_device *kbdev;
struct mutex vm_state_lock;
wait_queue_head_t vm_state_wait;
enum kbase_vm_state vm_state;
struct workqueue_struct *vm_arb_wq;
struct work_struct vm_suspend_work;
struct work_struct vm_resume_work;
bool vm_arb_starting;
bool vm_arb_stopping;
bool interrupts_installed;
struct hrtimer vm_request_timer;
};
/**
* struct kbase_arbiter_device - Representing an instance of arbiter device,
* allocated from the probe method of Mali driver
* @arb_if: Pointer to the arbiter interface device
* @arb_dev: Pointer to the arbiter device
* @arb_freq: GPU clock frequency retrieved from arbiter.
*/
struct kbase_arbiter_device {
struct arbiter_if_dev *arb_if;
struct device *arb_dev;
struct kbase_arbiter_freq arb_freq;
};
#endif /* _MALI_KBASE_ARBITER_DEFS_H_ */

File diff suppressed because it is too large

@@ -0,0 +1,192 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali arbiter power manager state machine and APIs
*/
#ifndef _MALI_KBASE_ARBITER_PM_H_
#define _MALI_KBASE_ARBITER_PM_H_
#include "mali_kbase_arbif.h"
/**
* enum kbase_vm_state - Current PM Arbitration state.
*
* @KBASE_VM_STATE_INITIALIZING: Special state before arbiter is initialized.
* @KBASE_VM_STATE_INITIALIZING_WITH_GPU: Initialization after GPU
* has been granted.
* @KBASE_VM_STATE_SUSPENDED: KBase is suspended by OS and GPU is not assigned.
* @KBASE_VM_STATE_STOPPED: GPU is not assigned to KBase and is not required.
* @KBASE_VM_STATE_STOPPED_GPU_REQUESTED: GPU is not assigned to KBase
* but a request has been made.
* @KBASE_VM_STATE_STARTING: GPU is assigned and KBase is getting ready to run.
* @KBASE_VM_STATE_IDLE: GPU is assigned but KBase has no work to do
* @KBASE_VM_STATE_ACTIVE: GPU is assigned and KBase is busy using it
* @KBASE_VM_STATE_SUSPEND_PENDING: OS is going into suspend mode.
* @KBASE_VM_STATE_SUSPEND_WAIT_FOR_GRANT: OS is going into suspend mode but GPU
* has already been requested.
* In this situation we must wait for
* the Arbiter to send a GRANTED message
* and respond immediately with
* a STOPPED message before entering
* the suspend mode.
* @KBASE_VM_STATE_STOPPING_IDLE: Arbiter has sent a stop message and there
* is currently no work to do on the GPU.
* @KBASE_VM_STATE_STOPPING_ACTIVE: Arbiter has sent a stop message when
* KBase has work to do.
*/
enum kbase_vm_state {
KBASE_VM_STATE_INITIALIZING,
KBASE_VM_STATE_INITIALIZING_WITH_GPU,
KBASE_VM_STATE_SUSPENDED,
KBASE_VM_STATE_STOPPED,
KBASE_VM_STATE_STOPPED_GPU_REQUESTED,
KBASE_VM_STATE_STARTING,
KBASE_VM_STATE_IDLE,
KBASE_VM_STATE_ACTIVE,
KBASE_VM_STATE_SUSPEND_PENDING,
KBASE_VM_STATE_SUSPEND_WAIT_FOR_GRANT,
KBASE_VM_STATE_STOPPING_IDLE,
KBASE_VM_STATE_STOPPING_ACTIVE
};
/**
* kbase_arbiter_pm_early_init() - Initialize arbiter for VM Paravirtualized use
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Initialize the arbiter and other required resources during the runtime
* and request the GPU for the VM for the first time.
*
* Return: 0 if successful, otherwise a standard Linux error code
*/
int kbase_arbiter_pm_early_init(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_early_term() - Shutdown arbiter and free resources.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Clean up all the resources
*/
void kbase_arbiter_pm_early_term(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_release_interrupts() - Release the GPU interrupts
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Releases interrupts and sets the interrupt flag to false.
*/
void kbase_arbiter_pm_release_interrupts(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_install_interrupts() - Install the GPU interrupts
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Install interrupts and set the interrupt_install flag to true.
*
* Return: 0 if success, or a Linux error code
*/
int kbase_arbiter_pm_install_interrupts(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_vm_event() - Dispatch VM event to the state machine
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @event: The event to dispatch
*
* The state machine function. Receives events and transitions states
* according to the event received and the current state.
*/
void kbase_arbiter_pm_vm_event(struct kbase_device *kbdev, enum kbase_arbif_evt event);
/**
* kbase_arbiter_pm_ctx_active_handle_suspend() - Handle suspend operation for
* arbitration mode
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @suspend_handler: The handler code for how to handle a suspend
* that might occur
*
* This function handles a suspend event from the driver,
* communicating with the arbiter and waiting synchronously for the GPU
* to be granted again depending on the VM state.
*
* Return: 0 if success, 1 if failure due to system suspending/suspended
*/
int kbase_arbiter_pm_ctx_active_handle_suspend(struct kbase_device *kbdev,
enum kbase_pm_suspend_handler suspend_handler);
/**
* kbase_arbiter_pm_vm_stopped() - Handle stop event for the VM
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* This function handles a stop event for the VM.
* It will update the VM state and forward the stop event to the driver.
*/
void kbase_arbiter_pm_vm_stopped(struct kbase_device *kbdev);
/**
* kbase_arbiter_set_max_config() - Set the max config data in kbase device.
* @kbdev: The kbase device structure for the device (must be a valid pointer).
* @max_l2_slices: The maximum number of L2 slices.
* @max_core_mask: The largest core mask.
*
* This function stores the maximum configuration values (number of L2 slices
* and core mask) in the kbase device.
*/
void kbase_arbiter_set_max_config(struct kbase_device *kbdev, uint32_t max_l2_slices,
uint32_t max_core_mask);
/**
* kbase_arbiter_pm_gpu_assigned() - Determine if this VM has access to the GPU
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Return: 0 if the VM does not have access, 1 if it does, and a negative number
* if an error occurred
*/
int kbase_arbiter_pm_gpu_assigned(struct kbase_device *kbdev);
extern struct kbase_clk_rate_trace_op_conf arb_clk_rate_trace_ops;
/**
* struct kbase_arbiter_freq - Holding the GPU clock frequency data retrieved
* from arbiter
* @arb_freq: GPU clock frequency value
* @arb_freq_lock: Mutex protecting access to the arb_freq value
* @nb: Notifier block to receive rate change callbacks
* @freq_updated: Flag to indicate whether a frequency change has just been
* communicated to avoid "GPU_GRANTED when not expected" warning
*/
struct kbase_arbiter_freq {
uint32_t arb_freq;
struct mutex arb_freq_lock;
struct notifier_block *nb;
bool freq_updated;
};
/**
* kbase_arbiter_pm_update_gpu_freq() - Update GPU frequency
* @arb_freq: Pointer to GPU clock frequency data
* @freq: The new frequency
*
* Updates the GPU frequency and triggers any notifications
*/
void kbase_arbiter_pm_update_gpu_freq(struct kbase_arbiter_freq *arb_freq, uint32_t freq);
#endif /*_MALI_KBASE_ARBITER_PM_H_ */
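The two stopping states documented in this header can be illustrated with a small user-space sketch. It assumes, based only on the state descriptions above, that a stop request received in the IDLE state leads to STOPPING_IDLE and one received in the ACTIVE state leads to STOPPING_ACTIVE. The enum `sketch_vm_state` and the function `sketch_on_gpu_stop()` are hypothetical names and are not part of the driver; the real state machine handles more states and events.

```c
#include <assert.h>

/* Hypothetical, reduced copy of the states relevant to a stop request. */
enum sketch_vm_state {
	SKETCH_VM_STATE_STOPPED,
	SKETCH_VM_STATE_IDLE,
	SKETCH_VM_STATE_ACTIVE,
	SKETCH_VM_STATE_STOPPING_IDLE,
	SKETCH_VM_STATE_STOPPING_ACTIVE,
	SKETCH_VM_STATE_COUNT
};

/*
 * Assumed transition on a stop request from the arbiter:
 *   IDLE   -> STOPPING_IDLE   (no work on the GPU)
 *   ACTIVE -> STOPPING_ACTIVE (work still pending)
 * Every other state is returned unchanged in this sketch.
 */
static enum sketch_vm_state sketch_on_gpu_stop(enum sketch_vm_state cur)
{
	assert(cur < SKETCH_VM_STATE_COUNT);

	switch (cur) {
	case SKETCH_VM_STATE_IDLE:
		return SKETCH_VM_STATE_STOPPING_IDLE;
	case SKETCH_VM_STATE_ACTIVE:
		return SKETCH_VM_STATE_STOPPING_ACTIVE;
	default:
		return cur;
	}
}
```

Keeping the transition in a pure function of the current state makes it easy to test in isolation, which is one reason state machines like this are often written as a single dispatch function.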

View File

@@ -0,0 +1,58 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2014-2022 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
mali_kbase-y += \
backend/gpu/mali_kbase_cache_policy_backend.o \
backend/gpu/mali_kbase_gpuprops_backend.o \
backend/gpu/mali_kbase_irq_linux.o \
backend/gpu/mali_kbase_pm_backend.o \
backend/gpu/mali_kbase_pm_driver.o \
backend/gpu/mali_kbase_pm_metrics.o \
backend/gpu/mali_kbase_pm_ca.o \
backend/gpu/mali_kbase_pm_always_on.o \
backend/gpu/mali_kbase_pm_coarse_demand.o \
backend/gpu/mali_kbase_pm_policy.o \
backend/gpu/mali_kbase_time.o \
backend/gpu/mali_kbase_l2_mmu_config.o \
backend/gpu/mali_kbase_clk_rate_trace_mgr.o
ifeq ($(MALI_USE_CSF),0)
mali_kbase-y += \
backend/gpu/mali_kbase_instr_backend.o \
backend/gpu/mali_kbase_jm_as.o \
backend/gpu/mali_kbase_debug_job_fault_backend.o \
backend/gpu/mali_kbase_jm_hw.o \
backend/gpu/mali_kbase_jm_rb.o \
backend/gpu/mali_kbase_js_backend.o
endif
mali_kbase-$(CONFIG_MALI_DEVFREQ) += \
backend/gpu/mali_kbase_devfreq.o
ifneq ($(CONFIG_MALI_REAL_HW),y)
mali_kbase-y += backend/gpu/mali_kbase_model_linux.o
endif
# NO_MALI Dummy model interface
mali_kbase-$(CONFIG_MALI_NO_MALI) += backend/gpu/mali_kbase_model_dummy.o
# HW error simulation
mali_kbase-$(CONFIG_MALI_NO_MALI) += backend/gpu/mali_kbase_model_error_generator.o

View File

@@ -0,0 +1,64 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include "backend/gpu/mali_kbase_cache_policy_backend.h"
#include <device/mali_kbase_device.h>
void kbase_cache_set_coherency_mode(struct kbase_device *kbdev, u32 mode)
{
kbdev->current_gpu_coherency_mode = mode;
#if MALI_USE_CSF
if (kbdev->gpu_props.gpu_id.arch_id >= GPU_ID_ARCH_MAKE(12, 0, 1)) {
/* AMBA_ENABLE present from 12.0.1 */
u32 val = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE));
val = AMBA_ENABLE_COHERENCY_PROTOCOL_SET(val, mode);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE), val);
} else {
/* Fallback to COHERENCY_ENABLE for older versions */
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(COHERENCY_ENABLE), mode);
}
#else /* MALI_USE_CSF */
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(COHERENCY_ENABLE), mode);
#endif /* MALI_USE_CSF */
}
void kbase_amba_set_shareable_cache_support(struct kbase_device *kbdev)
{
#if MALI_USE_CSF
/* AMBA registers only present from 12.0.1 */
if (kbdev->gpu_props.gpu_id.arch_id < GPU_ID_ARCH_MAKE(12, 0, 1))
return;
if (kbdev->system_coherency != COHERENCY_NONE) {
u32 val = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AMBA_FEATURES));
if (AMBA_FEATURES_SHAREABLE_CACHE_SUPPORT_GET(val)) {
val = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE));
val = AMBA_ENABLE_SHAREABLE_CACHE_SUPPORT_SET(val, 1);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE), val);
}
}
#endif /* MALI_USE_CSF */
}
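The read-modify-write sequence used above for `AMBA_ENABLE` (read the register, replace one field, write the value back) can be sketched in plain C. The field position and width below are hypothetical and do not describe the real `AMBA_ENABLE` layout; `sketch_field_set()` only stands in for the kind of work a `..._SET(val, x)` macro is assumed to do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field: 3 bits wide, located at bits [6:4]. */
#define SKETCH_FIELD_SHIFT 4u
#define SKETCH_FIELD_MASK (UINT32_C(0x7) << SKETCH_FIELD_SHIFT)

/*
 * Return reg_val with the hypothetical field replaced by 'field'.
 * All bits outside the field are preserved.
 */
static uint32_t sketch_field_set(uint32_t reg_val, uint32_t field)
{
	/* The new value must fit into the field. */
	assert((field & ~(SKETCH_FIELD_MASK >> SKETCH_FIELD_SHIFT)) == 0);

	return (reg_val & ~SKETCH_FIELD_MASK) |
	       ((field << SKETCH_FIELD_SHIFT) & SKETCH_FIELD_MASK);
}
```

Clearing the field before OR-ing in the new value is what keeps unrelated bits of the register intact, which is why the driver reads the register first instead of writing the mode directly.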

View File

@@ -0,0 +1,45 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CACHE_POLICY_BACKEND_H_
#define _KBASE_CACHE_POLICY_BACKEND_H_
#include <linux/types.h>
struct kbase_device;
/**
* kbase_cache_set_coherency_mode() - Sets the system coherency mode
* in the GPU.
* @kbdev: Device pointer
* @mode: Coherency mode. COHERENCY_ACE/ACE_LITE
*/
void kbase_cache_set_coherency_mode(struct kbase_device *kbdev, u32 mode);
/**
* kbase_amba_set_shareable_cache_support() - Sets AMBA shareable cache support
* in the GPU.
* @kbdev: Device pointer
*
* Note: Only for arch version 12.x.1 onwards.
*/
void kbase_amba_set_shareable_cache_support(struct kbase_device *kbdev);
#endif /* _KBASE_CACHE_POLICY_BACKEND_H_ */

View File

@@ -0,0 +1,319 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2020-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Implementation of the GPU clock rate trace manager.
*/
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include <linux/clk.h>
#include <linux/pm_opp.h>
#include <asm/div64.h>
#include "backend/gpu/mali_kbase_clk_rate_trace_mgr.h"
#ifdef CONFIG_TRACE_POWER_GPU_FREQUENCY
#include <trace/events/power_gpu_frequency.h>
#else
#include "mali_power_gpu_frequency_trace.h"
#endif
#ifndef CLK_RATE_TRACE_OPS
#define CLK_RATE_TRACE_OPS (NULL)
#endif
/**
* get_clk_rate_trace_callbacks() - Returns pointer to clk trace ops.
* @kbdev: Pointer to kbase device, used to check if arbitration is enabled
* when compiled with arbiter support.
* Return: Pointer to clk trace ops if supported or NULL.
*/
static struct kbase_clk_rate_trace_op_conf *
get_clk_rate_trace_callbacks(__maybe_unused struct kbase_device *kbdev)
{
/* base case */
struct kbase_clk_rate_trace_op_conf *callbacks =
(struct kbase_clk_rate_trace_op_conf *)CLK_RATE_TRACE_OPS;
#if defined(CONFIG_MALI_ARBITER_SUPPORT) && defined(CONFIG_OF)
const void *arbiter_if_node;
if (WARN_ON(!kbdev) || WARN_ON(!kbdev->dev))
return callbacks;
arbiter_if_node = of_get_property(kbdev->dev->of_node, "arbiter-if", NULL);
if (!arbiter_if_node)
arbiter_if_node = of_get_property(kbdev->dev->of_node, "arbiter_if", NULL);
/* Arbitration enabled, override the callback pointer.*/
if (arbiter_if_node)
callbacks = &arb_clk_rate_trace_ops;
else
dev_dbg(kbdev->dev,
"Arbitration supported but disabled by platform. Leaving clk rate callbacks as default.\n");
#endif
return callbacks;
}
static int gpu_clk_rate_change_notifier(struct notifier_block *nb, unsigned long event, void *data)
{
struct kbase_gpu_clk_notifier_data *ndata = data;
struct kbase_clk_data *clk_data =
container_of(nb, struct kbase_clk_data, clk_rate_change_nb);
struct kbase_clk_rate_trace_manager *clk_rtm = clk_data->clk_rtm;
unsigned long flags;
if (WARN_ON_ONCE(clk_data->gpu_clk_handle != ndata->gpu_clk_handle))
return NOTIFY_BAD;
spin_lock_irqsave(&clk_rtm->lock, flags);
if (event == POST_RATE_CHANGE) {
if (!clk_rtm->gpu_idle && (clk_data->clock_val != ndata->new_rate)) {
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index,
ndata->new_rate);
}
clk_data->clock_val = ndata->new_rate;
}
spin_unlock_irqrestore(&clk_rtm->lock, flags);
return NOTIFY_DONE;
}
static int gpu_clk_data_init(struct kbase_device *kbdev, void *gpu_clk_handle, unsigned int index)
{
struct kbase_clk_rate_trace_op_conf *callbacks;
struct kbase_clk_data *clk_data;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
int ret = 0;
callbacks = get_clk_rate_trace_callbacks(kbdev);
if (WARN_ON(!callbacks) || WARN_ON(!gpu_clk_handle) ||
WARN_ON(index >= BASE_MAX_NR_CLOCKS_REGULATORS))
return -EINVAL;
clk_data = kzalloc(sizeof(*clk_data), GFP_KERNEL);
if (!clk_data) {
dev_err(kbdev->dev, "Failed to allocate data for clock enumerated at index %u",
index);
return -ENOMEM;
}
clk_data->index = (u8)index;
clk_data->gpu_clk_handle = gpu_clk_handle;
/* Store the initial value of clock */
clk_data->clock_val = callbacks->get_gpu_clk_rate(kbdev, gpu_clk_handle);
{
/* At the initialization time, GPU is powered off. */
unsigned long flags;
spin_lock_irqsave(&clk_rtm->lock, flags);
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index, 0);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
clk_data->clk_rtm = clk_rtm;
clk_rtm->clks[index] = clk_data;
clk_data->clk_rate_change_nb.notifier_call = gpu_clk_rate_change_notifier;
if (callbacks->gpu_clk_notifier_register)
ret = callbacks->gpu_clk_notifier_register(kbdev, gpu_clk_handle,
&clk_data->clk_rate_change_nb);
if (ret) {
dev_err(kbdev->dev, "Failed to register notifier for clock enumerated at index %u",
index);
/* Do not leave a dangling pointer to the freed clock data */
clk_rtm->clks[index] = NULL;
kfree(clk_data);
}
return ret;
}
int kbase_clk_rate_trace_manager_init(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_op_conf *callbacks;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
int ret = 0;
callbacks = get_clk_rate_trace_callbacks(kbdev);
spin_lock_init(&clk_rtm->lock);
INIT_LIST_HEAD(&clk_rtm->listeners);
/* Return early if no callbacks provided for clock rate tracing */
if (!callbacks) {
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
return 0;
}
clk_rtm->gpu_idle = true;
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
void *gpu_clk_handle = callbacks->enumerate_gpu_clk(kbdev, i);
if (!gpu_clk_handle)
break;
ret = gpu_clk_data_init(kbdev, gpu_clk_handle, i);
if (ret)
goto error;
}
/* Activate clock rate trace manager if at least one GPU clock was
* enumerated.
*/
if (i) {
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, callbacks);
} else {
dev_info(kbdev->dev, "No clock(s) available for rate tracing");
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
}
return 0;
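error:
/* clk_rtm->clk_rate_trace_ops is not set yet, so use the local callbacks */
while (i--) {
if (callbacks->gpu_clk_notifier_unregister)
callbacks->gpu_clk_notifier_unregister(
kbdev, clk_rtm->clks[i]->gpu_clk_handle,
&clk_rtm->clks[i]->clk_rate_change_nb);
kfree(clk_rtm->clks[i]);
clk_rtm->clks[i] = NULL;
}
return ret;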
}
void kbase_clk_rate_trace_manager_term(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
WARN_ON(!list_empty(&clk_rtm->listeners));
if (!clk_rtm->clk_rate_trace_ops)
return;
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
if (!clk_rtm->clks[i])
break;
if (clk_rtm->clk_rate_trace_ops->gpu_clk_notifier_unregister)
clk_rtm->clk_rate_trace_ops->gpu_clk_notifier_unregister(
kbdev, clk_rtm->clks[i]->gpu_clk_handle,
&clk_rtm->clks[i]->clk_rate_change_nb);
kfree(clk_rtm->clks[i]);
}
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
}
void kbase_clk_rate_trace_manager_gpu_active(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
unsigned long flags;
if (!clk_rtm->clk_rate_trace_ops)
return;
spin_lock_irqsave(&clk_rtm->lock, flags);
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
struct kbase_clk_data *clk_data = clk_rtm->clks[i];
if (!clk_data)
break;
if (unlikely(!clk_data->clock_val))
continue;
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index,
clk_data->clock_val);
}
clk_rtm->gpu_idle = false;
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
void kbase_clk_rate_trace_manager_gpu_idle(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
unsigned long flags;
if (!clk_rtm->clk_rate_trace_ops)
return;
spin_lock_irqsave(&clk_rtm->lock, flags);
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
struct kbase_clk_data *clk_data = clk_rtm->clks[i];
if (!clk_data)
break;
if (unlikely(!clk_data->clock_val))
continue;
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index, 0);
}
clk_rtm->gpu_idle = true;
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
void kbase_clk_rate_trace_manager_notify_all(struct kbase_clk_rate_trace_manager *clk_rtm,
u32 clk_index, unsigned long new_rate)
{
struct kbase_clk_rate_listener *pos;
struct kbase_device *kbdev;
lockdep_assert_held(&clk_rtm->lock);
kbdev = container_of(clk_rtm, struct kbase_device, pm.clk_rtm);
dev_dbg(kbdev->dev, "%s - GPU clock %u rate changed to %lu, pid: %d", __func__, clk_index,
new_rate, current->pid);
/* Raise standard `power/gpu_frequency` ftrace event */
{
unsigned long new_rate_khz = new_rate;
#if BITS_PER_LONG == 64
do_div(new_rate_khz, 1000);
#elif BITS_PER_LONG == 32
new_rate_khz /= 1000;
#else
#error "unsigned long division is not supported for this architecture"
#endif
trace_gpu_frequency(new_rate_khz, clk_index);
}
/* Notify the listeners. */
list_for_each_entry(pos, &clk_rtm->listeners, node) {
pos->notify(pos, clk_index, new_rate);
}
}
KBASE_EXPORT_TEST_API(kbase_clk_rate_trace_manager_notify_all);

View File

@@ -0,0 +1,150 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2020-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CLK_RATE_TRACE_MGR_
#define _KBASE_CLK_RATE_TRACE_MGR_
/* The index of top clock domain in kbase_clk_rate_trace_manager:clks. */
#define KBASE_CLOCK_DOMAIN_TOP (0)
/* The index of shader-cores clock domain in
* kbase_clk_rate_trace_manager:clks.
*/
#define KBASE_CLOCK_DOMAIN_SHADER_CORES (1)
/**
* struct kbase_clk_data - Data stored per enumerated GPU clock.
*
* @clk_rtm: Pointer to clock rate trace manager object.
* @gpu_clk_handle: Handle unique to the enumerated GPU clock.
* @plat_private: Private data for the platform to store into
* @clk_rate_change_nb: notifier block containing the pointer to callback
* function that is invoked whenever the rate of
* enumerated GPU clock changes.
* @clock_val: Current rate of the enumerated GPU clock.
* @index: Index at which the GPU clock was enumerated.
*/
struct kbase_clk_data {
struct kbase_clk_rate_trace_manager *clk_rtm;
void *gpu_clk_handle;
void *plat_private;
struct notifier_block clk_rate_change_nb;
unsigned long clock_val;
u8 index;
};
/**
* kbase_clk_rate_trace_manager_init - Initialize GPU clock rate trace manager.
*
* @kbdev: Device pointer
*
* Return: 0 if success, or an error code on failure.
*/
int kbase_clk_rate_trace_manager_init(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_term - Terminate GPU clock rate trace manager.
*
* @kbdev: Device pointer
*/
void kbase_clk_rate_trace_manager_term(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_gpu_active - Inform GPU clock rate trace
* manager of GPU becoming active.
*
* @kbdev: Device pointer
*/
void kbase_clk_rate_trace_manager_gpu_active(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_gpu_idle - Inform GPU clock rate trace
* manager of GPU becoming idle.
* @kbdev: Device pointer
*/
void kbase_clk_rate_trace_manager_gpu_idle(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_subscribe_no_lock() - Add freq change listener.
*
* @clk_rtm: Clock rate manager instance.
* @listener: Listener handle
*
* kbase_clk_rate_trace_manager:lock must be held by the caller.
*/
static inline void
kbase_clk_rate_trace_manager_subscribe_no_lock(struct kbase_clk_rate_trace_manager *clk_rtm,
struct kbase_clk_rate_listener *listener)
{
lockdep_assert_held(&clk_rtm->lock);
list_add(&listener->node, &clk_rtm->listeners);
}
/**
* kbase_clk_rate_trace_manager_subscribe() - Add freq change listener.
*
* @clk_rtm: Clock rate manager instance.
* @listener: Listener handle
*/
static inline void
kbase_clk_rate_trace_manager_subscribe(struct kbase_clk_rate_trace_manager *clk_rtm,
struct kbase_clk_rate_listener *listener)
{
unsigned long flags;
spin_lock_irqsave(&clk_rtm->lock, flags);
kbase_clk_rate_trace_manager_subscribe_no_lock(clk_rtm, listener);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
/**
* kbase_clk_rate_trace_manager_unsubscribe() - Remove freq change listener.
*
* @clk_rtm: Clock rate manager instance.
* @listener: Listener handle
*/
static inline void
kbase_clk_rate_trace_manager_unsubscribe(struct kbase_clk_rate_trace_manager *clk_rtm,
struct kbase_clk_rate_listener *listener)
{
unsigned long flags;
spin_lock_irqsave(&clk_rtm->lock, flags);
list_del(&listener->node);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
/**
* kbase_clk_rate_trace_manager_notify_all() - Notify all clock
* rate listeners.
*
* @clk_rtm: Clock rate manager instance.
* @clock_index: Clock index.
* @new_rate: New clock frequency(Hz)
*
* kbase_clk_rate_trace_manager:lock must be locked.
* This function is exported to be used by clock rate trace test
* portal.
*/
void kbase_clk_rate_trace_manager_notify_all(struct kbase_clk_rate_trace_manager *clk_rtm,
u32 clock_index, unsigned long new_rate);
#endif /* _KBASE_CLK_RATE_TRACE_MGR_ */
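The subscribe/notify behaviour declared in this header can be modelled in user space. The sketch below mirrors the logic visible in the manager's source: listeners see a rate of 0 when the GPU goes idle, the stored rate when it becomes active, and rate changes only while the GPU is not idle. All `sketch_*` names are hypothetical, a single clock is assumed, and the spinlock that the driver holds around these operations is omitted.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_listener {
	void (*notify)(struct sketch_listener *self, unsigned int clk_index,
		       unsigned long rate_hz);
	struct sketch_listener *next;
	unsigned long last_rate_hz; /* written by sketch_record() */
};

struct sketch_manager {
	struct sketch_listener *listeners;
	int gpu_idle;
	unsigned long clock_val; /* last known rate in Hz */
};

/* Example callback: remember the most recent rate that was reported. */
static void sketch_record(struct sketch_listener *self, unsigned int clk_index,
			  unsigned long rate_hz)
{
	(void)clk_index;
	self->last_rate_hz = rate_hz;
}

static void sketch_subscribe(struct sketch_manager *m, struct sketch_listener *l)
{
	assert(m != NULL && l != NULL && l->notify != NULL);
	l->next = m->listeners;
	m->listeners = l;
}

static void sketch_notify_all(struct sketch_manager *m, unsigned long rate_hz)
{
	struct sketch_listener *pos;

	for (pos = m->listeners; pos != NULL; pos = pos->next)
		pos->notify(pos, 0, rate_hz);
}

/* Rate changes are stored always, but reported only while the GPU is active. */
static void sketch_rate_change(struct sketch_manager *m, unsigned long new_rate_hz)
{
	if (!m->gpu_idle && m->clock_val != new_rate_hz)
		sketch_notify_all(m, new_rate_hz);
	m->clock_val = new_rate_hz;
}

static void sketch_gpu_active(struct sketch_manager *m)
{
	if (m->clock_val != 0)
		sketch_notify_all(m, m->clock_val);
	m->gpu_idle = 0;
}

static void sketch_gpu_idle(struct sketch_manager *m)
{
	if (m->clock_val != 0)
		sketch_notify_all(m, 0);
	m->gpu_idle = 1;
}
```

A caller would fill in a `struct sketch_listener` with its own callback, subscribe it once, and then rely on the manager to report 0 Hz whenever the GPU is idle.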

View File

@@ -0,0 +1,167 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <device/mali_kbase_device.h>
#include <hw_access/mali_kbase_hw_access.h>
#include "mali_kbase_debug_job_fault.h"
#if IS_ENABLED(CONFIG_DEBUG_FS)
/* GPU_CONTROL_REG(r) */
static unsigned int gpu_control_reg_snapshot[] = { GPU_CONTROL_ENUM(GPU_ID),
GPU_CONTROL_ENUM(SHADER_READY),
GPU_CONTROL_ENUM(TILER_READY),
GPU_CONTROL_ENUM(L2_READY) };
/* JOB_CONTROL_REG(r) */
static unsigned int job_control_reg_snapshot[] = { JOB_CONTROL_ENUM(JOB_IRQ_MASK),
JOB_CONTROL_ENUM(JOB_IRQ_STATUS) };
/* JOB_SLOT_REG(n,r) */
static unsigned int job_slot_reg_snapshot[] = {
JOB_SLOT_ENUM(0, HEAD) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, TAIL) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, AFFINITY) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, CONFIG) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, STATUS) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, HEAD_NEXT) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, AFFINITY_NEXT) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, CONFIG_NEXT) - JOB_SLOT_BASE_ENUM(0)
};
/* MMU_CONTROL_REG(r) */
static unsigned int mmu_reg_snapshot[] = { MMU_CONTROL_ENUM(IRQ_MASK),
MMU_CONTROL_ENUM(IRQ_STATUS) };
/* MMU_AS_REG(n,r) */
static unsigned int as_reg_snapshot[] = { MMU_AS_ENUM(0, TRANSTAB) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, TRANSCFG) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, MEMATTR) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, FAULTSTATUS) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, FAULTADDRESS) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, STATUS) - MMU_AS_BASE_ENUM(0) };
bool kbase_debug_job_fault_reg_snapshot_init(struct kbase_context *kctx, int reg_range)
{
uint i, j;
int offset = 0;
uint slot_number;
uint as_number;
if (kctx->reg_dump == NULL)
return false;
slot_number = kctx->kbdev->gpu_props.num_job_slots;
as_number = kctx->kbdev->gpu_props.num_address_spaces;
/* get the GPU control registers*/
for (i = 0; i < ARRAY_SIZE(gpu_control_reg_snapshot); i++) {
kctx->reg_dump[offset] = gpu_control_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
/* get the Job control registers*/
for (i = 0; i < ARRAY_SIZE(job_control_reg_snapshot); i++) {
kctx->reg_dump[offset] = job_control_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
/* get the Job Slot registers*/
for (j = 0; j < slot_number; j++) {
for (i = 0; i < ARRAY_SIZE(job_slot_reg_snapshot); i++) {
kctx->reg_dump[offset] = JOB_SLOT_BASE_OFFSET(j) + job_slot_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
}
/* get the MMU registers*/
for (i = 0; i < ARRAY_SIZE(mmu_reg_snapshot); i++) {
kctx->reg_dump[offset] = mmu_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
/* get the Address space registers*/
for (j = 0; j < as_number; j++) {
for (i = 0; i < ARRAY_SIZE(as_reg_snapshot); i++) {
kctx->reg_dump[offset] = MMU_AS_BASE_OFFSET(j) + as_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
}
WARN_ON(offset >= (reg_range * 2 / 4));
/* set the termination flag*/
kctx->reg_dump[offset] = REGISTER_DUMP_TERMINATION_FLAG;
kctx->reg_dump[offset + 1] = REGISTER_DUMP_TERMINATION_FLAG;
dev_dbg(kctx->kbdev->dev, "%s:%d\n", __func__, offset);
return true;
}
bool kbase_job_fault_get_reg_snapshot(struct kbase_context *kctx)
{
int offset = 0;
u32 reg_enum;
u64 val64;
if (kctx->reg_dump == NULL)
return false;
while (kctx->reg_dump[offset] != REGISTER_DUMP_TERMINATION_FLAG) {
reg_enum = kctx->reg_dump[offset];
/* Get register offset from enum */
kbase_reg_get_offset(kctx->kbdev, reg_enum, &kctx->reg_dump[offset]);
if (kbase_reg_is_size64(kctx->kbdev, reg_enum)) {
val64 = kbase_reg_read64(kctx->kbdev, reg_enum);
/* Add 4 to the computed offset to get the _HI register offset */
kctx->reg_dump[offset + 2] = kctx->reg_dump[offset] + 4;
kctx->reg_dump[offset + 1] = (u32)(val64 & 0xFFFFFFFF);
kctx->reg_dump[offset + 3] = (u32)(val64 >> 32);
offset += 4;
} else {
kctx->reg_dump[offset + 1] = kbase_reg_read32(kctx->kbdev, reg_enum);
offset += 2;
}
}
return true;
}
#endif
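The dump buffer filled in above is a flat array of (register offset, value) pairs that ends with a termination marker; a 64-bit register occupies two consecutive pairs, with the second pair addressing the high word at offset + 4. A user-space sketch of that layout follows. The sentinel value `SKETCH_TERMINATION_FLAG` is an assumption (the driver's `REGISTER_DUMP_TERMINATION_FLAG` is defined elsewhere), and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel; the driver defines its own termination flag elsewhere. */
#define SKETCH_TERMINATION_FLAG UINT32_C(0xFFFFFFFF)

/* Store a 64-bit register as two pairs: (offset, low word), (offset + 4, high word). */
static size_t sketch_put_reg64(uint32_t *dump, size_t pos, uint32_t offset, uint64_t val)
{
	assert(dump != NULL);
	dump[pos] = offset;
	dump[pos + 1] = (uint32_t)(val & 0xFFFFFFFF);
	dump[pos + 2] = offset + 4;
	dump[pos + 3] = (uint32_t)(val >> 32);
	return pos + 4;
}

/* Count the (offset, value) pairs that precede the termination marker. */
static size_t sketch_count_dump_pairs(const uint32_t *dump, size_t max_words)
{
	size_t i = 0;

	assert(dump != NULL);
	while (i + 1 < max_words && dump[i] != SKETCH_TERMINATION_FLAG)
		i += 2;
	return i / 2;
}
```

Because every entry is a pair, a consumer can walk the buffer two words at a time without knowing which registers were 64 bits wide.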

View File

@@ -0,0 +1,135 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific definitions
*/
#ifndef _KBASE_HWACCESS_GPU_DEFS_H_
#define _KBASE_HWACCESS_GPU_DEFS_H_
/* SLOT_RB_SIZE must be < 256 */
#define SLOT_RB_SIZE 2
#define SLOT_RB_MASK (SLOT_RB_SIZE - 1)
/**
* struct rb_entry - Ringbuffer entry
* @katom: Atom associated with this entry
*/
struct rb_entry {
struct kbase_jd_atom *katom;
};
/* SLOT_RB_TAG_PURGED assumes a value that is different from
* NULL (SLOT_RB_NULL_TAG_VAL) and will not be the result of
* any valid pointer via macro translation: SLOT_RB_TAG_KCTX(x).
*/
#define SLOT_RB_TAG_PURGED ((u64)(1 << 1))
#define SLOT_RB_NULL_TAG_VAL ((u64)0)
/**
* SLOT_RB_TAG_KCTX() - a function-like macro for converting a pointer to a
* u64 for serving as tagged value.
* @kctx: Pointer to kbase context.
*/
#define SLOT_RB_TAG_KCTX(kctx) ((u64)(uintptr_t)(kctx))
/**
* struct slot_rb - Slot ringbuffer
* @entries: Ringbuffer entries
* @last_kctx_tagged: The last context that submitted a job to the slot's
* HEAD_NEXT register. The value is a tagged variant so
* must not be dereferenced. It is used in operation to
* track when shader core L1 caches might contain a
* previous context's data, and so must only be set to
* SLOT_RB_NULL_TAG_VAL after reset/powerdown of the
* cores. In slot job submission, if there is a kctx
* change, and the relevant katom is configured with
* BASE_JD_REQ_SKIP_CACHE_START, an L1 read-only cache
* maintenance operation is enforced.
* @read_idx: Current read index of buffer
* @write_idx: Current write index of buffer
* @job_chain_flag: Flag used to implement jobchain disambiguation
*/
struct slot_rb {
struct rb_entry entries[SLOT_RB_SIZE];
u64 last_kctx_tagged;
u8 read_idx;
u8 write_idx;
u8 job_chain_flag;
};
/**
* struct kbase_backend_data - GPU backend specific data for HW access layer
* @slot_rb: Slot ringbuffers
* @scheduling_timer: The timer tick used for rescheduling jobs
* @timer_running: Is the timer running? The runpool_mutex must be
* held whilst modifying this.
* @suspend_timer: Is the timer suspended? Set when a suspend
* occurs and cleared on resume. The runpool_mutex
* must be held whilst modifying this.
* @reset_gpu: Set to a KBASE_RESET_xxx value (see comments)
* @reset_workq: Work queue for performing the reset
* @reset_work: Work item for performing the reset
* @reset_wait: Wait event signalled when the reset is complete
* @reset_timer: Timeout for soft-stops before the reset
* @timeouts_updated: Have timeout values just been updated?
*
* The hwaccess_lock (a spinlock) must be held when accessing this structure
*/
struct kbase_backend_data {
#if !MALI_USE_CSF
struct slot_rb slot_rb[BASE_JM_MAX_NR_SLOTS];
struct hrtimer scheduling_timer;
bool timer_running;
#endif
bool suspend_timer;
atomic_t reset_gpu;
/* The GPU reset isn't pending */
#define KBASE_RESET_GPU_NOT_PENDING 0
/* kbase_prepare_to_reset_gpu has been called */
#define KBASE_RESET_GPU_PREPARED 1
/* kbase_reset_gpu has been called - the reset will now definitely happen
* within the timeout period
*/
#define KBASE_RESET_GPU_COMMITTED 2
/* The GPU reset process is currently occurring (timeout has expired or
* kbasep_try_reset_gpu_early was called)
*/
#define KBASE_RESET_GPU_HAPPENING 3
/* Reset the GPU silently, used when resetting the GPU as part of normal
* behavior (e.g. when exiting protected mode).
*/
#define KBASE_RESET_GPU_SILENT 4
struct workqueue_struct *reset_workq;
struct work_struct reset_work;
wait_queue_head_t reset_wait;
struct hrtimer reset_timer;
bool timeouts_updated;
};
#endif /* _KBASE_HWACCESS_GPU_DEFS_H_ */
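Reviewer sketch for the `last_kctx_tagged` field documented above. This is a user-space illustration, not driver code: `struct fake_kctx`, `slot_kctx_changed()` and `slot_tag_selfcheck()` are hypothetical names introduced here, and only the three `SLOT_RB_*` macros are copied from the header. It shows one way a context change could be detected by comparing tags, without ever dereferencing the stored value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

/* Copied from the header above. */
#define SLOT_RB_TAG_PURGED ((u64)(1 << 1))
#define SLOT_RB_NULL_TAG_VAL ((u64)0)
#define SLOT_RB_TAG_KCTX(kctx) ((u64)(uintptr_t)(kctx))

/* Hypothetical stand-in for struct kbase_context. */
struct fake_kctx {
	int id;
};

/*
 * Hypothetical helper: report whether the context about to submit differs
 * from the one recorded in the tag. Only the numeric tag is compared, so a
 * stale tag for a freed context is never dereferenced.
 */
static bool slot_kctx_changed(u64 last_kctx_tagged, const struct fake_kctx *kctx)
{
	return last_kctx_tagged != SLOT_RB_TAG_KCTX(kctx);
}

/* Self-check exercising the helper; aborts via assert() on a mismatch. */
static void slot_tag_selfcheck(void)
{
	struct fake_kctx a = { 1 }, b = { 2 };
	u64 last = SLOT_RB_NULL_TAG_VAL; /* state after reset/powerdown */

	assert(slot_kctx_changed(last, &a));  /* first submission after reset */
	last = SLOT_RB_TAG_KCTX(&a);
	assert(!slot_kctx_changed(last, &a)); /* same context again */
	assert(slot_kctx_changed(last, &b));  /* different context */
	last = SLOT_RB_TAG_PURGED;
	assert(slot_kctx_changed(last, &b));  /* purged tag matches no context */
}
```

The purged value works as a sentinel here because an aligned object address is never equal to 2, which is the property the header comment relies on.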

@@ -0,0 +1,750 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <tl/mali_kbase_tracepoints.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <linux/of.h>
#include <linux/clk.h>
#include <linux/devfreq.h>
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
#include <linux/devfreq_cooling.h>
#endif
#include <linux/version.h>
#include <linux/pm_opp.h>
#include "mali_kbase_devfreq.h"
/**
* get_voltage() - Get the voltage value corresponding to the nominal frequency
* used by devfreq.
* @kbdev: Device pointer
* @freq: Nominal frequency in Hz passed by devfreq.
*
* This function is called only when no OPP table compatible with
* "operating-points-v2-mali" is present in the devicetree for the GPU device.
*
* Return: Voltage value in microvolts, 0 in case of error.
*/
static unsigned long get_voltage(struct kbase_device *kbdev, unsigned long freq)
{
struct dev_pm_opp *opp;
unsigned long voltage = 0;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_lock();
#endif
opp = dev_pm_opp_find_freq_exact(kbdev->dev, freq, true);
if (IS_ERR_OR_NULL(opp))
dev_err(kbdev->dev, "Failed to get opp (%d)\n", PTR_ERR_OR_ZERO(opp));
else {
voltage = dev_pm_opp_get_voltage(opp);
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
dev_pm_opp_put(opp);
#endif
}
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_unlock();
#endif
/* Return the voltage in microvolts */
return voltage;
}
void kbase_devfreq_opp_translate(struct kbase_device *kbdev, unsigned long freq, u64 *core_mask,
unsigned long *freqs, unsigned long *volts)
{
unsigned int i;
for (i = 0; i < kbdev->num_opps; i++) {
if (kbdev->devfreq_table[i].opp_freq == freq) {
unsigned int j;
*core_mask = kbdev->devfreq_table[i].core_mask;
for (j = 0; j < kbdev->nr_clocks; j++) {
freqs[j] = kbdev->devfreq_table[i].real_freqs[j];
volts[j] = kbdev->devfreq_table[i].opp_volts[j];
}
break;
}
}
/* If no matching OPP was found, return all cores enabled, the nominal
* frequency and the corresponding voltage.
*/
if (i == kbdev->num_opps) {
unsigned long voltage = get_voltage(kbdev, freq);
*core_mask = kbdev->gpu_props.shader_present;
for (i = 0; i < kbdev->nr_clocks; i++) {
freqs[i] = freq;
volts[i] = voltage;
}
}
}
static int kbase_devfreq_target(struct device *dev, unsigned long *target_freq, u32 flags)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
struct dev_pm_opp *opp;
unsigned long nominal_freq;
unsigned long freqs[BASE_MAX_NR_CLOCKS_REGULATORS] = { 0 };
#if IS_ENABLED(CONFIG_REGULATOR)
unsigned long original_freqs[BASE_MAX_NR_CLOCKS_REGULATORS] = { 0 };
#endif
unsigned long volts[BASE_MAX_NR_CLOCKS_REGULATORS] = { 0 };
unsigned int i;
u64 core_mask;
nominal_freq = *target_freq;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_lock();
#endif
opp = devfreq_recommended_opp(dev, &nominal_freq, flags);
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_unlock();
#endif
if (IS_ERR_OR_NULL(opp)) {
dev_err(dev, "Failed to get opp (%d)\n", PTR_ERR_OR_ZERO(opp));
return IS_ERR(opp) ? PTR_ERR(opp) : -ENODEV;
}
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
dev_pm_opp_put(opp);
#endif
/*
* Only update if there is a change of frequency
*/
if (kbdev->current_nominal_freq == nominal_freq) {
*target_freq = nominal_freq;
return 0;
}
kbase_devfreq_opp_translate(kbdev, nominal_freq, &core_mask, freqs, volts);
#if IS_ENABLED(CONFIG_REGULATOR)
/* Regulators and clocks work in pairs: every clock has a regulator,
* and we never expect to have more regulators than clocks.
*
* We always need to increase the voltage before increasing the number
* of shader cores and the frequency of a regulator/clock pair,
* otherwise the clock wouldn't have enough power to perform
* the transition.
*
* It's always safer to decrease the number of shader cores and
* the frequency before decreasing voltage of a regulator/clock pair,
* otherwise the clock could have problematic operation if it is
* deprived of the necessary power to sustain its current frequency
* (even if that happens for a short transition interval).
*/
for (i = 0; i < kbdev->nr_clocks; i++) {
if (kbdev->regulators[i] && kbdev->current_voltages[i] != volts[i] &&
kbdev->current_freqs[i] < freqs[i]) {
int err;
err = regulator_set_voltage(kbdev->regulators[i], volts[i], volts[i]);
if (!err) {
kbdev->current_voltages[i] = volts[i];
} else {
dev_err(dev, "Failed to increase voltage (%d) (target %lu)\n", err,
volts[i]);
return err;
}
}
}
#endif
for (i = 0; i < kbdev->nr_clocks; i++) {
if (kbdev->clocks[i]) {
int err;
err = clk_set_rate(kbdev->clocks[i], freqs[i]);
if (!err) {
#if IS_ENABLED(CONFIG_REGULATOR)
original_freqs[i] = kbdev->current_freqs[i];
#endif
kbdev->current_freqs[i] = freqs[i];
} else {
dev_err(dev, "Failed to set clock %lu (target %lu)\n", freqs[i],
*target_freq);
return err;
}
}
}
kbase_devfreq_set_core_mask(kbdev, core_mask);
#if IS_ENABLED(CONFIG_REGULATOR)
for (i = 0; i < kbdev->nr_clocks; i++) {
if (kbdev->regulators[i] && kbdev->current_voltages[i] != volts[i] &&
original_freqs[i] > freqs[i]) {
int err;
err = regulator_set_voltage(kbdev->regulators[i], volts[i], volts[i]);
if (!err) {
kbdev->current_voltages[i] = volts[i];
} else {
dev_err(dev, "Failed to decrease voltage (%d) (target %lu)\n", err,
volts[i]);
return err;
}
}
}
#endif
*target_freq = nominal_freq;
kbdev->current_nominal_freq = nominal_freq;
kbdev->current_core_mask = core_mask;
KBASE_TLSTREAM_AUX_DEVFREQ_TARGET(kbdev, (u64)nominal_freq);
return 0;
}
void kbase_devfreq_force_freq(struct kbase_device *kbdev, unsigned long freq)
{
unsigned long target_freq = freq;
kbase_devfreq_target(kbdev->dev, &target_freq, 0);
}
static int kbase_devfreq_cur_freq(struct device *dev, unsigned long *freq)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
*freq = kbdev->current_nominal_freq;
return 0;
}
static int kbase_devfreq_status(struct device *dev, struct devfreq_dev_status *stat)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
struct kbasep_pm_metrics diff;
kbase_pm_get_dvfs_metrics(kbdev, &kbdev->last_devfreq_metrics, &diff);
stat->busy_time = diff.time_busy;
stat->total_time = diff.time_busy + diff.time_idle;
stat->current_frequency = kbdev->current_nominal_freq;
stat->private_data = NULL;
#if MALI_USE_CSF && defined CONFIG_DEVFREQ_THERMAL
kbase_ipa_reset_data(kbdev);
#endif
return 0;
}
static int kbase_devfreq_init_freq_table(struct kbase_device *kbdev, struct devfreq_dev_profile *dp)
{
int count;
unsigned int i = 0;
unsigned long freq;
struct dev_pm_opp *opp;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_lock();
#endif
count = dev_pm_opp_get_opp_count(kbdev->dev);
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_unlock();
#endif
if (count < 0)
return count;
dp->freq_table = kmalloc_array((size_t)count, sizeof(dp->freq_table[0]), GFP_KERNEL);
if (!dp->freq_table)
return -ENOMEM;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_lock();
#endif
for (i = 0, freq = ULONG_MAX; i < (unsigned int)count; i++, freq--) {
opp = dev_pm_opp_find_freq_floor(kbdev->dev, &freq);
if (IS_ERR(opp))
break;
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
dev_pm_opp_put(opp);
#endif /* KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE */
dp->freq_table[i] = freq;
}
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_unlock();
#endif
if ((unsigned int)count != i)
dev_warn(kbdev->dev, "Unable to enumerate all OPPs (%d != %u)\n", count, i);
dp->max_state = i;
/* Have the lowest clock as suspend clock.
* It may be overridden by 'opp-mali-errata-1485982'.
*/
if (kbdev->pm.backend.gpu_clock_slow_down_wa) {
freq = 0;
opp = dev_pm_opp_find_freq_ceil(kbdev->dev, &freq);
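if (IS_ERR(opp)) {
dev_err(kbdev->dev, "failed to find slowest clock\n");
return 0;
}
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
/* Drop the reference taken by dev_pm_opp_find_freq_ceil() */
dev_pm_opp_put(opp);
#endif
dev_info(kbdev->dev, "suspend clock %lu from slowest\n", freq);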
kbdev->pm.backend.gpu_clock_suspend_freq = freq;
}
return 0;
}
static void kbase_devfreq_term_freq_table(struct kbase_device *kbdev)
{
struct devfreq_dev_profile *dp = &kbdev->devfreq_profile;
kfree(dp->freq_table);
dp->freq_table = NULL;
}
static void kbase_devfreq_term_core_mask_table(struct kbase_device *kbdev)
{
kfree(kbdev->devfreq_table);
kbdev->devfreq_table = NULL;
}
static void kbase_devfreq_exit(struct device *dev)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
if (kbdev)
kbase_devfreq_term_freq_table(kbdev);
}
static void kbasep_devfreq_read_suspend_clock(struct kbase_device *kbdev, struct device_node *node)
{
u64 freq = 0;
int err = 0;
/* Check if this node is the opp entry having 'opp-mali-errata-1485982'
* to get the suspend clock, otherwise skip it.
*/
if (!of_property_read_bool(node, "opp-mali-errata-1485982"))
return;
/* In kbase DevFreq, the clock will be read from 'opp-hz'
* and translated into the actual clock by opp_translate.
*
* In customer DVFS, the clock will be read from 'opp-hz-real'
* for clk driver. If 'opp-hz-real' does not exist,
* read from 'opp-hz'.
*/
if (IS_ENABLED(CONFIG_MALI_DEVFREQ))
err = of_property_read_u64(node, "opp-hz", &freq);
else {
if (of_property_read_u64(node, "opp-hz-real", &freq))
err = of_property_read_u64(node, "opp-hz", &freq);
}
if (WARN_ON(err || !freq))
return;
kbdev->pm.backend.gpu_clock_suspend_freq = freq;
dev_info(kbdev->dev, "suspend clock %llu by opp-mali-errata-1485982", freq);
}
static int kbase_devfreq_init_core_mask_table(struct kbase_device *kbdev)
{
#ifndef CONFIG_OF
/* OPP table initialization requires at least the capability to get
* regulators and clocks from the device tree, as well as parsing
* arrays of unsigned integer values.
*
* The whole initialization process shall simply be skipped if the
* minimum capability is not available.
*/
return 0;
#else
struct device_node *opp_node =
of_parse_phandle(kbdev->dev->of_node, "operating-points-v2", 0);
struct device_node *node;
unsigned int i = 0;
int count;
u64 shader_present = kbdev->gpu_props.shader_present;
if (!opp_node)
return 0;
if (!of_device_is_compatible(opp_node, "operating-points-v2-mali"))
return 0;
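count = dev_pm_opp_get_opp_count(kbdev->dev);
/* A negative count is an error code; do not use it as an allocation size */
if (count < 0)
return count;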
kbdev->devfreq_table =
kmalloc_array((size_t)count, sizeof(struct kbase_devfreq_opp), GFP_KERNEL);
if (!kbdev->devfreq_table)
return -ENOMEM;
for_each_available_child_of_node(opp_node, node) {
const void *core_count_p;
u64 core_mask, opp_freq, real_freqs[BASE_MAX_NR_CLOCKS_REGULATORS];
int err;
#if IS_ENABLED(CONFIG_REGULATOR)
u32 opp_volts[BASE_MAX_NR_CLOCKS_REGULATORS];
#endif
/* Read suspend clock from opp table */
if (kbdev->pm.backend.gpu_clock_slow_down_wa)
kbasep_devfreq_read_suspend_clock(kbdev, node);
err = of_property_read_u64(node, "opp-hz", &opp_freq);
if (err) {
dev_warn(kbdev->dev, "Failed to read opp-hz property with error %d\n", err);
continue;
}
#if BASE_MAX_NR_CLOCKS_REGULATORS > 1
err = of_property_read_u64_array(node, "opp-hz-real", real_freqs, kbdev->nr_clocks);
#else
WARN_ON(kbdev->nr_clocks != 1);
err = of_property_read_u64(node, "opp-hz-real", real_freqs);
#endif
if (err < 0) {
dev_warn(kbdev->dev, "Failed to read opp-hz-real property with error %d\n",
err);
continue;
}
#if IS_ENABLED(CONFIG_REGULATOR)
err = of_property_read_u32_array(node, "opp-microvolt", opp_volts,
kbdev->nr_regulators);
if (err < 0) {
dev_warn(kbdev->dev,
"Failed to read opp-microvolt property with error %d\n", err);
continue;
}
#endif
if (of_property_read_u64(node, "opp-core-mask", &core_mask))
core_mask = shader_present;
if (core_mask != shader_present && corestack_driver_control) {
dev_warn(
kbdev->dev,
"Ignoring OPP %llu - Dynamic Core Scaling not supported on this GPU\n",
opp_freq);
continue;
}
core_count_p = of_get_property(node, "opp-core-count", NULL);
if (core_count_p) {
u64 remaining_core_mask = kbdev->gpu_props.shader_present;
int core_count = be32_to_cpup(core_count_p);
core_mask = 0;
for (; core_count > 0; core_count--) {
int core = ffs(remaining_core_mask);
if (!core) {
dev_err(kbdev->dev, "OPP has more cores than GPU\n");
return -ENODEV;
}
core_mask |= (1ull << (core - 1));
remaining_core_mask &= ~(1ull << (core - 1));
}
}
if (!core_mask) {
dev_err(kbdev->dev, "OPP has invalid core mask of 0\n");
return -ENODEV;
}
kbdev->devfreq_table[i].opp_freq = opp_freq;
kbdev->devfreq_table[i].core_mask = core_mask;
if (kbdev->nr_clocks > 0) {
unsigned int j;
for (j = 0; j < kbdev->nr_clocks; j++)
kbdev->devfreq_table[i].real_freqs[j] = real_freqs[j];
}
#if IS_ENABLED(CONFIG_REGULATOR)
if (kbdev->nr_regulators > 0) {
unsigned int j;
for (j = 0; j < kbdev->nr_regulators; j++)
kbdev->devfreq_table[i].opp_volts[j] = opp_volts[j];
}
#endif
dev_info(kbdev->dev, "OPP %d : opp_freq=%llu core_mask=%llx\n", i, opp_freq,
core_mask);
i++;
}
kbdev->num_opps = i;
return 0;
#endif /* CONFIG_OF */
}
static const char *kbase_devfreq_req_type_name(enum kbase_devfreq_work_type type)
{
const char *p;
switch (type) {
case DEVFREQ_WORK_NONE:
p = "devfreq_none";
break;
case DEVFREQ_WORK_SUSPEND:
p = "devfreq_suspend";
break;
case DEVFREQ_WORK_RESUME:
p = "devfreq_resume";
break;
default:
p = "Unknown devfreq_type";
}
return p;
}
static void kbase_devfreq_suspend_resume_worker(struct work_struct *work)
{
struct kbase_devfreq_queue_info *info =
container_of(work, struct kbase_devfreq_queue_info, work);
struct kbase_device *kbdev = container_of(info, struct kbase_device, devfreq_queue);
unsigned long flags;
enum kbase_devfreq_work_type type, acted_type;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
type = kbdev->devfreq_queue.req_type;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
acted_type = kbdev->devfreq_queue.acted_type;
dev_dbg(kbdev->dev, "Worker handles queued req: %s (acted: %s)\n",
kbase_devfreq_req_type_name(type), kbase_devfreq_req_type_name(acted_type));
switch (type) {
case DEVFREQ_WORK_SUSPEND:
case DEVFREQ_WORK_RESUME:
if (type != acted_type) {
if (type == DEVFREQ_WORK_RESUME)
devfreq_resume_device(kbdev->devfreq);
else
devfreq_suspend_device(kbdev->devfreq);
dev_dbg(kbdev->dev, "Devfreq transition occurred: %s => %s\n",
kbase_devfreq_req_type_name(acted_type),
kbase_devfreq_req_type_name(type));
kbdev->devfreq_queue.acted_type = type;
}
break;
default:
WARN_ON(1);
}
}
void kbase_devfreq_enqueue_work(struct kbase_device *kbdev, enum kbase_devfreq_work_type work_type)
{
unsigned long flags;
WARN_ON(work_type == DEVFREQ_WORK_NONE);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
/* Skip enqueuing work if the workqueue has already been terminated. */
if (likely(kbdev->devfreq_queue.workq)) {
kbdev->devfreq_queue.req_type = work_type;
queue_work(kbdev->devfreq_queue.workq, &kbdev->devfreq_queue.work);
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
dev_dbg(kbdev->dev, "Enqueuing devfreq req: %s\n", kbase_devfreq_req_type_name(work_type));
}
static int kbase_devfreq_work_init(struct kbase_device *kbdev)
{
kbdev->devfreq_queue.req_type = DEVFREQ_WORK_NONE;
kbdev->devfreq_queue.acted_type = DEVFREQ_WORK_RESUME;
kbdev->devfreq_queue.workq = alloc_ordered_workqueue("devfreq_workq", 0);
if (!kbdev->devfreq_queue.workq)
return -ENOMEM;
INIT_WORK(&kbdev->devfreq_queue.work, kbase_devfreq_suspend_resume_worker);
return 0;
}
static void kbase_devfreq_work_term(struct kbase_device *kbdev)
{
unsigned long flags;
struct workqueue_struct *workq;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
workq = kbdev->devfreq_queue.workq;
kbdev->devfreq_queue.workq = NULL;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
destroy_workqueue(workq);
}
int kbase_devfreq_init(struct kbase_device *kbdev)
{
struct devfreq_dev_profile *dp;
int err;
unsigned int i;
bool free_devfreq_freq_table = true;
if (kbdev->nr_clocks == 0) {
dev_err(kbdev->dev, "Clock not available for devfreq\n");
return -ENODEV;
}
for (i = 0; i < kbdev->nr_clocks; i++) {
if (kbdev->clocks[i])
kbdev->current_freqs[i] = clk_get_rate(kbdev->clocks[i]);
}
kbdev->current_nominal_freq = kbdev->current_freqs[0];
dp = &kbdev->devfreq_profile;
dp->initial_freq = kbdev->current_freqs[0];
dp->polling_ms = 100;
dp->target = kbase_devfreq_target;
dp->get_dev_status = kbase_devfreq_status;
dp->get_cur_freq = kbase_devfreq_cur_freq;
dp->exit = kbase_devfreq_exit;
if (kbase_devfreq_init_freq_table(kbdev, dp))
return -EFAULT;
if (dp->max_state > 0) {
/* Record the maximum frequency possible */
kbdev->gpu_props.gpu_freq_khz_max = dp->freq_table[0] / 1000;
}
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
err = kbase_ipa_init(kbdev);
if (err) {
dev_err(kbdev->dev, "IPA initialization failed");
goto ipa_init_failed;
}
#endif
err = kbase_devfreq_init_core_mask_table(kbdev);
if (err)
goto init_core_mask_table_failed;
kbdev->devfreq = devfreq_add_device(kbdev->dev, dp, "simple_ondemand", NULL);
if (IS_ERR(kbdev->devfreq)) {
err = PTR_ERR(kbdev->devfreq);
kbdev->devfreq = NULL;
dev_err(kbdev->dev, "Failed to add devfreq device (%d)\n", err);
goto devfreq_add_dev_failed;
}
/* Explicit free of freq table isn't needed after devfreq_add_device() */
free_devfreq_freq_table = false;
/* Initialize devfreq suspend/resume workqueue */
err = kbase_devfreq_work_init(kbdev);
if (err) {
dev_err(kbdev->dev, "Failed to init devfreq workqueue\n");
goto devfreq_work_init_failed;
}
/* devfreq_add_device only copies a few of kbdev->dev's fields, so
* set drvdata explicitly so IPA models can access kbdev.
*/
dev_set_drvdata(&kbdev->devfreq->dev, kbdev);
err = devfreq_register_opp_notifier(kbdev->dev, kbdev->devfreq);
if (err) {
dev_err(kbdev->dev, "Failed to register OPP notifier (%d)", err);
goto opp_notifier_failed;
}
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
kbdev->devfreq_cooling = of_devfreq_cooling_register_power(
kbdev->dev->of_node, kbdev->devfreq, &kbase_ipa_power_model_ops);
if (IS_ERR_OR_NULL(kbdev->devfreq_cooling)) {
err = PTR_ERR_OR_ZERO(kbdev->devfreq_cooling);
dev_err(kbdev->dev, "Failed to register cooling device (%d)", err);
err = err == 0 ? -ENODEV : err;
goto cooling_reg_failed;
}
#endif
return 0;
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
cooling_reg_failed:
devfreq_unregister_opp_notifier(kbdev->dev, kbdev->devfreq);
#endif /* CONFIG_DEVFREQ_THERMAL */
opp_notifier_failed:
kbase_devfreq_work_term(kbdev);
devfreq_work_init_failed:
if (devfreq_remove_device(kbdev->devfreq))
dev_err(kbdev->dev, "Failed to terminate devfreq (%d)", err);
kbdev->devfreq = NULL;
devfreq_add_dev_failed:
kbase_devfreq_term_core_mask_table(kbdev);
init_core_mask_table_failed:
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
kbase_ipa_term(kbdev);
ipa_init_failed:
#endif
if (free_devfreq_freq_table)
kbase_devfreq_term_freq_table(kbdev);
return err;
}
void kbase_devfreq_term(struct kbase_device *kbdev)
{
int err;
dev_dbg(kbdev->dev, "Term Mali devfreq\n");
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
if (kbdev->devfreq_cooling)
devfreq_cooling_unregister(kbdev->devfreq_cooling);
#endif
devfreq_unregister_opp_notifier(kbdev->dev, kbdev->devfreq);
kbase_devfreq_work_term(kbdev);
err = devfreq_remove_device(kbdev->devfreq);
if (err)
dev_err(kbdev->dev, "Failed to terminate devfreq (%d)\n", err);
else
kbdev->devfreq = NULL;
kbase_devfreq_term_core_mask_table(kbdev);
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
kbase_ipa_term(kbdev);
#endif
}
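Reviewer sketch of the ordering rule described in the comment inside `kbase_devfreq_target()`: raise the voltage before raising the frequency, and lower the frequency before lowering the voltage. This is a user-space model only. `struct fake_rail` and every `fake_*` function are hypothetical names introduced here; they stand in for one regulator/clock pair and do not call the regulator or clk APIs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of one regulator/clock pair; not a kernel structure. */
struct fake_rail {
	unsigned long freq_hz;
	unsigned long volt_uv;
	char log[16]; /* order of operations: 'V' = voltage, 'F' = frequency */
	size_t n;
};

/* Append one operation to the log, keeping it NUL-terminated. */
static void fake_log(struct fake_rail *r, char op)
{
	if (r->n < sizeof(r->log) - 1) {
		r->log[r->n++] = op;
		r->log[r->n] = '\0';
	}
}

static void fake_set_voltage(struct fake_rail *r, unsigned long uv)
{
	r->volt_uv = uv;
	fake_log(r, 'V');
}

static void fake_set_clock(struct fake_rail *r, unsigned long hz)
{
	r->freq_hz = hz;
	fake_log(r, 'F');
}

/* Apply a new operating point using the same ordering as the driver comment. */
static void fake_rail_transition(struct fake_rail *r, unsigned long freq_hz,
				 unsigned long volt_uv)
{
	const unsigned long original_freq = r->freq_hz;

	/* Going up: the voltage must be in place before the clock speeds up. */
	if (r->volt_uv != volt_uv && original_freq < freq_hz)
		fake_set_voltage(r, volt_uv);

	fake_set_clock(r, freq_hz);

	/* Going down: only drop the voltage once the clock has slowed. */
	if (r->volt_uv != volt_uv && original_freq > freq_hz)
		fake_set_voltage(r, volt_uv);
}

/* Self-check: an upward step logs "VF", a following downward step adds "FV". */
static void fake_rail_selfcheck(void)
{
	struct fake_rail r = { .freq_hz = 400000000UL, .volt_uv = 800000UL };

	fake_rail_transition(&r, 800000000UL, 950000UL);
	assert(strcmp(r.log, "VF") == 0);
	fake_rail_transition(&r, 200000000UL, 750000UL);
	assert(strcmp(r.log, "VFFV") == 0);
}
```

Comparing against the frequency captured before the clock change mirrors the driver's use of `original_freqs[]`, since the current frequency has already been overwritten by the time the voltage is lowered.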

@@ -0,0 +1,69 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _BASE_DEVFREQ_H_
#define _BASE_DEVFREQ_H_
/**
* kbase_devfreq_init - Initialize kbase device for DevFreq.
* @kbdev: Device pointer
*
* This function must be called only when a kbase device is initialized.
*
* Return: 0 on success.
*/
int kbase_devfreq_init(struct kbase_device *kbdev);
void kbase_devfreq_term(struct kbase_device *kbdev);
/**
* kbase_devfreq_force_freq - Set GPU frequency on L2 power on/off.
* @kbdev: Device pointer
* @freq: GPU frequency in Hz to be set when
* MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE is enabled
*/
void kbase_devfreq_force_freq(struct kbase_device *kbdev, unsigned long freq);
/**
* kbase_devfreq_enqueue_work - Enqueue a work item for suspend/resume devfreq.
* @kbdev: Device pointer
* @work_type: The type of the devfreq work item, i.e. suspend or resume
*/
void kbase_devfreq_enqueue_work(struct kbase_device *kbdev, enum kbase_devfreq_work_type work_type);
/**
* kbase_devfreq_opp_translate - Translate nominal OPP frequency from devicetree
* into real frequency & voltage pair, along with
* core mask
* @kbdev: Device pointer
* @freq: Nominal frequency
* @core_mask: Pointer to u64 to store core mask to
* @freqs: Pointer to array of frequencies
* @volts: Pointer to array of voltages
*
* This function will only perform translation if an operating-points-v2-mali
* table is present in devicetree. If one is not present then it will return an
* untranslated frequency (and corresponding voltage) and all cores enabled.
* The voltages returned are in microvolts (uV).
*/
void kbase_devfreq_opp_translate(struct kbase_device *kbdev, unsigned long freq, u64 *core_mask,
unsigned long *freqs, unsigned long *volts);
#endif /* _BASE_DEVFREQ_H_ */
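Reviewer sketch of the translation contract documented for `kbase_devfreq_opp_translate()`: a matching table entry yields its real frequencies, voltages and core mask, while a miss yields the untranslated frequency, a fallback voltage and all cores enabled. This is a user-space model under stated assumptions. `struct fake_opp`, `struct fake_dev`, `FAKE_MAX_CLOCKS` and `fake_opp_translate()` are hypothetical names introduced here, and `fallback_uv` stands in for the driver's `get_voltage()` lookup.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_MAX_CLOCKS 2 /* hypothetical; stands in for BASE_MAX_NR_CLOCKS_REGULATORS */

/* Hypothetical table entry, loosely modelled on struct kbase_devfreq_opp. */
struct fake_opp {
	uint64_t opp_freq;
	uint64_t core_mask;
	uint64_t real_freqs[FAKE_MAX_CLOCKS];
	uint32_t opp_volts[FAKE_MAX_CLOCKS];
};

/* Hypothetical device state holding only what the lookup needs. */
struct fake_dev {
	const struct fake_opp *table;
	unsigned int num_opps;
	unsigned int nr_clocks;
	uint64_t shader_present;
	unsigned long fallback_uv; /* stands in for get_voltage() */
};

static void fake_opp_translate(const struct fake_dev *d, unsigned long freq,
			       uint64_t *core_mask, unsigned long *freqs,
			       unsigned long *volts)
{
	unsigned int i, j;

	for (i = 0; i < d->num_opps; i++) {
		if (d->table[i].opp_freq != freq)
			continue;
		/* Match: hand back the translated values for every clock. */
		*core_mask = d->table[i].core_mask;
		for (j = 0; j < d->nr_clocks; j++) {
			freqs[j] = (unsigned long)d->table[i].real_freqs[j];
			volts[j] = d->table[i].opp_volts[j];
		}
		return;
	}

	/* No match: nominal frequency, fallback voltage, all cores enabled. */
	*core_mask = d->shader_present;
	for (j = 0; j < d->nr_clocks; j++) {
		freqs[j] = freq;
		volts[j] = d->fallback_uv;
	}
}

/* Self-check covering one hit and one miss. */
static void fake_translate_selfcheck(void)
{
	static const struct fake_opp table[] = {
		{ 400000000ULL, 0x3, { 399000000ULL, 0 }, { 800000, 0 } },
	};
	const struct fake_dev d = { table, 1, 1, 0xF, 900000UL };
	uint64_t mask = 0;
	unsigned long freqs[FAKE_MAX_CLOCKS] = { 0 };
	unsigned long volts[FAKE_MAX_CLOCKS] = { 0 };

	fake_opp_translate(&d, 400000000UL, &mask, freqs, volts);
	assert(mask == 0x3 && freqs[0] == 399000000UL && volts[0] == 800000UL);

	fake_opp_translate(&d, 500000000UL, &mask, freqs, volts);
	assert(mask == 0xF && freqs[0] == 500000000UL && volts[0] == 900000UL);
}
```

Returning early on a hit replaces the driver's `break` plus `i == num_opps` test; the behaviour is the same, and the sketch uses it only to keep the two outcomes visually separate.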

@@ -0,0 +1,137 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel property query backend APIs
*/
#include <mali_kbase.h>
#include <device/mali_kbase_device.h>
#include <mali_kbase_hwaccess_gpuprops.h>
#include <mali_kbase_gpuprops_private_types.h>
int kbase_backend_gpuprops_get(struct kbase_device *kbdev, struct kbasep_gpuprops_regdump *regdump)
{
uint i;
/* regdump is zero initialized; individual entries do not need to be explicitly set */
regdump->gpu_id = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(GPU_ID));
regdump->shader_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(SHADER_PRESENT));
regdump->tiler_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(TILER_PRESENT));
regdump->l2_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(L2_PRESENT));
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(AS_PRESENT)))
regdump->as_present = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AS_PRESENT));
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(STACK_PRESENT)))
regdump->stack_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(STACK_PRESENT));
#if !MALI_USE_CSF
regdump->js_present = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(JS_PRESENT));
/* Not a valid register on TMIX */
/* TGOx specific register */
if (kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_THREAD_TLS_ALLOC))
regdump->thread_tls_alloc =
kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_TLS_ALLOC));
#endif /* !MALI_USE_CSF */
regdump->thread_max_threads = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_THREADS));
regdump->thread_max_workgroup_size =
kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_WORKGROUP_SIZE));
regdump->thread_max_barrier_size =
kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_BARRIER_SIZE));
regdump->thread_features = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_FEATURES));
/* Feature Registers */
/* AMBA_FEATURES enum is mapped to COHERENCY_FEATURES enum */
regdump->coherency_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(COHERENCY_FEATURES));
if (kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_CORE_FEATURES))
regdump->core_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(CORE_FEATURES));
#if MALI_USE_CSF
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(GPU_FEATURES)))
regdump->gpu_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(GPU_FEATURES));
#endif /* MALI_USE_CSF */
regdump->tiler_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(TILER_FEATURES));
regdump->l2_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(L2_FEATURES));
regdump->mem_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(MEM_FEATURES));
regdump->mmu_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(MMU_FEATURES));
#if !MALI_USE_CSF
for (i = 0; i < GPU_MAX_JOB_SLOTS; i++)
regdump->js_features[i] = kbase_reg_read32(kbdev, GPU_JS_FEATURES_OFFSET(i));
#endif /* !MALI_USE_CSF */
#if MALI_USE_CSF
#endif /* MALI_USE_CSF */
{
for (i = 0; i < BASE_GPU_NUM_TEXTURE_FEATURES_REGISTERS; i++)
regdump->texture_features[i] =
kbase_reg_read32(kbdev, GPU_TEXTURE_FEATURES_OFFSET(i));
}
if (kbase_is_gpu_removed(kbdev))
return -EIO;
return 0;
}
int kbase_backend_gpuprops_get_curr_config(struct kbase_device *kbdev,
struct kbase_current_config_regdump *curr_config_regdump)
{
if (WARN_ON(!kbdev) || WARN_ON(!curr_config_regdump))
return -EINVAL;
curr_config_regdump->mem_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(MEM_FEATURES));
curr_config_regdump->l2_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(L2_FEATURES));
curr_config_regdump->shader_present =
kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(SHADER_PRESENT));
curr_config_regdump->l2_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(L2_PRESENT));
if (kbase_is_gpu_removed(kbdev))
return -EIO;
return 0;
}
int kbase_backend_gpuprops_get_l2_features(struct kbase_device *kbdev,
struct kbasep_gpuprops_regdump *regdump)
{
if (kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_L2_CONFIG)) {
regdump->l2_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(L2_FEATURES));
regdump->l2_config = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(L2_CONFIG));
#if MALI_USE_CSF
if (kbase_hw_has_l2_slice_hash_feature(kbdev)) {
uint i;
for (i = 0; i < GPU_L2_SLICE_HASH_COUNT; i++)
regdump->l2_slice_hash[i] =
kbase_reg_read32(kbdev, GPU_L2_SLICE_HASH_OFFSET(i));
}
#endif /* MALI_USE_CSF */
if (kbase_is_gpu_removed(kbdev))
return -EIO;
}
return 0;
}

@@ -0,0 +1,459 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* GPU backend instrumentation APIs.
*/
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase_hwaccess_instr.h>
#include <device/mali_kbase_device.h>
#include <backend/gpu/mali_kbase_instr_internal.h>
static int wait_prfcnt_ready(struct kbase_device *kbdev)
{
u32 val;
const u32 timeout_us =
kbase_get_timeout_ms(kbdev, KBASE_PRFCNT_ACTIVE_TIMEOUT) * USEC_PER_MSEC;
const int err = kbase_reg_poll32_timeout(kbdev, GPU_CONTROL_ENUM(GPU_STATUS), val,
!(val & GPU_STATUS_PRFCNT_ACTIVE), 0, timeout_us,
false);
if (err) {
dev_err(kbdev->dev, "PRFCNT_ACTIVE bit stuck\n");
return -EBUSY;
}
return 0;
}
int kbase_instr_hwcnt_enable_internal(struct kbase_device *kbdev, struct kbase_context *kctx,
struct kbase_instr_hwcnt_enable *enable)
{
unsigned long flags;
int err = -EINVAL;
u32 irq_mask;
u32 prfcnt_config;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* alignment failure */
if ((enable->dump_buffer == 0ULL) || (enable->dump_buffer & (2048 - 1)))
return err;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_DISABLED) {
/* Instrumentation is already enabled */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
if (kbase_is_gpu_removed(kbdev)) {
/* GPU has been removed by Arbiter */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
/* Enable interrupt */
irq_mask = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK));
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK),
irq_mask | PRFCNT_SAMPLE_COMPLETED);
/* In use, this context is the owner */
kbdev->hwcnt.kctx = kctx;
/* Remember the dump address so we can reprogram it later */
kbdev->hwcnt.addr = enable->dump_buffer;
kbdev->hwcnt.addr_bytes = enable->dump_buffer_bytes;
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
/* Configure */
prfcnt_config = (u32)kctx->as_nr << PRFCNT_CONFIG_AS_SHIFT;
#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
prfcnt_config |= (u32)kbdev->hwcnt.backend.override_counter_set
<< PRFCNT_CONFIG_SETSELECT_SHIFT;
#else
prfcnt_config |= (u32)enable->counter_set << PRFCNT_CONFIG_SETSELECT_SHIFT;
#endif
/* Wait until prfcnt config register can be written */
err = wait_prfcnt_ready(kbdev);
if (err)
return err;
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_CONFIG),
prfcnt_config | PRFCNT_CONFIG_MODE_OFF);
/* Wait until prfcnt is disabled before writing configuration registers */
err = wait_prfcnt_ready(kbdev);
if (err)
return err;
kbase_reg_write64(kbdev, GPU_CONTROL_ENUM(PRFCNT_BASE), enable->dump_buffer);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_JM_EN), enable->fe_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_SHADER_EN), enable->shader_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_MMU_L2_EN), enable->mmu_l2_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_TILER_EN), enable->tiler_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_CONFIG),
prfcnt_config | PRFCNT_CONFIG_MODE_MANUAL);
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
dev_dbg(kbdev->dev, "HW counters dumping set-up for context %pK", kctx);
return 0;
}
static void kbasep_instr_hwc_disable_hw_prfcnt(struct kbase_device *kbdev)
{
u32 irq_mask;
lockdep_assert_held(&kbdev->hwaccess_lock);
lockdep_assert_held(&kbdev->hwcnt.lock);
if (kbase_is_gpu_removed(kbdev))
/* GPU has been removed by Arbiter */
return;
/* Disable interrupt */
irq_mask = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK));
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK),
irq_mask & ~PRFCNT_SAMPLE_COMPLETED);
/* Wait until prfcnt config register can be written, then disable the counters.
* Return value is ignored as we are disabling anyway.
*/
wait_prfcnt_ready(kbdev);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_CONFIG), 0);
kbdev->hwcnt.kctx = NULL;
kbdev->hwcnt.addr = 0ULL;
kbdev->hwcnt.addr_bytes = 0ULL;
}
int kbase_instr_hwcnt_disable_internal(struct kbase_context *kctx)
{
unsigned long flags, pm_flags;
struct kbase_device *kbdev = kctx->kbdev;
while (1) {
spin_lock_irqsave(&kbdev->hwaccess_lock, pm_flags);
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR) {
/* Instrumentation is in unrecoverable error state,
* there is nothing for us to do.
*/
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
/* Already disabled, return no error. */
return 0;
}
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_DISABLED) {
/* Instrumentation is not enabled */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
return -EINVAL;
}
if (kbdev->hwcnt.kctx != kctx) {
/* Instrumentation has been setup for another context */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
return -EINVAL;
}
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_IDLE)
break;
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
/* Ongoing dump/setup - wait for its completion */
wait_event(kbdev->hwcnt.backend.wait, kbdev->hwcnt.backend.triggered != 0);
}
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;
kbdev->hwcnt.backend.triggered = 0;
kbasep_instr_hwc_disable_hw_prfcnt(kbdev);
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
dev_dbg(kbdev->dev, "HW counters dumping disabled for context %pK", kctx);
return 0;
}
int kbase_instr_hwcnt_request_dump(struct kbase_context *kctx)
{
unsigned long flags;
int err = -EINVAL;
struct kbase_device *kbdev = kctx->kbdev;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.kctx != kctx) {
/* The instrumentation has been setup for another context */
goto unlock;
}
if (kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_IDLE) {
/* HW counters are disabled or another dump is ongoing, or we're
* resetting, or we are in unrecoverable error state.
*/
goto unlock;
}
if (kbase_is_gpu_removed(kbdev)) {
/* GPU has been removed by Arbiter */
goto unlock;
}
kbdev->hwcnt.backend.triggered = 0;
/* Mark that we're dumping - the PF handler can signal that we faulted
*/
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DUMPING;
/* Wait until prfcnt is ready to request dump */
err = wait_prfcnt_ready(kbdev);
if (err)
goto unlock;
/* Reconfigure the dump address */
kbase_reg_write64(kbdev, GPU_CONTROL_ENUM(PRFCNT_BASE), kbdev->hwcnt.addr);
/* Start dumping */
KBASE_KTRACE_ADD(kbdev, CORE_GPU_PRFCNT_SAMPLE, NULL, kbdev->hwcnt.addr);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_COMMAND), GPU_COMMAND_PRFCNT_SAMPLE);
dev_dbg(kbdev->dev, "HW counters dumping done for context %pK", kctx);
unlock:
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_request_dump);
bool kbase_instr_hwcnt_dump_complete(struct kbase_context *kctx, bool *const success)
{
unsigned long flags;
bool complete = false;
struct kbase_device *kbdev = kctx->kbdev;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_IDLE) {
*success = true;
complete = true;
} else if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_FAULT) {
*success = false;
complete = true;
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
}
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return complete;
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_dump_complete);
void kbase_instr_hwcnt_sample_done(struct kbase_device *kbdev)
{
unsigned long flags;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* If the state is unrecoverable error, the waiter has already been woken up
* and no action is needed when the sample is done.
*/
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_FAULT) {
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
} else if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_DUMPING) {
/* All finished and idle */
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
}
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
}
int kbase_instr_hwcnt_wait_for_dump(struct kbase_context *kctx)
{
struct kbase_device *kbdev = kctx->kbdev;
unsigned long flags;
int err;
/* Wait for dump & cache clean to complete */
wait_event(kbdev->hwcnt.backend.wait, kbdev->hwcnt.backend.triggered != 0);
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_FAULT) {
err = -EINVAL;
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
} else if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR) {
err = -EIO;
} else {
/* Dump done */
KBASE_DEBUG_ASSERT(kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_IDLE);
err = 0;
}
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
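/*
* Illustrative sketch only (not part of the driver): a caller whose context
* owns the counters might pair kbase_instr_hwcnt_request_dump() with
* kbase_instr_hwcnt_wait_for_dump() as below. The helper name
* example_dump_once() is hypothetical.
*
*   static int example_dump_once(struct kbase_context *kctx)
*   {
*           int err = kbase_instr_hwcnt_request_dump(kctx);
*
*           if (err)
*                   return err;
*           return kbase_instr_hwcnt_wait_for_dump(kctx);
*   }
*
* As implemented above, the wait returns 0 once the state is back to IDLE,
* -EINVAL if the dump faulted and -EIO in the unrecoverable error state.
* Since wait_event() may sleep, such a helper only makes sense in process
* context.
*/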
int kbase_instr_hwcnt_clear(struct kbase_context *kctx)
{
unsigned long flags;
int err = -EINVAL;
struct kbase_device *kbdev = kctx->kbdev;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* Check it's the context previously set up and that we're in the IDLE
* state.
*/
if (kbdev->hwcnt.kctx != kctx || kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_IDLE)
goto unlock;
if (kbase_is_gpu_removed(kbdev)) {
/* GPU has been removed by Arbiter */
goto unlock;
}
/* Wait until prfcnt is ready to clear */
err = wait_prfcnt_ready(kbdev);
if (err)
goto unlock;
/* Clear the counters */
KBASE_KTRACE_ADD(kbdev, CORE_GPU_PRFCNT_CLEAR, NULL, 0);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_COMMAND), GPU_COMMAND_PRFCNT_CLEAR);
unlock:
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_clear);
void kbase_instr_hwcnt_on_unrecoverable_error(struct kbase_device *kbdev)
{
unsigned long flags;
lockdep_assert_held(&kbdev->hwaccess_lock);
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* If we are already in the unrecoverable error state, return early. */
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR) {
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return;
}
/* Need to disable HW if it's not disabled yet. This check must precede the
* state update below; after the update it would be trivially true.
*/
if (kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_DISABLED)
kbasep_instr_hwc_disable_hw_prfcnt(kbdev);
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_UNRECOVERABLE_ERROR;
/* Wake up any waiters. */
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_on_unrecoverable_error);
void kbase_instr_hwcnt_on_before_reset(struct kbase_device *kbdev)
{
unsigned long flags;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* A reset is the only way to exit the unrecoverable error state */
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR)
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_on_before_reset);
int kbase_instr_backend_init(struct kbase_device *kbdev)
{
spin_lock_init(&kbdev->hwcnt.lock);
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;
init_waitqueue_head(&kbdev->hwcnt.backend.wait);
#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
/* Use the build time option for the override default. */
#if defined(CONFIG_MALI_PRFCNT_SET_SECONDARY)
kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_PHYSICAL_SET_SECONDARY;
#elif defined(CONFIG_MALI_PRFCNT_SET_TERTIARY)
kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_PHYSICAL_SET_TERTIARY;
#else
/* Default to primary */
kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_PHYSICAL_SET_PRIMARY;
#endif
#endif
return 0;
}
void kbase_instr_backend_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
void kbase_instr_backend_debugfs_init(struct kbase_device *kbdev)
{
/* No validation is done on the debugfs input. Invalid input could cause
* performance counter errors. This is acceptable since this is a debug
* only feature and users should know what they are doing.
*
* Valid inputs are the values accepted by the SET_SELECT bits of the
* PRFCNT_CONFIG register as defined in the architecture specification.
*/
debugfs_create_u8("hwcnt_set_select", 0644, kbdev->mali_debugfs_directory,
(u8 *)&kbdev->hwcnt.backend.override_counter_set);
}
#endif

@@ -0,0 +1,60 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014, 2016, 2018-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific instrumentation definitions
*/
#ifndef _KBASE_INSTR_DEFS_H_
#define _KBASE_INSTR_DEFS_H_
#include <hwcnt/mali_kbase_hwcnt_gpu.h>
/*
* Instrumentation State Machine States
*/
enum kbase_instr_state {
/* State where instrumentation is not active */
KBASE_INSTR_STATE_DISABLED = 0,
/* State machine is active and ready for a command. */
KBASE_INSTR_STATE_IDLE,
/* Hardware is currently dumping a frame. */
KBASE_INSTR_STATE_DUMPING,
/* An error has occurred during DUMPING (page fault). */
KBASE_INSTR_STATE_FAULT,
/* An unrecoverable error has occurred. A reset is the only way to exit
* from the unrecoverable error state.
*/
KBASE_INSTR_STATE_UNRECOVERABLE_ERROR,
};
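/*
* Illustrative summary only: the transitions below are a sketch of how the
* instrumentation backend functions in this commit appear to move between
* states. It is not an authoritative specification.
*
*   DISABLED            -> IDLE      kbase_instr_hwcnt_enable_internal()
*   IDLE                -> DUMPING   kbase_instr_hwcnt_request_dump()
*   DUMPING             -> IDLE      kbase_instr_hwcnt_sample_done()
*   FAULT               -> IDLE      kbase_instr_hwcnt_dump_complete(),
*                                    kbase_instr_hwcnt_wait_for_dump()
*   IDLE                -> DISABLED  kbase_instr_hwcnt_disable_internal()
*   UNRECOVERABLE_ERROR -> DISABLED  kbase_instr_hwcnt_on_before_reset()
*   any other state     -> UNRECOVERABLE_ERROR
*                                    kbase_instr_hwcnt_on_unrecoverable_error()
*
* The DUMPING -> FAULT transition is assumed to be made by the page fault
* handling code, as a comment in kbase_instr_hwcnt_request_dump() suggests.
*/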
/* Structure used for instrumentation and HW counters dumping */
struct kbase_instr_backend {
wait_queue_head_t wait;
int triggered;
#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
enum kbase_hwcnt_physical_set override_counter_set;
#endif
enum kbase_instr_state state;
};
#endif /* _KBASE_INSTR_DEFS_H_ */

@@ -0,0 +1,41 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014, 2018, 2020-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific HW access instrumentation APIs
*/
#ifndef _KBASE_INSTR_INTERNAL_H_
#define _KBASE_INSTR_INTERNAL_H_
/**
* kbasep_cache_clean_worker() - Workqueue for handling cache cleaning
* @data: a &struct work_struct
*/
void kbasep_cache_clean_worker(struct work_struct *data);
/**
* kbase_instr_hwcnt_sample_done() - Dump complete interrupt received
* @kbdev: Kbase device
*/
void kbase_instr_hwcnt_sample_done(struct kbase_device *kbdev);
#endif /* _KBASE_INSTR_INTERNAL_H_ */

@@ -0,0 +1,109 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend specific IRQ APIs
*/
#ifndef _KBASE_IRQ_INTERNAL_H_
#define _KBASE_IRQ_INTERNAL_H_
/* GPU IRQ Tags */
#define JOB_IRQ_TAG 0
#define MMU_IRQ_TAG 1
#define GPU_IRQ_TAG 2
/**
* kbase_install_interrupts - Install IRQ handlers.
*
* @kbdev: The kbase device
*
* This function must be called only once, when a kbase device is initialized.
*
* Return: 0 on success. Error code (negative) on failure.
*/
int kbase_install_interrupts(struct kbase_device *kbdev);
/**
* kbase_release_interrupts - Uninstall IRQ handlers.
*
* @kbdev: The kbase device
*
* This function needs to be called when a kbase device is terminated.
*/
void kbase_release_interrupts(struct kbase_device *kbdev);
/**
* kbase_synchronize_irqs - Ensure that all IRQ handlers have completed
* execution
* @kbdev: The kbase device
*/
void kbase_synchronize_irqs(struct kbase_device *kbdev);
#ifdef CONFIG_MALI_DEBUG
#if IS_ENABLED(CONFIG_MALI_REAL_HW)
/**
* kbase_validate_interrupts - Validate interrupts
*
* @kbdev: The kbase device
*
* This function will be called once when a kbase device is initialized
* to check whether interrupt handlers are configured appropriately.
* If interrupt numbers and/or flags defined in the device tree are
* incorrect, then the validation might fail.
* The whole device initialization will fail if it returns an error code.
*
* Return: 0 on success. Error code (negative) on failure.
*/
int kbase_validate_interrupts(struct kbase_device *const kbdev);
#endif /* CONFIG_MALI_REAL_HW */
#endif /* CONFIG_MALI_DEBUG */
/**
* kbase_get_interrupt_handler - Return default interrupt handler
* @kbdev: Kbase device
* @irq_tag: Tag to choose the handler
*
* If a single interrupt line is used, the combined interrupt handler
* will be returned regardless of irq_tag. Otherwise the corresponding
* interrupt handler will be returned.
*
* Return: Interrupt handler corresponding to the tag. NULL on failure.
*/
irq_handler_t kbase_get_interrupt_handler(struct kbase_device *kbdev, u32 irq_tag);
/**
* kbase_set_custom_irq_handler - Set a custom IRQ handler
*
* @kbdev: The kbase device for which the handler is to be registered
* @custom_handler: Handler to be registered
* @irq_tag: Interrupt tag
*
* Register the given interrupt handler for the requested interrupt tag.
* If no IRQ handler is specified, the default handler is registered.
*
* Return: 0 on success, error code otherwise
*/
int kbase_set_custom_irq_handler(struct kbase_device *kbdev, irq_handler_t custom_handler,
u32 irq_tag);
#endif /* _KBASE_IRQ_INTERNAL_H_ */

@@ -0,0 +1,498 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <device/mali_kbase_device.h>
#include <backend/gpu/mali_kbase_irq_internal.h>
#include <linux/interrupt.h>
#if IS_ENABLED(CONFIG_MALI_REAL_HW)
static void *kbase_tag(void *ptr, u32 tag)
{
return (void *)(((uintptr_t)ptr) | tag);
}
#endif
static void *kbase_untag(void *ptr)
{
return (void *)(((uintptr_t)ptr) & ~(uintptr_t)3);
}
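/*
* Illustrative sketch only (not part of the driver): the IRQ tag travels in
* the two low bits of the dev_id cookie handed to request_irq(), which assumes
* the kbase_device pointer is at least 4-byte aligned. Presumably this gives
* each request_irq()/free_irq() pair a distinct dev_id. Where kbase_tag() is
* available (CONFIG_MALI_REAL_HW), a round trip would look like:
*
*   void *cookie = kbase_tag(kbdev, MMU_IRQ_TAG);
*   struct kbase_device *dev = kbase_untag(cookie);
*
* after which dev equals kbdev again. The tag itself could be read back as
* ((uintptr_t)cookie & 3).
*/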
static irqreturn_t kbase_job_irq_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbdev->pm.backend.gpu_powered) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
val = kbase_reg_read32(kbdev, JOB_CONTROL_ENUM(JOB_IRQ_STATUS));
if (!val) {
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
#if MALI_USE_CSF
/* call the csf interrupt handler */
kbase_csf_interrupt(kbdev, val);
#else
kbase_job_done(kbdev, val);
#endif
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_HANDLED;
}
static irqreturn_t kbase_mmu_irq_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbdev->pm.backend.gpu_powered) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
atomic_inc(&kbdev->faults_pending);
val = kbase_reg_read32(kbdev, MMU_CONTROL_ENUM(IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (!val) {
atomic_dec(&kbdev->faults_pending);
return IRQ_NONE;
}
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
kbase_mmu_interrupt(kbdev, val);
atomic_dec(&kbdev->faults_pending);
return IRQ_HANDLED;
}
static irqreturn_t kbase_gpuonly_irq_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 gpu_irq_status;
irqreturn_t irq_state = IRQ_NONE;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbdev->pm.backend.gpu_powered) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
gpu_irq_status = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (gpu_irq_status) {
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, gpu_irq_status);
kbase_gpu_interrupt(kbdev, gpu_irq_status);
irq_state = IRQ_HANDLED;
}
return irq_state;
}
/**
* kbase_gpu_irq_handler - GPU interrupt handler
* @irq: IRQ number
* @data: Data associated with this IRQ (i.e. kbdev)
*
* Return: IRQ_HANDLED if any interrupt request has been successfully handled.
* IRQ_NONE otherwise.
*/
static irqreturn_t kbase_gpu_irq_handler(int irq, void *data)
{
irqreturn_t irq_state = kbase_gpuonly_irq_handler(irq, data);
return irq_state;
}
/**
* kbase_combined_irq_handler - Combined interrupt handler for all interrupts
* @irq: IRQ number
* @data: Data associated with this IRQ (i.e. kbdev)
*
* This handler will be used for GPUs with a single interrupt line.
*
* Return: IRQ_HANDLED if any interrupt request has been successfully handled.
* IRQ_NONE otherwise.
*/
static irqreturn_t kbase_combined_irq_handler(int irq, void *data)
{
irqreturn_t irq_state = IRQ_NONE;
if (kbase_job_irq_handler(irq, data) == IRQ_HANDLED)
irq_state = IRQ_HANDLED;
if (kbase_mmu_irq_handler(irq, data) == IRQ_HANDLED)
irq_state = IRQ_HANDLED;
if (kbase_gpu_irq_handler(irq, data) == IRQ_HANDLED)
irq_state = IRQ_HANDLED;
return irq_state;
}
static irq_handler_t kbase_handler_table[] = {
[JOB_IRQ_TAG] = kbase_job_irq_handler,
[MMU_IRQ_TAG] = kbase_mmu_irq_handler,
[GPU_IRQ_TAG] = kbase_gpu_irq_handler,
};
irq_handler_t kbase_get_interrupt_handler(struct kbase_device *kbdev, u32 irq_tag)
{
if (kbdev->nr_irqs == 1)
return kbase_combined_irq_handler;
else if (irq_tag < ARRAY_SIZE(kbase_handler_table))
return kbase_handler_table[irq_tag];
else
return NULL;
}
#if IS_ENABLED(CONFIG_MALI_REAL_HW)
#ifdef CONFIG_MALI_DEBUG
int kbase_set_custom_irq_handler(struct kbase_device *kbdev, irq_handler_t custom_handler,
u32 irq_tag)
{
int result = 0;
irq_handler_t handler = custom_handler;
const int irq = (kbdev->nr_irqs == 1) ? 0 : irq_tag;
if (unlikely(!((irq_tag >= JOB_IRQ_TAG) && (irq_tag <= GPU_IRQ_TAG)))) {
dev_err(kbdev->dev, "Invalid irq_tag (%u)\n", irq_tag);
return -EINVAL;
}
/* Release previous handler */
if (kbdev->irqs[irq].irq)
free_irq(kbdev->irqs[irq].irq, kbase_tag(kbdev, irq));
/* If a custom handler isn't provided use the default handler */
if (!handler)
handler = kbase_get_interrupt_handler(kbdev, irq_tag);
if (request_irq(kbdev->irqs[irq].irq, handler,
kbdev->irqs[irq].flags | ((kbdev->nr_irqs == 1) ? 0 : IRQF_SHARED),
dev_name(kbdev->dev), kbase_tag(kbdev, irq)) != 0) {
result = -EINVAL;
dev_err(kbdev->dev, "Can't request interrupt %u (index %u)\n", kbdev->irqs[irq].irq,
irq_tag);
if (IS_ENABLED(CONFIG_SPARSE_IRQ))
dev_err(kbdev->dev,
"CONFIG_SPARSE_IRQ enabled - is the interrupt number correct for this config?\n");
}
return result;
}
KBASE_EXPORT_TEST_API(kbase_set_custom_irq_handler);
/* test correct interrupt assignment and reception by cpu */
struct kbasep_irq_test {
struct hrtimer timer;
wait_queue_head_t wait;
int triggered;
u32 timeout;
};
static struct kbasep_irq_test kbasep_irq_test_data;
#define IRQ_TEST_TIMEOUT 500
static irqreturn_t kbase_job_irq_test_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbdev->pm.backend.gpu_powered) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
val = kbase_reg_read32(kbdev, JOB_CONTROL_ENUM(JOB_IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (!val)
return IRQ_NONE;
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
kbasep_irq_test_data.triggered = 1;
wake_up(&kbasep_irq_test_data.wait);
kbase_reg_write32(kbdev, JOB_CONTROL_ENUM(JOB_IRQ_CLEAR), val);
return IRQ_HANDLED;
}
static irqreturn_t kbase_mmu_irq_test_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbdev->pm.backend.gpu_powered) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
val = kbase_reg_read32(kbdev, MMU_CONTROL_ENUM(IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (!val)
return IRQ_NONE;
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
kbasep_irq_test_data.triggered = 1;
wake_up(&kbasep_irq_test_data.wait);
kbase_reg_write32(kbdev, MMU_CONTROL_ENUM(IRQ_CLEAR), val);
return IRQ_HANDLED;
}
static enum hrtimer_restart kbasep_test_interrupt_timeout(struct hrtimer *timer)
{
struct kbasep_irq_test *test_data = container_of(timer, struct kbasep_irq_test, timer);
test_data->timeout = 1;
test_data->triggered = 1;
wake_up(&test_data->wait);
return HRTIMER_NORESTART;
}
/**
* validate_interrupt - Validate an interrupt
* @kbdev: Kbase device
* @tag: Tag to choose the interrupt
*
* To validate the settings for the interrupt, write a value to the RAWSTAT
* register to trigger the interrupt. Then, using a custom interrupt handler,
* check whether the interrupt arrives within a reasonable time.
*
* Return: 0 if validating interrupt succeeds.
*/
static int validate_interrupt(struct kbase_device *const kbdev, u32 tag)
{
int err = 0;
irq_handler_t handler;
const int irq = (kbdev->nr_irqs == 1) ? 0 : tag;
u32 old_mask_val;
u16 mask_offset;
u16 rawstat_offset;
switch (tag) {
case JOB_IRQ_TAG:
handler = kbase_job_irq_test_handler;
rawstat_offset = JOB_CONTROL_ENUM(JOB_IRQ_RAWSTAT);
mask_offset = JOB_CONTROL_ENUM(JOB_IRQ_MASK);
break;
case MMU_IRQ_TAG:
handler = kbase_mmu_irq_test_handler;
rawstat_offset = MMU_CONTROL_ENUM(IRQ_RAWSTAT);
mask_offset = MMU_CONTROL_ENUM(IRQ_MASK);
break;
case GPU_IRQ_TAG:
/* already tested by pm_driver - bail out */
return 0;
default:
dev_err(kbdev->dev, "Invalid tag (%u)\n", tag);
return -EINVAL;
}
/* store old mask */
old_mask_val = kbase_reg_read32(kbdev, mask_offset);
/* mask interrupts */
kbase_reg_write32(kbdev, mask_offset, 0x0);
if (kbdev->irqs[irq].irq) {
/* release original handler and install test handler */
if (kbase_set_custom_irq_handler(kbdev, handler, tag) != 0) {
err = -EINVAL;
} else {
kbasep_irq_test_data.timeout = 0;
hrtimer_init(&kbasep_irq_test_data.timer, CLOCK_MONOTONIC,
HRTIMER_MODE_REL);
kbasep_irq_test_data.timer.function = kbasep_test_interrupt_timeout;
/* trigger interrupt */
kbase_reg_write32(kbdev, mask_offset, 0x1);
kbase_reg_write32(kbdev, rawstat_offset, 0x1);
hrtimer_start(&kbasep_irq_test_data.timer,
HR_TIMER_DELAY_MSEC(IRQ_TEST_TIMEOUT), HRTIMER_MODE_REL);
wait_event(kbasep_irq_test_data.wait, kbasep_irq_test_data.triggered != 0);
if (kbasep_irq_test_data.timeout != 0) {
dev_err(kbdev->dev, "Interrupt %u (index %u) didn't reach CPU.\n",
kbdev->irqs[irq].irq, irq);
err = -EINVAL;
} else {
dev_dbg(kbdev->dev, "Interrupt %u (index %u) reached CPU.\n",
kbdev->irqs[irq].irq, irq);
}
hrtimer_cancel(&kbasep_irq_test_data.timer);
kbasep_irq_test_data.triggered = 0;
/* mask interrupts */
kbase_reg_write32(kbdev, mask_offset, 0x0);
/* release test handler */
free_irq(kbdev->irqs[irq].irq, kbase_tag(kbdev, irq));
}
/* restore original interrupt */
if (request_irq(kbdev->irqs[irq].irq, kbase_get_interrupt_handler(kbdev, tag),
kbdev->irqs[irq].flags | ((kbdev->nr_irqs == 1) ? 0 : IRQF_SHARED),
dev_name(kbdev->dev), kbase_tag(kbdev, irq))) {
dev_err(kbdev->dev, "Can't restore original interrupt %u (index %u)\n",
kbdev->irqs[irq].irq, tag);
err = -EINVAL;
}
}
/* restore old mask */
kbase_reg_write32(kbdev, mask_offset, old_mask_val);
return err;
}
#if IS_ENABLED(CONFIG_MALI_REAL_HW)
int kbase_validate_interrupts(struct kbase_device *const kbdev)
{
int err;
init_waitqueue_head(&kbasep_irq_test_data.wait);
kbasep_irq_test_data.triggered = 0;
/* A suspend won't happen during startup/insmod */
kbase_pm_context_active(kbdev);
err = validate_interrupt(kbdev, JOB_IRQ_TAG);
if (err) {
dev_err(kbdev->dev,
"Interrupt JOB_IRQ didn't reach CPU. Check interrupt assignments.\n");
goto out;
}
err = validate_interrupt(kbdev, MMU_IRQ_TAG);
if (err) {
dev_err(kbdev->dev,
"Interrupt MMU_IRQ didn't reach CPU. Check interrupt assignments.\n");
goto out;
}
dev_dbg(kbdev->dev, "Interrupts are correctly assigned.\n");
out:
kbase_pm_context_idle(kbdev);
return err;
}
#endif /* CONFIG_MALI_REAL_HW */
#endif /* CONFIG_MALI_DEBUG */
int kbase_install_interrupts(struct kbase_device *kbdev)
{
u32 i;
for (i = 0; i < kbdev->nr_irqs; i++) {
const int result = request_irq(
kbdev->irqs[i].irq, kbase_get_interrupt_handler(kbdev, i),
kbdev->irqs[i].flags | ((kbdev->nr_irqs == 1) ? 0 : IRQF_SHARED),
dev_name(kbdev->dev), kbase_tag(kbdev, i));
if (result) {
dev_err(kbdev->dev, "Can't request interrupt %u (index %u)\n",
kbdev->irqs[i].irq, i);
goto release;
}
}
return 0;
release:
if (IS_ENABLED(CONFIG_SPARSE_IRQ))
dev_err(kbdev->dev,
"CONFIG_SPARSE_IRQ enabled - is the interrupt number correct for this config?\n");
while (i-- > 0)
free_irq(kbdev->irqs[i].irq, kbase_tag(kbdev, i));
return -EINVAL;
}
void kbase_release_interrupts(struct kbase_device *kbdev)
{
u32 i;
for (i = 0; i < kbdev->nr_irqs; i++) {
if (kbdev->irqs[i].irq)
free_irq(kbdev->irqs[i].irq, kbase_tag(kbdev, i));
}
}
void kbase_synchronize_irqs(struct kbase_device *kbdev)
{
u32 i;
for (i = 0; i < kbdev->nr_irqs; i++) {
if (kbdev->irqs[i].irq)
synchronize_irq(kbdev->irqs[i].irq);
}
}
KBASE_EXPORT_TEST_API(kbase_synchronize_irqs);
#endif /* IS_ENABLED(CONFIG_MALI_REAL_HW) */

@@ -0,0 +1,237 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register backend context / address space management
*/
#include <mali_kbase.h>
#include <mali_kbase_hwaccess_jm.h>
#include <mali_kbase_ctx_sched.h>
/**
* assign_and_activate_kctx_addr_space - Assign an AS to a context
* @kbdev: Kbase device
* @kctx: Kbase context
* @current_as: Address Space to assign
*
* Assign an Address Space (AS) to a context, and add the context to the Policy.
*
* This includes:
* - setting up the global runpool_irq structure and the context on the AS,
* - activating the MMU on the AS,
* - allowing jobs to be submitted on the AS.
*
* Context:
* kbasep_js_kctx_info.jsctx_mutex held,
* kbasep_js_device_data.runpool_mutex held,
* AS transaction mutex held,
* Runpool IRQ lock held
*/
static void assign_and_activate_kctx_addr_space(struct kbase_device *kbdev,
struct kbase_context *kctx,
struct kbase_as *current_as)
{
struct kbasep_js_device_data *js_devdata = &kbdev->js_data;
CSTD_UNUSED(current_as);
lockdep_assert_held(&kctx->jctx.sched_info.ctx.jsctx_mutex);
lockdep_assert_held(&js_devdata->runpool_mutex);
lockdep_assert_held(&kbdev->hwaccess_lock);
#if !MALI_USE_CSF
/* Attribute handling */
kbasep_js_ctx_attr_runpool_retain_ctx(kbdev, kctx);
#endif
/* Allow it to run jobs */
kbasep_js_set_submit_allowed(js_devdata, kctx);
kbase_js_runpool_inc_context_count(kbdev, kctx);
}
bool kbase_backend_use_ctx_sched(struct kbase_device *kbdev, struct kbase_context *kctx,
unsigned int js)
{
int i;
if (kbdev->hwaccess.active_kctx[js] == kctx) {
/* Context is already active */
return true;
}
for (i = 0; i < kbdev->nr_hw_address_spaces; i++) {
if (kbdev->as_to_kctx[i] == kctx) {
/* Context already has ASID - mark as active */
return true;
}
}
/* Context does not have address space assigned */
return false;
}
void kbase_backend_release_ctx_irq(struct kbase_device *kbdev, struct kbase_context *kctx)
{
int as_nr = kctx->as_nr;
if (as_nr == KBASEP_AS_NR_INVALID) {
WARN(1, "Attempting to release context without ASID\n");
return;
}
lockdep_assert_held(&kbdev->hwaccess_lock);
if (atomic_read(&kctx->refcount) != 1) {
WARN(1, "Attempting to release active ASID\n");
return;
}
kbasep_js_clear_submit_allowed(&kbdev->js_data, kctx);
kbase_ctx_sched_release_ctx(kctx);
kbase_js_runpool_dec_context_count(kbdev, kctx);
}
void kbase_backend_release_ctx_noirq(struct kbase_device *kbdev, struct kbase_context *kctx)
{
CSTD_UNUSED(kbdev);
CSTD_UNUSED(kctx);
}
int kbase_backend_find_and_release_free_address_space(struct kbase_device *kbdev,
struct kbase_context *kctx)
{
struct kbasep_js_device_data *js_devdata;
struct kbasep_js_kctx_info *js_kctx_info;
unsigned long flags;
int i;
js_devdata = &kbdev->js_data;
js_kctx_info = &kctx->jctx.sched_info;
mutex_lock(&js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_devdata->runpool_mutex);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
for (i = 0; i < kbdev->nr_hw_address_spaces; i++) {
struct kbasep_js_kctx_info *as_js_kctx_info;
struct kbase_context *as_kctx;
as_kctx = kbdev->as_to_kctx[i];
as_js_kctx_info = &as_kctx->jctx.sched_info;
/* Don't release privileged or active contexts, or contexts with
* jobs running.
* Note that a context will have at least 1 reference (which
* was previously taken by kbasep_js_schedule_ctx()) until
* descheduled.
*/
if (as_kctx && !kbase_ctx_flag(as_kctx, KCTX_PRIVILEGED) &&
atomic_read(&as_kctx->refcount) == 1) {
if (!kbase_ctx_sched_inc_refcount_nolock(as_kctx)) {
WARN(1, "Failed to retain active context\n");
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
return KBASEP_AS_NR_INVALID;
}
kbasep_js_clear_submit_allowed(js_devdata, as_kctx);
/* Drop and retake locks to take the jsctx_mutex on the
* context we're about to release without violating lock
* ordering
*/
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
/* Release context from address space */
mutex_lock(&as_js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_devdata->runpool_mutex);
kbasep_js_runpool_release_ctx_nolock(kbdev, as_kctx);
if (!kbase_ctx_flag(as_kctx, KCTX_SCHEDULED)) {
kbasep_js_runpool_requeue_or_kill_ctx(kbdev, as_kctx, true);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&as_js_kctx_info->ctx.jsctx_mutex);
return i;
}
/* Context was retained while locks were dropped,
* continue looking for free AS
*/
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&as_js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_devdata->runpool_mutex);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
}
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
return KBASEP_AS_NR_INVALID;
}
bool kbase_backend_use_ctx(struct kbase_device *kbdev, struct kbase_context *kctx, int as_nr)
{
struct kbasep_js_device_data *js_devdata;
struct kbase_as *new_address_space = NULL;
int js;
js_devdata = &kbdev->js_data;
for (js = 0; js < BASE_JM_MAX_NR_SLOTS; js++) {
if (kbdev->hwaccess.active_kctx[js] == kctx) {
WARN(1, "Context is already scheduled in\n");
return false;
}
}
new_address_space = &kbdev->as[as_nr];
lockdep_assert_held(&js_devdata->runpool_mutex);
lockdep_assert_held(&kbdev->mmu_hw_mutex);
lockdep_assert_held(&kbdev->hwaccess_lock);
assign_and_activate_kctx_addr_space(kbdev, kctx, new_address_space);
if (kbase_ctx_flag(kctx, KCTX_PRIVILEGED)) {
/* We need to retain it to keep the corresponding address space
*/
kbase_ctx_sched_retain_ctx_refcount(kctx);
}
return true;
}
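
The eviction test in kbase_backend_find_and_release_free_address_space() can be modelled in isolation. The following is a minimal user-space sketch, not driver code: `struct model_ctx`, `AS_NR_INVALID` and `find_releasable_as()` are hypothetical stand-ins for the kbase types, and all locking (including the drop-and-retake step) is deliberately left out. It only shows which address space the scan would pick: the first one whose context is present, not privileged, and holds exactly one reference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for KBASEP_AS_NR_INVALID. */
#define AS_NR_INVALID (-1)

/* Hypothetical, simplified stand-in for struct kbase_context. */
struct model_ctx {
	bool privileged; /* models kbase_ctx_flag(kctx, KCTX_PRIVILEGED) */
	int refcount;    /* models atomic_read(&kctx->refcount) */
};

/*
 * Return the index of the first address space whose context could be
 * released, or AS_NR_INVALID if none qualifies. Empty slots (NULL),
 * privileged contexts and contexts with more than one reference are
 * skipped, mirroring the condition used in the driver loop.
 */
static int find_releasable_as(struct model_ctx *const *as_to_ctx, size_t nr_as)
{
	size_t i;

	for (i = 0; i < nr_as; i++) {
		const struct model_ctx *ctx = as_to_ctx[i];

		if (ctx && !ctx->privileged && ctx->refcount == 1)
			return (int)i;
	}

	return AS_NR_INVALID;
}
```

In the real function a matching context can still be retained by someone else while the locks are dropped, which is why the driver re-checks KCTX_SCHEDULED and keeps scanning; the sketch does not model that race.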

File diff suppressed because it is too large

@@ -0,0 +1,148 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2016, 2018-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Job Manager backend-specific low-level APIs.
*/
#ifndef _KBASE_JM_HWACCESS_H_
#define _KBASE_JM_HWACCESS_H_
#include <mali_kbase_hw.h>
#include <mali_kbase_debug.h>
#include <linux/atomic.h>
#include <backend/gpu/mali_kbase_jm_rb.h>
#include <device/mali_kbase_device.h>
/**
* kbase_job_done_slot() - Complete the head job on a particular job-slot
* @kbdev: Device pointer
* @s: Job slot
* @completion_code: Completion code of job reported by GPU
* @job_tail: Job tail address reported by GPU
* @end_timestamp: Timestamp of job completion
*/
void kbase_job_done_slot(struct kbase_device *kbdev, int s, u32 completion_code, u64 job_tail,
ktime_t *end_timestamp);
#if IS_ENABLED(CONFIG_GPU_TRACEPOINTS)
static inline char *kbasep_make_job_slot_string(unsigned int js, char *js_string, size_t js_size)
{
(void)scnprintf(js_string, js_size, "job_slot_%u", js);
return js_string;
}
#endif
/**
* kbase_job_hw_submit() - Submit a job to the GPU
* @kbdev: Device pointer
* @katom: Atom to submit
* @js: Job slot to submit on
*
* The caller must check kbasep_jm_is_submit_slots_free() != false before
* calling this.
*
* The following locking conditions are made on the caller:
* - it must hold the hwaccess_lock
*
* Return: 0 if the job was successfully submitted to hardware, an error otherwise.
*/
int kbase_job_hw_submit(struct kbase_device *kbdev, struct kbase_jd_atom *katom, unsigned int js);
#if !MALI_USE_CSF
/**
* kbasep_job_slot_soft_or_hard_stop_do_action() - Perform a soft or hard stop
* on the specified atom
* @kbdev: Device pointer
* @js: Job slot to stop on
* @action: The action to perform, either JS_COMMAND_HARD_STOP or
* JS_COMMAND_SOFT_STOP
* @core_reqs: Core requirements of atom to stop
* @target_katom: Atom to stop
*
* The following locking conditions are made on the caller:
* - it must hold the hwaccess_lock
*/
void kbasep_job_slot_soft_or_hard_stop_do_action(struct kbase_device *kbdev, unsigned int js,
u32 action, base_jd_core_req core_reqs,
struct kbase_jd_atom *target_katom);
#endif /* !MALI_USE_CSF */
/**
* kbase_backend_soft_hard_stop_slot() - Soft or hard stop jobs on a given job
* slot belonging to a given context.
* @kbdev: Device pointer
* @kctx: Context pointer. May be NULL
* @katom: Specific atom to stop. May be NULL
* @js: Job slot to soft or hard stop
* @action: The action to perform, either JS_COMMAND_HARD_STOP or
* JS_COMMAND_SOFT_STOP
*
* If no context is provided then all jobs on the slot will be soft or hard
* stopped.
*
* If a katom is provided then only that specific atom will be stopped. In this
* case the kctx parameter is ignored.
*
* Jobs that are on the slot but are not yet on the GPU will be unpulled and
* returned to the job scheduler.
*
* Return: true if an atom was stopped, false otherwise
*/
bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev, struct kbase_context *kctx,
unsigned int js, struct kbase_jd_atom *katom, u32 action);
/**
* kbase_job_slot_init - Initialise job slot framework
* @kbdev: Device pointer
*
* Called on driver initialisation
*
* Return: 0 on success
*/
int kbase_job_slot_init(struct kbase_device *kbdev);
/**
* kbase_job_slot_halt - Halt the job slot framework
* @kbdev: Device pointer
*
* Should prevent any further job slot processing
*/
void kbase_job_slot_halt(struct kbase_device *kbdev);
/**
* kbase_job_slot_term - Terminate job slot framework
* @kbdev: Device pointer
*
* Called on driver termination
*/
void kbase_job_slot_term(struct kbase_device *kbdev);
/**
* kbase_gpu_cache_clean - Cause a GPU cache clean & flush
* @kbdev: Device pointer
*
* Caller must not be in IRQ context
*/
void kbase_gpu_cache_clean(struct kbase_device *kbdev);
#endif /* _KBASE_JM_HWACCESS_H_ */

File diff suppressed because it is too large

@@ -0,0 +1,77 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2018, 2020-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific APIs
*/
#ifndef _KBASE_HWACCESS_GPU_H_
#define _KBASE_HWACCESS_GPU_H_
#include <backend/gpu/mali_kbase_pm_internal.h>
/**
* kbase_gpu_irq_evict - Evict an atom from a NEXT slot
*
* @kbdev: Device pointer
* @js: Job slot to evict from
* @completion_code: Event code from job that was run.
*
* Evict the atom in the NEXT slot for the specified job slot. This function is
* called from the job complete IRQ handler when the previous job has failed.
*
* Return: true if job evicted from NEXT registers, false otherwise
*/
bool kbase_gpu_irq_evict(struct kbase_device *kbdev, unsigned int js, u32 completion_code);
/**
* kbase_gpu_complete_hw - Complete an atom on job slot js
*
* @kbdev: Device pointer
* @js: Job slot that has completed
* @completion_code: Event code from job that has completed
* @job_tail: The tail address from the hardware if the job has partially
* completed
* @end_timestamp: Time of completion
*/
void kbase_gpu_complete_hw(struct kbase_device *kbdev, unsigned int js, u32 completion_code,
u64 job_tail, ktime_t *end_timestamp);
/**
* kbase_gpu_inspect - Inspect the contents of the HW access ringbuffer
*
* @kbdev: Device pointer
* @js: Job slot to inspect
* @idx: Index into ringbuffer. 0 is the job currently running on
* the slot, 1 is the job waiting, all other values are invalid.
* Return: The atom at that position in the ringbuffer
* or NULL if no atom present
*/
struct kbase_jd_atom *kbase_gpu_inspect(struct kbase_device *kbdev, unsigned int js, int idx);
/**
* kbase_gpu_dump_slots - Print the contents of the slot ringbuffers
*
* @kbdev: Device pointer
*/
void kbase_gpu_dump_slots(struct kbase_device *kbdev);
#endif /* _KBASE_HWACCESS_GPU_H_ */

@@ -0,0 +1,362 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific job scheduler APIs
*/
#include <mali_kbase.h>
#include <mali_kbase_hwaccess_jm.h>
#include <mali_kbase_reset_gpu.h>
#include <backend/gpu/mali_kbase_jm_internal.h>
#include <backend/gpu/mali_kbase_js_internal.h>
#if IS_ENABLED(CONFIG_MALI_TRACE_POWER_GPU_WORK_PERIOD)
#include <mali_kbase_gpu_metrics.h>
#endif
/*
* Hold the runpool_mutex for this
*/
static inline bool timer_callback_should_run(struct kbase_device *kbdev, int nr_running_ctxs)
{
lockdep_assert_held(&kbdev->js_data.runpool_mutex);
#ifdef CONFIG_MALI_DEBUG
if (kbdev->js_data.softstop_always) {
/* Debug support for allowing soft-stop on a single context */
return true;
}
#endif /* CONFIG_MALI_DEBUG */
if (kbase_hw_has_issue(kbdev, BASE_HW_ISSUE_9435)) {
/* Timeouts would have to be 4x longer (due to micro-
* architectural design) to support OpenCL conformance tests, so
* only run the timer when there's:
* - 2 or more CL contexts
* - 1 or more GLES contexts
*
* NOTE: A context that has both Compute and Non-Compute jobs
* will be treated as an OpenCL context (hence, we don't check
* KBASEP_JS_CTX_ATTR_NON_COMPUTE).
*/
{
int nr_compute_ctxs = kbasep_js_ctx_attr_count_on_runpool(
kbdev, KBASEP_JS_CTX_ATTR_COMPUTE);
int nr_noncompute_ctxs = nr_running_ctxs - nr_compute_ctxs;
return (bool)(nr_compute_ctxs >= 2 || nr_noncompute_ctxs > 0);
}
} else {
/* Run the timer callback whenever you have at least 1 context
*/
return (bool)(nr_running_ctxs > 0);
}
}
static enum hrtimer_restart timer_callback(struct hrtimer *timer)
{
unsigned long flags;
struct kbase_device *kbdev;
struct kbasep_js_device_data *js_devdata;
struct kbase_backend_data *backend;
unsigned int s;
bool reset_needed = false;
KBASE_DEBUG_ASSERT(timer != NULL);
backend = container_of(timer, struct kbase_backend_data, scheduling_timer);
kbdev = container_of(backend, struct kbase_device, hwaccess.backend);
js_devdata = &kbdev->js_data;
/* Loop through the slots */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
for (s = 0; s < kbdev->gpu_props.num_job_slots; s++) {
struct kbase_jd_atom *atom = NULL;
if (kbase_backend_nr_atoms_on_slot(kbdev, s) > 0) {
atom = kbase_gpu_inspect(kbdev, s, 0);
KBASE_DEBUG_ASSERT(atom != NULL);
}
if (atom != NULL) {
/* The current version of the model doesn't support
* Soft-Stop
*/
if (!kbase_hw_has_issue(kbdev, BASE_HW_ISSUE_5736)) {
u32 ticks = atom->ticks++;
#if !defined(CONFIG_MALI_JOB_DUMP) && !defined(CONFIG_MALI_VECTOR_DUMP)
u32 soft_stop_ticks, hard_stop_ticks, gpu_reset_ticks;
if (atom->core_req & BASE_JD_REQ_ONLY_COMPUTE) {
soft_stop_ticks = js_devdata->soft_stop_ticks_cl;
hard_stop_ticks = js_devdata->hard_stop_ticks_cl;
gpu_reset_ticks = js_devdata->gpu_reset_ticks_cl;
} else {
soft_stop_ticks = js_devdata->soft_stop_ticks;
hard_stop_ticks = js_devdata->hard_stop_ticks_ss;
gpu_reset_ticks = js_devdata->gpu_reset_ticks_ss;
}
/* If timeouts have been changed then ensure
* that atom tick count is not greater than the
* new soft_stop timeout. This ensures that
* atoms do not miss any of the timeouts due to
* races between this worker and the thread
* changing the timeouts.
*/
if (backend->timeouts_updated && ticks > soft_stop_ticks)
ticks = atom->ticks = soft_stop_ticks;
/* Job is Soft-Stoppable */
if (ticks == soft_stop_ticks) {
/* Job has been scheduled for at least
* js_devdata->soft_stop_ticks ticks.
* Soft stop the slot so we can run
* other jobs.
*/
#if !KBASE_DISABLE_SCHEDULING_SOFT_STOPS
int disjoint_threshold =
KBASE_DISJOINT_STATE_INTERLEAVED_CONTEXT_COUNT_THRESHOLD;
u32 softstop_flags = 0u;
dev_dbg(kbdev->dev, "Soft-stop");
/* nr_user_contexts_running is updated
* with the runpool_mutex, but we can't
* take that here.
*
* However, if it's about to be
* increased then the new context can't
* run any jobs until they take the
* hwaccess_lock, so it's OK to observe
* the older value.
*
* Similarly, if it's about to be
* decreased, the last job from another
* context has already finished, so
* it's not too bad that we observe the
* older value and register a disjoint
* event when we try soft-stopping
*/
if (js_devdata->nr_user_contexts_running >=
disjoint_threshold)
softstop_flags |= JS_COMMAND_SW_CAUSES_DISJOINT;
kbase_job_slot_softstop_swflags(kbdev, s, atom,
softstop_flags);
#endif
} else if (ticks == hard_stop_ticks) {
/* Job has been scheduled for at least
* js_devdata->hard_stop_ticks_ss ticks.
* It should have been soft-stopped by
* now. Hard stop the slot.
*/
#if !KBASE_DISABLE_SCHEDULING_HARD_STOPS
u32 ms = js_devdata->scheduling_period_ns / 1000000u;
dev_warn(
kbdev->dev,
"JS: Job Hard-Stopped (took more than %u ticks at %u ms/tick)",
ticks, ms);
kbase_job_slot_hardstop(atom->kctx, s, atom);
#endif
} else if (ticks == gpu_reset_ticks) {
/* Job has been scheduled for at least
* js_devdata->gpu_reset_ticks_ss ticks.
* It should have left the GPU by now.
* Signal that the GPU needs to be
* reset.
*/
reset_needed = true;
}
#else /* !CONFIG_MALI_JOB_DUMP */
/* NOTE: During CONFIG_MALI_JOB_DUMP, we use
* the alternate timeouts, which makes the hard-
* stop and GPU reset timeout much longer. We
* also ensure that we don't soft-stop at all.
*/
if (ticks == js_devdata->soft_stop_ticks) {
/* Job has been scheduled for at least
* js_devdata->soft_stop_ticks. We do
* not soft-stop during
* CONFIG_MALI_JOB_DUMP, however.
*/
dev_dbg(kbdev->dev, "Soft-stop");
} else if (ticks == js_devdata->hard_stop_ticks_dumping) {
/* Job has been scheduled for at least
* js_devdata->hard_stop_ticks_dumping
* ticks. Hard stop the slot.
*/
#if !KBASE_DISABLE_SCHEDULING_HARD_STOPS
u32 ms = js_devdata->scheduling_period_ns / 1000000u;
dev_warn(
kbdev->dev,
"JS: Job Hard-Stopped (took more than %u ticks at %u ms/tick)",
ticks, ms);
kbase_job_slot_hardstop(atom->kctx, s, atom);
#endif
} else if (ticks == js_devdata->gpu_reset_ticks_dumping) {
/* Job has been scheduled for at least
* js_devdata->gpu_reset_ticks_dumping
* ticks. It should have left the GPU by
* now. Signal that the GPU needs to be
* reset.
*/
reset_needed = true;
}
#endif /* !CONFIG_MALI_JOB_DUMP */
}
}
}
if (reset_needed) {
dev_err(kbdev->dev,
"JS: Job has been on the GPU for too long (JS_RESET_TICKS_SS/DUMPING timeout hit). Issuing GPU soft-reset to resolve.");
if (kbase_prepare_to_reset_gpu_locked(kbdev, RESET_FLAGS_NONE))
kbase_reset_gpu_locked(kbdev);
}
/* The timer is re-issued if there are contexts in the run-pool */
if (backend->timer_running)
hrtimer_start(&backend->scheduling_timer,
HR_TIMER_DELAY_NSEC(js_devdata->scheduling_period_ns),
HRTIMER_MODE_REL);
backend->timeouts_updated = false;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return HRTIMER_NORESTART;
}
void kbase_backend_ctx_count_changed(struct kbase_device *kbdev)
{
struct kbasep_js_device_data *js_devdata = &kbdev->js_data;
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
unsigned long flags;
/* Timer must stop if we are suspending */
const bool suspend_timer = backend->suspend_timer;
const int nr_running_ctxs = atomic_read(&kbdev->js_data.nr_contexts_runnable);
lockdep_assert_held(&js_devdata->runpool_mutex);
if (suspend_timer || !timer_callback_should_run(kbdev, nr_running_ctxs)) {
/* Take spinlock to force synchronisation with timer */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
backend->timer_running = false;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
/* From now on, return value of timer_callback_should_run()
* will also cause the timer to not requeue itself. Its return
* value cannot change, because it depends on variables updated
* with the runpool_mutex held, which the caller of this must
* also hold
*/
hrtimer_cancel(&backend->scheduling_timer);
}
if (!suspend_timer && timer_callback_should_run(kbdev, nr_running_ctxs) &&
!backend->timer_running) {
/* Take spinlock to force synchronisation with timer */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
backend->timer_running = true;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
hrtimer_start(&backend->scheduling_timer,
HR_TIMER_DELAY_NSEC(js_devdata->scheduling_period_ns),
HRTIMER_MODE_REL);
KBASE_KTRACE_ADD_JM(kbdev, JS_POLICY_TIMER_START, NULL, NULL, 0u, 0u);
}
#if IS_ENABLED(CONFIG_MALI_TRACE_POWER_GPU_WORK_PERIOD)
if (unlikely(suspend_timer)) {
js_devdata->gpu_metrics_timer_needed = false;
/* Cancel the timer as System suspend is happening */
hrtimer_cancel(&js_devdata->gpu_metrics_timer);
js_devdata->gpu_metrics_timer_running = false;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
/* Explicitly emit the tracepoint on System suspend */
kbase_gpu_metrics_emit_tracepoint(kbdev, ktime_get_raw_ns());
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return;
}
if (!nr_running_ctxs) {
/* Just set the flag to not restart the timer on expiry */
js_devdata->gpu_metrics_timer_needed = false;
return;
}
/* There are runnable contexts so the timer is needed */
if (!js_devdata->gpu_metrics_timer_needed) {
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
js_devdata->gpu_metrics_timer_needed = true;
/* No need to restart the timer if it is already running. */
if (!js_devdata->gpu_metrics_timer_running) {
hrtimer_start(&js_devdata->gpu_metrics_timer,
HR_TIMER_DELAY_NSEC(kbase_gpu_metrics_get_tp_emit_interval()),
HRTIMER_MODE_REL);
js_devdata->gpu_metrics_timer_running = true;
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
#endif
}
int kbase_backend_timer_init(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
hrtimer_init(&backend->scheduling_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
backend->scheduling_timer.function = timer_callback;
backend->timer_running = false;
return 0;
}
void kbase_backend_timer_term(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
hrtimer_cancel(&backend->scheduling_timer);
}
void kbase_backend_timer_suspend(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
backend->suspend_timer = true;
kbase_backend_ctx_count_changed(kbdev);
}
void kbase_backend_timer_resume(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
backend->suspend_timer = false;
kbase_backend_ctx_count_changed(kbdev);
}
void kbase_backend_timeouts_changed(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
backend->timeouts_updated = true;
}
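
The per-atom escalation in timer_callback() (soft stop, then hard stop, then GPU reset) can be sketched as a small pure function. This is an illustrative user-space model under stated assumptions, not the driver's code: `tick_limits`, `tick_action` and `classify_tick()` are hypothetical names, the thresholds are passed in directly rather than read from js_devdata, and the dumping and KBASE_DISABLE_SCHEDULING_* build variants are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: what the timer would do for this tick. */
enum tick_action { TICK_NONE, TICK_SOFT_STOP, TICK_HARD_STOP, TICK_GPU_RESET };

/* Hypothetical container for the three thresholds, all in ticks. */
struct tick_limits {
	uint32_t soft_stop;
	uint32_t hard_stop;
	uint32_t gpu_reset;
};

/*
 * Model of one timer tick for one atom. The tick count observed is the
 * value before the increment, as with "atom->ticks++" in the callback.
 * If the timeouts were changed and the atom has already passed the new
 * soft-stop threshold, its count is pulled back to that threshold so
 * that no later threshold is skipped.
 */
static enum tick_action classify_tick(uint32_t *atom_ticks, const struct tick_limits *lim,
				      bool timeouts_updated)
{
	uint32_t ticks = (*atom_ticks)++;

	if (timeouts_updated && ticks > lim->soft_stop)
		ticks = *atom_ticks = lim->soft_stop;

	/* Each action fires only on the tick that equals its threshold. */
	if (ticks == lim->soft_stop)
		return TICK_SOFT_STOP;
	if (ticks == lim->hard_stop)
		return TICK_HARD_STOP;
	if (ticks == lim->gpu_reset)
		return TICK_GPU_RESET;

	return TICK_NONE;
}
```

Because the comparisons use equality rather than "greater or equal", each action is requested once per threshold crossing instead of on every subsequent tick; the clamp after a timeout change is what keeps an already long-running atom from jumping past all three thresholds.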

@@ -0,0 +1,72 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2015, 2020-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific job scheduler APIs
*/
#ifndef _KBASE_JS_BACKEND_H_
#define _KBASE_JS_BACKEND_H_
/**
* kbase_backend_timer_init() - Initialise the JS scheduling timer
* @kbdev: Device pointer
*
* This function should be called at driver initialisation
*
* Return: 0 on success
*/
int kbase_backend_timer_init(struct kbase_device *kbdev);
/**
* kbase_backend_timer_term() - Terminate the JS scheduling timer
* @kbdev: Device pointer
*
* This function should be called at driver termination
*/
void kbase_backend_timer_term(struct kbase_device *kbdev);
/**
* kbase_backend_timer_suspend - Suspend is happening, stop the JS scheduling
* timer
* @kbdev: Device pointer
*
* This function should be called on suspend, after the active count has reached
* zero. This is required as the timer may have been started on job submission
* to the job scheduler, but before jobs are submitted to the GPU.
*
* Caller must hold runpool_mutex.
*/
void kbase_backend_timer_suspend(struct kbase_device *kbdev);
/**
* kbase_backend_timer_resume - Resume is happening, re-evaluate the JS
* scheduling timer
* @kbdev: Device pointer
*
* This function should be called on resume. Note that it is not guaranteed to
* restart the timer, only to evaluate whether it should be restarted.
*
* Caller must hold runpool_mutex.
*/
void kbase_backend_timer_resume(struct kbase_device *kbdev);
#endif /* _KBASE_JS_BACKEND_H_ */
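
Suspend and resume both funnel into the same re-evaluation of whether the scheduling timer should be running. The following is a hedged user-space sketch of that decision, assuming the logic visible in the register-based backend: `timer_should_run()`, `timer_next_op()` and `enum timer_op` are hypothetical names, the hardware-issue check is reduced to a boolean parameter, and the CONFIG_MALI_DEBUG softstop_always override is omitted.

```c
#include <stdbool.h>

/*
 * Model of the "should the scheduling timer run" test.
 * With the hardware issue present, the timer is only useful when there
 * are at least two compute (CL) contexts or at least one non-compute
 * context; otherwise any running context is enough.
 */
static bool timer_should_run(bool has_issue_9435, int nr_running_ctxs, int nr_compute_ctxs)
{
	if (has_issue_9435) {
		int nr_noncompute_ctxs = nr_running_ctxs - nr_compute_ctxs;

		return nr_compute_ctxs >= 2 || nr_noncompute_ctxs > 0;
	}

	return nr_running_ctxs > 0;
}

/* Hypothetical summary of what the re-evaluation does to the timer. */
enum timer_op { TIMER_KEEP, TIMER_STOP, TIMER_START };

/*
 * Model of the re-evaluation performed on context-count changes,
 * suspend and resume: suspend always wins, and the timer is only
 * started when it is wanted and not already running.
 */
static enum timer_op timer_next_op(bool suspend_timer, bool should_run, bool timer_running)
{
	if (suspend_timer || !should_run)
		return TIMER_STOP;
	if (!timer_running)
		return TIMER_START;

	return TIMER_KEEP;
}
```

This is why resume is documented as a re-evaluation rather than a restart: with no runnable contexts, `timer_next_op(false, false, false)` still yields a stop, not a start.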

@@ -0,0 +1,120 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <linux/version_compat_defs.h>
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include <device/mali_kbase_device.h>
#include "mali_kbase_l2_mmu_config.h"
/**
* struct l2_mmu_config_limit_region - L2 MMU limit field
*
* @value: The default value to load into the L2_MMU_CONFIG register
* @mask: The shifted mask of the field in the L2_MMU_CONFIG register
* @shift: The shift of where the field starts in the L2_MMU_CONFIG register
* This should be the same value as the smaller of the two mask
* values
*/
struct l2_mmu_config_limit_region {
u32 value, mask, shift;
};
/**
* struct l2_mmu_config_limit - L2 MMU read and write limit
*
* @product_model: The GPU for which this entry applies
* @read: Values for the read limit field
* @write: Values for the write limit field
*/
struct l2_mmu_config_limit {
u32 product_model;
struct l2_mmu_config_limit_region read;
struct l2_mmu_config_limit_region write;
};
/*
* Zero represents no limit
*
* For LBEX TBEX TBAX TTRX and TNAX:
* The value represents the number of outstanding reads (6 bits) or writes (5 bits)
*
* For all other GPUs it is a fraction; see mali_kbase_config_defaults.h
*/
static const struct l2_mmu_config_limit limits[] = {
/* GPU, read, write */
{ GPU_ID_PRODUCT_LBEX, { 0, GENMASK(10, 5), 5 }, { 0, GENMASK(16, 12), 12 } },
{ GPU_ID_PRODUCT_TBEX, { 0, GENMASK(10, 5), 5 }, { 0, GENMASK(16, 12), 12 } },
{ GPU_ID_PRODUCT_TBAX, { 0, GENMASK(10, 5), 5 }, { 0, GENMASK(16, 12), 12 } },
{ GPU_ID_PRODUCT_TTRX, { 0, GENMASK(12, 7), 7 }, { 0, GENMASK(17, 13), 13 } },
{ GPU_ID_PRODUCT_TNAX, { 0, GENMASK(12, 7), 7 }, { 0, GENMASK(17, 13), 13 } },
{ GPU_ID_PRODUCT_TGOX,
{ KBASE_3BIT_AID_32, GENMASK(14, 12), 12 },
{ KBASE_3BIT_AID_32, GENMASK(17, 15), 15 } },
{ GPU_ID_PRODUCT_TNOX,
{ KBASE_3BIT_AID_32, GENMASK(14, 12), 12 },
{ KBASE_3BIT_AID_32, GENMASK(17, 15), 15 } },
};
int kbase_set_mmu_quirks(struct kbase_device *kbdev)
{
/* All older GPUs had 2 bits for both fields, this is a default */
struct l2_mmu_config_limit limit = { 0, /* Any GPU not in the limits array defined above */
{ KBASE_AID_32, GENMASK(25, 24), 24 },
{ KBASE_AID_32, GENMASK(27, 26), 26 } };
u32 product_model;
u32 mmu_config = 0;
unsigned int i;
product_model = kbdev->gpu_props.gpu_id.product_model;
/* Limit the GPU bus bandwidth if the platform needs this. */
for (i = 0; i < ARRAY_SIZE(limits); i++) {
if (product_model == limits[i].product_model) {
limit = limits[i];
break;
}
}
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(L2_MMU_CONFIG)))
mmu_config = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(L2_MMU_CONFIG));
if (kbase_is_gpu_removed(kbdev))
return -EIO;
mmu_config &= ~(limit.read.mask | limit.write.mask);
/* Can't use FIELD_PREP() macro here as the mask isn't constant */
mmu_config |= (limit.read.value << limit.read.shift) |
(limit.write.value << limit.write.shift);
kbdev->hw_quirks_mmu = mmu_config;
if (kbdev->system_coherency == COHERENCY_ACE) {
/* Allow memory configuration disparity to be ignored,
* we optimize the use of shared memory and thus we
* expect some disparity in the memory configuration.
*/
kbdev->hw_quirks_mmu |= L2_MMU_CONFIG_ALLOW_SNOOP_DISPARITY;
}
return 0;
}
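
The read/write limit update in kbase_set_mmu_quirks() is a plain clear-then-set of two bit fields. The sketch below reproduces that arithmetic in user space so it can be checked by hand. It makes several assumptions: `MODEL_GENMASK()` is a local 32-bit re-implementation of the kernel's GENMASK() (inclusive high and low bit), `struct limit_field` and `apply_limits()` are hypothetical names, and the field values used in any example are illustrative, not the driver's KBASE_AID_* constants.

```c
#include <stdint.h>

/*
 * Local stand-in for the kernel's GENMASK(h, l) on 32-bit values:
 * a mask with bits l..h (inclusive) set, for 0 <= l <= h <= 31.
 */
#define MODEL_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Hypothetical mirror of struct l2_mmu_config_limit_region. */
struct limit_field {
	uint32_t value; /* unshifted field value */
	uint32_t mask;  /* mask of the field, already shifted into place */
	uint32_t shift; /* bit position of the field's lowest bit */
};

/*
 * Clear both fields in the register image, then OR in the new values.
 * The shift is applied by hand because the masks are runtime values,
 * which is the reason the driver cannot use FIELD_PREP() here.
 */
static uint32_t apply_limits(uint32_t reg, const struct limit_field *rd,
			     const struct limit_field *wr)
{
	reg &= ~(rd->mask | wr->mask);
	reg |= (rd->value << rd->shift) | (wr->value << wr->shift);

	return reg;
}
```

For the default layout in the source (read field at bits 25:24, write field at bits 27:26), applying illustrative values 1 and 2 to an all-ones register image clears bits 27:24 and then sets bits 24 and 27, leaving every other bit untouched.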

@@ -0,0 +1,36 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_L2_MMU_CONFIG_H_
#define _KBASE_L2_MMU_CONFIG_H_
/**
* kbase_set_mmu_quirks - Set the hw_quirks_mmu field of kbdev
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Use this function to initialise the hw_quirks_mmu field, for instance to set
* the MAX_READS and MAX_WRITES to sane defaults for each GPU.
*
* Return: Zero for success or a Linux error code
*/
int kbase_set_mmu_quirks(struct kbase_device *kbdev);
#endif /* _KBASE_L2_MMU_CONFIG_H_ */

File diff suppressed because it is too large

@@ -0,0 +1,224 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Dummy Model interface
*
* Support for NO_MALI dummy Model interface.
*
* +-----------------------------------+
* | Kbase read/write/IRQ |
* +-----------------------------------+
* | Model Linux Framework |
* +-----------------------------------+
* | Model Dummy interface definitions |
* +-----------------+-----------------+
* | Fake R/W | Fake IRQ |
* +-----------------+-----------------+
*/
#ifndef _KBASE_MODEL_DUMMY_H_
#define _KBASE_MODEL_DUMMY_H_
#include <uapi/gpu/arm/midgard/backend/gpu/mali_kbase_model_linux.h>
#include <uapi/gpu/arm/midgard/backend/gpu/mali_kbase_model_dummy.h>
#define model_error_log(module, ...) pr_err(__VA_ARGS__)
#define NUM_SLOTS 4 /* number of job slots */
/* Error mask codes */
/* each bit of errors_mask is associated to a specific error:
* NON FAULT STATUS CODES: only the following are implemented since the others
* represent normal working statuses
*/
#define KBASE_JOB_INTERRUPTED (1 << 0)
#define KBASE_JOB_STOPPED (1 << 1)
#define KBASE_JOB_TERMINATED (1 << 2)
/* JOB EXCEPTIONS: */
#define KBASE_JOB_CONFIG_FAULT (1 << 3)
#define KBASE_JOB_POWER_FAULT (1 << 4)
#define KBASE_JOB_READ_FAULT (1 << 5)
#define KBASE_JOB_WRITE_FAULT (1 << 6)
#define KBASE_JOB_AFFINITY_FAULT (1 << 7)
#define KBASE_JOB_BUS_FAULT (1 << 8)
#define KBASE_INSTR_INVALID_PC (1 << 9)
#define KBASE_INSTR_INVALID_ENC (1 << 10)
#define KBASE_INSTR_TYPE_MISMATCH (1 << 11)
#define KBASE_INSTR_OPERAND_FAULT (1 << 12)
#define KBASE_INSTR_TLS_FAULT (1 << 13)
#define KBASE_INSTR_BARRIER_FAULT (1 << 14)
#define KBASE_INSTR_ALIGN_FAULT (1 << 15)
#define KBASE_DATA_INVALID_FAULT (1 << 16)
#define KBASE_TILE_RANGE_FAULT (1 << 17)
#define KBASE_ADDR_RANGE_FAULT (1 << 18)
#define KBASE_OUT_OF_MEMORY (1 << 19)
#define KBASE_UNKNOWN (1 << 20)
/* GPU EXCEPTIONS:*/
#define KBASE_DELAYED_BUS_FAULT (1 << 21)
#define KBASE_SHAREABILITY_FAULT (1 << 22)
/* MMU EXCEPTIONS:*/
#define KBASE_TRANSLATION_FAULT (1 << 23)
#define KBASE_PERMISSION_FAULT (1 << 24)
#define KBASE_TRANSTAB_BUS_FAULT (1 << 25)
#define KBASE_ACCESS_FLAG (1 << 26)
/* generic useful bitmasks */
#define IS_A_JOB_ERROR ((KBASE_UNKNOWN << 1) - KBASE_JOB_INTERRUPTED)
#define IS_A_MMU_ERROR ((KBASE_ACCESS_FLAG << 1) - KBASE_TRANSLATION_FAULT)
#define IS_A_GPU_ERROR (KBASE_DELAYED_BUS_FAULT | KBASE_SHAREABILITY_FAULT)
/* number of possible MMU address spaces */
#define NUM_MMU_AS \
16 /* total number of MMU address spaces as in
* MMU_IRQ_RAWSTAT register
*/
/* Forward declaration */
struct kbase_device;
/*
* The structure below describes the faulty HW condition to simulate
* for a specific job chain atom; it is passed to job_atom_inject_error().
*/
struct kbase_error_params {
u64 jc;
u32 errors_mask;
u32 mmu_table_level;
u16 faulty_mmu_as;
u16 padding[3];
};
enum kbase_model_control_command {
/* Disable/Enable job completion in the dummy model */
KBASE_MC_DISABLE_JOBS
};
/* struct to control dummy model behavior */
struct kbase_model_control_params {
s32 command;
s32 value;
};
/* struct to track faulty atoms */
struct kbase_error_atom {
struct kbase_error_params params;
struct kbase_error_atom *next;
};
/* struct to track the system error state */
struct error_status_t {
spinlock_t access_lock;
u32 errors_mask;
u32 mmu_table_level;
u32 faulty_mmu_as;
u64 current_jc;
u32 current_job_slot;
u32 job_irq_rawstat;
u32 job_irq_status;
u32 js_status[NUM_SLOTS];
u32 mmu_irq_mask;
u32 mmu_irq_rawstat;
u32 gpu_error_irq;
u32 gpu_fault_status;
u32 as_faultstatus[NUM_MMU_AS];
u32 as_command[NUM_MMU_AS];
u64 as_transtab[NUM_MMU_AS];
};
/**
* struct gpu_model_prfcnt_en - Performance counter enable masks
* @fe: Enable mask for front-end block
* @tiler: Enable mask for tiler block
* @l2: Enable mask for L2/Memory system blocks
* @shader: Enable mask for shader core blocks
*/
struct gpu_model_prfcnt_en {
u32 fe;
u32 tiler;
u32 l2;
u32 shader;
};
void midgard_set_error(u32 job_slot);
int job_atom_inject_error(struct kbase_error_params *params);
int gpu_model_control(void *h, struct kbase_model_control_params *params);
/**
* gpu_model_set_dummy_prfcnt_user_sample() - Set performance counter values
* @data: Userspace pointer to array of counter values
* @size: Size of counter value array
*
* Counter values set by this function will be used for one sample dump only
* after which counters will be cleared back to zero.
*
* Return: 0 on success, else error code.
*/
int gpu_model_set_dummy_prfcnt_user_sample(u32 __user *data, u32 size);
/**
* gpu_model_set_dummy_prfcnt_kernel_sample() - Set performance counter values
* @data: Pointer to array of counter values
* @size: Size of counter value array
*
* Counter values set by this function will be used for one sample dump only
* after which counters will be cleared back to zero.
*/
void gpu_model_set_dummy_prfcnt_kernel_sample(u64 *data, u32 size);
void gpu_model_get_dummy_prfcnt_cores(struct kbase_device *kbdev, u64 *l2_present,
u64 *shader_present);
void gpu_model_set_dummy_prfcnt_cores(struct kbase_device *kbdev, u64 l2_present,
u64 shader_present);
/* Clear the counter values array maintained by the dummy model */
void gpu_model_clear_prfcnt_values(void);
#if MALI_USE_CSF
/**
* gpu_model_prfcnt_dump_request() - Request performance counter sample dump.
* @sample_buf: Pointer to KBASE_DUMMY_MODEL_MAX_VALUES_PER_SAMPLE sized array
* in which to store dumped performance counter values.
* @enable_maps: Physical enable maps for performance counter blocks.
*/
void gpu_model_prfcnt_dump_request(uint32_t *sample_buf, struct gpu_model_prfcnt_en enable_maps);
/**
* gpu_model_glb_request_job_irq() - Trigger job interrupt with global request
* flag set.
* @model: Model pointer returned by midgard_model_create().
*/
void gpu_model_glb_request_job_irq(void *model);
#endif /* MALI_USE_CSF */
extern struct error_status_t hw_error_status;
#endif
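The three class masks defined in this header rely on the error bits being laid out contiguously per class. A minimal standalone sketch of that arithmetic follows. It copies only the boundary bit positions from the header, uses unsigned literals (`1u`) rather than the header's `1`, and adds a hypothetical helper, `classify_fault_bit()`, that is not part of the driver.

```c
#include <stdint.h>

/* Boundary bits copied from the header above (unsigned here by choice). */
#define KBASE_JOB_INTERRUPTED    (1u << 0)
#define KBASE_UNKNOWN            (1u << 20)
#define KBASE_DELAYED_BUS_FAULT  (1u << 21)
#define KBASE_SHAREABILITY_FAULT (1u << 22)
#define KBASE_TRANSLATION_FAULT  (1u << 23)
#define KBASE_ACCESS_FLAG        (1u << 26)

/* (highest << 1) - lowest sets every bit from lowest to highest inclusive. */
#define IS_A_JOB_ERROR ((KBASE_UNKNOWN << 1) - KBASE_JOB_INTERRUPTED)
#define IS_A_MMU_ERROR ((KBASE_ACCESS_FLAG << 1) - KBASE_TRANSLATION_FAULT)
#define IS_A_GPU_ERROR (KBASE_DELAYED_BUS_FAULT | KBASE_SHAREABILITY_FAULT)

/* Hypothetical classification, for illustration only. */
enum fault_class { FAULT_NONE, FAULT_JOB, FAULT_GPU, FAULT_MMU };

static enum fault_class classify_fault_bit(uint32_t bit)
{
	if (bit & IS_A_JOB_ERROR)
		return FAULT_JOB;
	if (bit & IS_A_GPU_ERROR)
		return FAULT_GPU;
	if (bit & IS_A_MMU_ERROR)
		return FAULT_MMU;
	return FAULT_NONE;
}
```

With these definitions the job mask covers bits 0 to 20, the GPU mask bits 21 and 22, and the MMU mask bits 23 to 26, so the three classes are disjoint and together span 27 bits.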


@@ -0,0 +1,172 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2015, 2018-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <linux/random.h>
#include "backend/gpu/mali_kbase_model_linux.h"
static struct kbase_error_atom *error_track_list;
#ifdef CONFIG_MALI_ERROR_INJECT_RANDOM
/** Kernel 6.1.0 has dropped prandom_u32(), use get_random_u32() */
#if (KERNEL_VERSION(6, 1, 0) <= LINUX_VERSION_CODE)
#define prandom_u32 get_random_u32
#endif
/* The following error probabilities are set quite high in order to stress the driver */
static unsigned int error_probability = 50; /* to be set between 0 and 100 */
/* probability of having multiple errors, given that there is an error */
static unsigned int multiple_error_probability = 50;
/* all the error conditions supported by the model */
#define TOTAL_FAULTS 27
/* maximum number of levels in the MMU translation table tree */
#define MAX_MMU_TABLE_LEVEL 4
/* worst case scenario is <1 MMU fault + 1 job fault + 1 GPU fault> */
#define MAX_CONCURRENT_FAULTS 3
/**
* gpu_generate_error - Generate GPU error
*/
static void gpu_generate_error(void)
{
unsigned int errors_num = 0;
/* is there at least one error? */
if ((prandom_u32() % 100) < error_probability) {
/* pick a faulty MMU address space */
hw_error_status.faulty_mmu_as = prandom_u32() % NUM_MMU_AS;
/* pick an MMU table level */
hw_error_status.mmu_table_level = 1 + (prandom_u32() % MAX_MMU_TABLE_LEVEL);
hw_error_status.errors_mask = (u32)(1 << (prandom_u32() % TOTAL_FAULTS));
/* are there one or more additional errors? */
if ((prandom_u32() % 100) < multiple_error_probability) {
errors_num = 1 + (prandom_u32() % (MAX_CONCURRENT_FAULTS - 1));
while (errors_num-- > 0) {
u32 temp_mask;
temp_mask = (u32)(1 << (prandom_u32() % TOTAL_FAULTS));
/* below we check that no bit of the same error
* type is set again in the error mask
*/
if ((temp_mask & IS_A_JOB_ERROR) &&
(hw_error_status.errors_mask & IS_A_JOB_ERROR)) {
errors_num++;
continue;
}
if ((temp_mask & IS_A_MMU_ERROR) &&
(hw_error_status.errors_mask & IS_A_MMU_ERROR)) {
errors_num++;
continue;
}
if ((temp_mask & IS_A_GPU_ERROR) &&
(hw_error_status.errors_mask & IS_A_GPU_ERROR)) {
errors_num++;
continue;
}
/* this error mask is already set */
if ((hw_error_status.errors_mask | temp_mask) ==
hw_error_status.errors_mask) {
errors_num++;
continue;
}
hw_error_status.errors_mask |= temp_mask;
}
}
}
}
#endif
int job_atom_inject_error(struct kbase_error_params *params)
{
struct kbase_error_atom *new_elem;
KBASE_DEBUG_ASSERT(params);
new_elem = kzalloc(sizeof(*new_elem), GFP_KERNEL);
if (!new_elem) {
model_error_log(KBASE_CORE,
"\njob_atom_inject_error: kzalloc failed for new_elem\n");
return -ENOMEM;
}
new_elem->params.jc = params->jc;
new_elem->params.errors_mask = params->errors_mask;
new_elem->params.mmu_table_level = params->mmu_table_level;
new_elem->params.faulty_mmu_as = params->faulty_mmu_as;
/*circular list below */
if (error_track_list == NULL) { /*no elements */
error_track_list = new_elem;
new_elem->next = error_track_list;
} else {
struct kbase_error_atom *walker = error_track_list;
while (walker->next != error_track_list)
walker = walker->next;
new_elem->next = error_track_list;
walker->next = new_elem;
}
return 0;
}
void midgard_set_error(u32 job_slot)
{
#ifdef CONFIG_MALI_ERROR_INJECT_RANDOM
gpu_generate_error();
#else
struct kbase_error_atom *walker, *auxiliar;
if (error_track_list != NULL) {
walker = error_track_list->next;
auxiliar = error_track_list;
do {
if (walker->params.jc == hw_error_status.current_jc) {
/* found a faulty atom matching with the
* current one
*/
hw_error_status.errors_mask = walker->params.errors_mask;
hw_error_status.mmu_table_level = walker->params.mmu_table_level;
hw_error_status.faulty_mmu_as = walker->params.faulty_mmu_as;
hw_error_status.current_job_slot = job_slot;
if (walker->next == walker) {
/* only one element */
kfree(error_track_list);
error_track_list = NULL;
} else {
auxiliar->next = walker->next;
if (walker == error_track_list)
error_track_list = walker->next;
kfree(walker);
}
break;
}
auxiliar = walker;
walker = walker->next;
/* stop only after the list head itself has been examined */
} while (auxiliar != error_track_list);
}
#endif /* CONFIG_MALI_ERROR_INJECT_RANDOM */
}
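The injection list in this file is a circular, singly linked list: new entries are appended at the tail, and a lookup removes the first entry whose job chain address matches. The following userspace sketch shows that data structure under simplifying assumptions. The type `struct node` and the functions `list_append()` and `list_take()` are hypothetical names, `calloc`/`free` stand in for `kzalloc`/`kfree`, and the traversal is written so that every node, including the head, is examined once.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct kbase_error_atom. */
struct node {
	uint64_t jc;
	uint32_t errors_mask;
	struct node *next;
};

/* Append at the tail; *head keeps pointing at the oldest element. */
static int list_append(struct node **head, uint64_t jc, uint32_t errors_mask)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return -1;
	n->jc = jc;
	n->errors_mask = errors_mask;
	if (*head == NULL) {
		n->next = n; /* a single element points at itself */
		*head = n;
	} else {
		struct node *walker = *head;

		while (walker->next != *head)
			walker = walker->next;
		n->next = *head;
		walker->next = n;
	}
	return 0;
}

/* Remove the first node matching jc; returns 1 if found, 0 otherwise. */
static int list_take(struct node **head, uint64_t jc, uint32_t *mask_out)
{
	struct node *prev, *cur;

	if (*head == NULL)
		return 0;
	prev = *head;
	cur = (*head)->next;
	do {
		if (cur->jc == jc) {
			*mask_out = cur->errors_mask;
			if (cur->next == cur) {
				*head = NULL; /* last element removed */
			} else {
				prev->next = cur->next;
				if (cur == *head)
					*head = cur->next;
			}
			free(cur);
			return 1;
		}
		prev = cur;
		cur = cur->next;
	} while (prev != *head); /* stop once the head has been examined */
	return 0;
}
```

Starting the walk at `head->next` with `prev = head` keeps a valid predecessor available at every step, which is what makes unlinking from a singly linked ring possible without a second pass.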


@@ -0,0 +1,178 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2010-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Model Linux Framework interfaces.
*/
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include "backend/gpu/mali_kbase_model_linux.h"
#include "device/mali_kbase_device.h"
#include "mali_kbase_irq_internal.h"
#include <linux/kthread.h>
struct model_irq_data {
struct kbase_device *kbdev;
struct work_struct work;
};
#define DEFINE_SERVE_IRQ(irq_handler) \
static void serve_##irq_handler(struct work_struct *work) \
{ \
struct model_irq_data *data = container_of(work, struct model_irq_data, work); \
struct kbase_device *kbdev = data->kbdev; \
irq_handler(kbdev); \
kmem_cache_free(kbdev->irq_slab, data); \
}
static void job_irq(struct kbase_device *kbdev)
{
/* Make sure no worker is already serving this IRQ */
while (atomic_cmpxchg(&kbdev->serving_job_irq, 1, 0) == 1)
kbase_get_interrupt_handler(kbdev, JOB_IRQ_TAG)(0, kbdev);
}
DEFINE_SERVE_IRQ(job_irq)
static void gpu_irq(struct kbase_device *kbdev)
{
/* Make sure no worker is already serving this IRQ */
while (atomic_cmpxchg(&kbdev->serving_gpu_irq, 1, 0) == 1)
kbase_get_interrupt_handler(kbdev, GPU_IRQ_TAG)(0, kbdev);
}
DEFINE_SERVE_IRQ(gpu_irq)
static void mmu_irq(struct kbase_device *kbdev)
{
/* Make sure no worker is already serving this IRQ */
while (atomic_cmpxchg(&kbdev->serving_mmu_irq, 1, 0) == 1)
kbase_get_interrupt_handler(kbdev, MMU_IRQ_TAG)(0, kbdev);
}
DEFINE_SERVE_IRQ(mmu_irq)
void gpu_device_raise_irq(void *model, u32 irq)
{
struct model_irq_data *data;
struct kbase_device *kbdev = gpu_device_get_data(model);
KBASE_DEBUG_ASSERT(kbdev);
data = kmem_cache_alloc(kbdev->irq_slab, GFP_ATOMIC);
if (data == NULL)
return;
data->kbdev = kbdev;
switch (irq) {
case MODEL_LINUX_JOB_IRQ:
INIT_WORK(&data->work, serve_job_irq);
atomic_set(&kbdev->serving_job_irq, 1);
break;
case MODEL_LINUX_GPU_IRQ:
INIT_WORK(&data->work, serve_gpu_irq);
atomic_set(&kbdev->serving_gpu_irq, 1);
break;
case MODEL_LINUX_MMU_IRQ:
INIT_WORK(&data->work, serve_mmu_irq);
atomic_set(&kbdev->serving_mmu_irq, 1);
break;
default:
dev_warn(kbdev->dev, "Unknown IRQ");
kmem_cache_free(kbdev->irq_slab, data);
data = NULL;
break;
}
if (data != NULL)
queue_work(kbdev->irq_workq, &data->work);
}
int kbase_install_interrupts(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev);
atomic_set(&kbdev->serving_job_irq, 0);
atomic_set(&kbdev->serving_gpu_irq, 0);
atomic_set(&kbdev->serving_mmu_irq, 0);
kbdev->irq_workq = alloc_ordered_workqueue("dummy irq queue", 0);
if (kbdev->irq_workq == NULL)
return -ENOMEM;
kbdev->irq_slab =
kmem_cache_create("dummy_irq_slab", sizeof(struct model_irq_data), 0, 0, NULL);
if (kbdev->irq_slab == NULL) {
destroy_workqueue(kbdev->irq_workq);
return -ENOMEM;
}
kbdev->nr_irqs = 3;
return 0;
}
void kbase_release_interrupts(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev);
destroy_workqueue(kbdev->irq_workq);
kmem_cache_destroy(kbdev->irq_slab);
}
void kbase_synchronize_irqs(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev);
flush_workqueue(kbdev->irq_workq);
}
KBASE_EXPORT_TEST_API(kbase_synchronize_irqs);
int kbase_set_custom_irq_handler(struct kbase_device *kbdev, irq_handler_t custom_handler,
u32 irq_tag)
{
return 0;
}
KBASE_EXPORT_TEST_API(kbase_set_custom_irq_handler);
int kbase_gpu_device_create(struct kbase_device *kbdev)
{
kbdev->model = midgard_model_create(kbdev);
if (kbdev->model == NULL)
return -ENOMEM;
spin_lock_init(&kbdev->reg_op_lock);
return 0;
}
/**
* kbase_gpu_device_destroy - Destroy GPU device
*
* @kbdev: kbase device
*/
void kbase_gpu_device_destroy(struct kbase_device *kbdev)
{
midgard_model_destroy(kbdev->model);
}
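Each fake IRQ in this file is guarded by a flag: raising the IRQ sets the flag to 1, and the worker keeps calling the handler for as long as it can atomically flip the flag from 1 back to 0. The sketch below reproduces that pattern in userspace with C11 atomics instead of the kernel's `atomic_t` API. The names `raise_irq()`, `serve_irq()`, `fake_handler()` and `handled_count` are hypothetical, and the workqueue itself is left out.

```c
#include <stdatomic.h>

static atomic_int serving_irq; /* 1 while an IRQ is pending service */
static int handled_count;      /* number of times the handler ran */

/* Stand-in for the real interrupt handler. */
static void fake_handler(void)
{
	handled_count++;
}

/* Equivalent of setting the serving flag before queueing the work item. */
static void raise_irq(void)
{
	atomic_store(&serving_irq, 1);
}

/* Worker body: run the handler once per successful 1 -> 0 transition. */
static void serve_irq(void)
{
	int expected = 1;

	while (atomic_compare_exchange_strong(&serving_irq, &expected, 0)) {
		fake_handler();
		expected = 1; /* re-arm: a failed exchange overwrites expected */
	}
}
```

One consequence of this design is coalescing: if the flag is raised twice before a worker runs, the handler is called once, and a later worker that finds the flag already cleared does nothing.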


@@ -0,0 +1,160 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Model Linux Framework interfaces.
*
* This framework is used to provide generic Kbase Models interfaces.
* Note: Backends cannot be used together; the selection is done at build time.
*
* - Without Model Linux Framework:
* +-----------------------------+
* | Kbase read/write/IRQ |
* +-----------------------------+
* | HW interface definitions |
* +-----------------------------+
*
* - With Model Linux Framework:
* +-----------------------------+
* | Kbase read/write/IRQ |
* +-----------------------------+
* | Model Linux Framework |
* +-----------------------------+
* | Model interface definitions |
* +-----------------------------+
*/
#ifndef _KBASE_MODEL_LINUX_H_
#define _KBASE_MODEL_LINUX_H_
/*
* Include Model definitions
*/
#if IS_ENABLED(CONFIG_MALI_NO_MALI)
#include <backend/gpu/mali_kbase_model_dummy.h>
#endif /* IS_ENABLED(CONFIG_MALI_NO_MALI) */
#if !IS_ENABLED(CONFIG_MALI_REAL_HW)
/**
* kbase_gpu_device_create() - Generic create function.
*
* @kbdev: Kbase device.
*
* Specific model hook is implemented by midgard_model_create()
*
* Return: 0 on success, error code otherwise.
*/
int kbase_gpu_device_create(struct kbase_device *kbdev);
/**
* kbase_gpu_device_destroy() - Generic destroy function.
*
* @kbdev: Kbase device.
*
* Specific model hook is implemented by midgard_model_destroy()
*/
void kbase_gpu_device_destroy(struct kbase_device *kbdev);
/**
* midgard_model_create() - Private create function.
*
* @kbdev: Kbase device.
*
* This hook is specific to the model built in Kbase.
*
* Return: Model handle.
*/
void *midgard_model_create(struct kbase_device *kbdev);
/**
* midgard_model_destroy() - Private destroy function.
*
* @h: Model handle.
*
* This hook is specific to the model built in Kbase.
*/
void midgard_model_destroy(void *h);
/**
* midgard_model_write_reg() - Private model write function.
*
* @h: Model handle.
* @addr: Address at which to write.
* @value: value to write.
*
* This hook is specific to the model built in Kbase.
*/
void midgard_model_write_reg(void *h, u32 addr, u32 value);
/**
* midgard_model_read_reg() - Private model read function.
*
* @h: Model handle.
* @addr: Address from which to read.
* @value: Pointer where to store the read value.
*
* This hook is specific to the model built in Kbase.
*/
void midgard_model_read_reg(void *h, u32 addr, u32 *const value);
/**
* midgard_model_arch_timer_get_cntfrq - Get Model specific System Timer Frequency
*
* @h: Model handle.
*
* Return: Frequency in Hz
*/
u64 midgard_model_arch_timer_get_cntfrq(void *h);
/**
* gpu_device_raise_irq() - Private IRQ raise function.
*
* @model: Model handle.
* @irq: IRQ type to raise.
*
* This hook is global to the model Linux framework.
*/
void gpu_device_raise_irq(void *model, u32 irq);
/**
* gpu_device_set_data() - Private model set data function.
*
* @model: Model handle.
* @data: Data carried by model.
*
* This hook is global to the model Linux framework.
*/
void gpu_device_set_data(void *model, void *data);
/**
* gpu_device_get_data() - Private model get data function.
*
* @model: Model handle.
*
* This hook is global to the model Linux framework.
*
* Return: Pointer to the data carried by model.
*/
void *gpu_device_get_data(void *model);
#endif /* !IS_ENABLED(CONFIG_MALI_REAL_HW) */
#endif /* _KBASE_MODEL_LINUX_H_ */
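The hooks declared in this header all take an opaque `void *` model handle. As a rough illustration of what a backend behind such hooks could look like, here is a toy register-file model in userspace C. Everything in it is an assumption made for the example: the `toy_` names, the 64-register size, and the choice to ignore out-of-range accesses are not taken from the driver.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_NUM_REGS 64u /* hypothetical register count */

/* Hypothetical model state hidden behind the opaque handle. */
struct toy_model {
	uint32_t regs[TOY_NUM_REGS];
	void *data; /* owner pointer, as carried by set_data/get_data */
};

static void *toy_model_create(void *owner)
{
	struct toy_model *m = calloc(1, sizeof(*m));

	if (m)
		m->data = owner;
	return m; /* NULL on allocation failure */
}

static void toy_model_destroy(void *h)
{
	free(h);
}

/* Registers are addressed in bytes, one 32-bit register per 4 bytes. */
static void toy_model_write_reg(void *h, uint32_t addr, uint32_t value)
{
	struct toy_model *m = h;
	uint32_t idx = addr / 4u;

	if (idx < TOY_NUM_REGS)
		m->regs[idx] = value;
}

static void toy_model_read_reg(void *h, uint32_t addr, uint32_t *const value)
{
	struct toy_model *m = h;
	uint32_t idx = addr / 4u;

	*value = (idx < TOY_NUM_REGS) ? m->regs[idx] : 0u;
}

static void *toy_model_get_data(void *h)
{
	return ((struct toy_model *)h)->data;
}
```

Keeping the handle opaque lets the framework layer stay unchanged when a different model is selected at build time, since callers never see the model's internal layout.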


@@ -0,0 +1,75 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2010-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Always on" power management policy
*/
#include <mali_kbase.h>
#include <mali_kbase_pm.h>
static bool always_on_shaders_needed(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
return true;
}
static bool always_on_get_core_active(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
return true;
}
static void always_on_init(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
/**
* always_on_term - Term callback function for always-on power policy
*
* @kbdev: kbase device
*/
static void always_on_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
/*
* The struct kbase_pm_policy structure for the always-on power policy.
*
* This is the static structure that defines the always-on power policy's
* callbacks and name.
*/
const struct kbase_pm_policy kbase_pm_always_on_policy_ops = {
"always_on", /* name */
always_on_init, /* init */
always_on_term, /* term */
always_on_shaders_needed, /* shaders_needed */
always_on_get_core_active, /* get_core_active */
NULL, /* handle_event */
KBASE_PM_POLICY_ID_ALWAYS_ON, /* id */
#if MALI_USE_CSF
ALWAYS_ON_PM_SCHED_FLAGS, /* pm_sched_flags */
#endif
};
KBASE_EXPORT_TEST_API(kbase_pm_always_on_policy_ops);


@@ -0,0 +1,77 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Always on" power management policy
*/
#ifndef MALI_KBASE_PM_ALWAYS_ON_H
#define MALI_KBASE_PM_ALWAYS_ON_H
/**
* DOC:
* The "Always on" power management policy has the following
* characteristics:
*
* - When KBase indicates that the GPU will be powered up, but we don't yet
* know which Job Chains are to be run:
* Shader Cores are powered up, regardless of whether or not they will be
* needed later.
*
* - When KBase indicates that Shader Cores are needed to submit the currently
* queued Job Chains:
* Shader Cores are kept powered, regardless of whether or not they will be
* needed.
*
* - When KBase indicates that the GPU need not be powered:
* The Shader Cores are kept powered, regardless of whether or not they will
* be needed. The GPU itself is also kept powered, even though it is not
* needed.
*
* This policy is automatically overridden during system suspend: the desired
* core state is ignored, and the cores are forced off regardless of what the
* policy requests. After resuming from suspend, new changes to the desired
* core state made by the policy are honored.
*
* Note:
*
* - KBase indicates the GPU will be powered up when it has a User Process that
* has just started to submit Job Chains.
*
* - KBase indicates the GPU need not be powered when all the Job Chains from
* User Processes have finished, and it is waiting for a User Process to
* submit some more Job Chains.
*/
/**
* struct kbasep_pm_policy_always_on - Private struct for policy instance data
* @dummy: unused dummy variable
*
* This contains data that is private to the particular power policy that is
* active.
*/
struct kbasep_pm_policy_always_on {
int dummy;
};
extern const struct kbase_pm_policy kbase_pm_always_on_policy_ops;
#endif /* MALI_KBASE_PM_ALWAYS_ON_H */

File diff suppressed because it is too large


@@ -0,0 +1,155 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2013-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel core availability APIs
*/
#include <mali_kbase.h>
#include <mali_kbase_pm.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <backend/gpu/mali_kbase_model_linux.h>
#include <mali_kbase_dummy_job_wa.h>
int kbase_pm_ca_init(struct kbase_device *kbdev)
{
#ifdef CONFIG_MALI_DEVFREQ
struct kbase_pm_backend_data *pm_backend = &kbdev->pm.backend;
if (kbdev->current_core_mask)
pm_backend->ca_cores_enabled = kbdev->current_core_mask;
else
pm_backend->ca_cores_enabled = kbdev->gpu_props.shader_present;
#endif
return 0;
}
void kbase_pm_ca_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
#ifdef CONFIG_MALI_DEVFREQ
void kbase_devfreq_set_core_mask(struct kbase_device *kbdev, u64 core_mask)
{
struct kbase_pm_backend_data *pm_backend = &kbdev->pm.backend;
unsigned long flags;
#if MALI_USE_CSF
u64 old_core_mask = 0;
#endif
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
#if MALI_USE_CSF
if (!(core_mask & kbdev->pm.debug_core_mask)) {
dev_err(kbdev->dev,
"OPP core mask 0x%llX does not intersect with debug mask 0x%llX\n",
core_mask, kbdev->pm.debug_core_mask);
goto unlock;
}
old_core_mask = pm_backend->ca_cores_enabled;
#else
if (!(core_mask & kbdev->pm.debug_core_mask_all)) {
dev_err(kbdev->dev,
"OPP core mask 0x%llX does not intersect with debug mask 0x%llX\n",
core_mask, kbdev->pm.debug_core_mask_all);
goto unlock;
}
if (kbase_dummy_job_wa_enabled(kbdev)) {
dev_err_once(kbdev->dev,
"Dynamic core scaling not supported as dummy job WA is enabled");
goto unlock;
}
#endif /* MALI_USE_CSF */
pm_backend->ca_cores_enabled = core_mask;
kbase_pm_update_state(kbdev);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
#if MALI_USE_CSF
/* Check if old_core_mask contained the undesired cores and wait
* for those cores to get powered down
*/
if ((core_mask & old_core_mask) != old_core_mask) {
if (kbase_pm_wait_for_cores_down_scale(kbdev)) {
dev_warn(kbdev->dev,
"Wait for update of core_mask from %llx to %llx failed",
old_core_mask, core_mask);
}
}
#endif
dev_dbg(kbdev->dev, "Devfreq policy : new core mask=%llX\n", pm_backend->ca_cores_enabled);
return;
unlock:
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
KBASE_EXPORT_TEST_API(kbase_devfreq_set_core_mask);
#endif
u64 kbase_pm_ca_get_debug_core_mask(struct kbase_device *kbdev)
{
#if MALI_USE_CSF
return kbdev->pm.debug_core_mask;
#else
return kbdev->pm.debug_core_mask_all;
#endif
}
KBASE_EXPORT_TEST_API(kbase_pm_ca_get_debug_core_mask);
u64 kbase_pm_ca_get_core_mask(struct kbase_device *kbdev)
{
u64 debug_core_mask = kbase_pm_ca_get_debug_core_mask(kbdev);
lockdep_assert_held(&kbdev->hwaccess_lock);
#ifdef CONFIG_MALI_DEVFREQ
/*
* Although init sets pm_backend->ca_cores_enabled to the max config
* (it uses the base_gpu_props), this function must limit the result to
* a subset of the current config; otherwise the shader state machine
* in the PM does not make progress.
*/
return kbdev->gpu_props.curr_config.shader_present & kbdev->pm.backend.ca_cores_enabled &
debug_core_mask;
#else
return kbdev->gpu_props.curr_config.shader_present & debug_core_mask;
#endif
}
KBASE_EXPORT_TEST_API(kbase_pm_ca_get_core_mask);
u64 kbase_pm_ca_get_instr_core_mask(struct kbase_device *kbdev)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
#if IS_ENABLED(CONFIG_MALI_NO_MALI)
return (((1ull) << KBASE_DUMMY_MODEL_MAX_SHADER_CORES) - 1);
#elif MALI_USE_CSF
return kbase_pm_get_ready_cores(kbdev, KBASE_PM_CORE_SHADER);
#else
return kbdev->pm.backend.pm_shaders_core_mask;
#endif
}
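The core availability code above is mostly bitmask arithmetic: the effective mask is the intersection of the present, devfreq-enabled and debug masks, a mask change "drops" cores when some previously enabled bit is no longer set, and the dummy model reports the lowest N cores. A small sketch of those three computations, using hypothetical function names and plain `uint64_t` in place of the kernel's `u64`:

```c
#include <stdbool.h>
#include <stdint.h>

/* Intersection used when devfreq core scaling is configured. */
static uint64_t effective_core_mask(uint64_t shader_present,
				    uint64_t ca_cores_enabled,
				    uint64_t debug_core_mask)
{
	return shader_present & ca_cores_enabled & debug_core_mask;
}

/* True when new_mask removes at least one core that old_mask enabled. */
static bool mask_drops_cores(uint64_t old_mask, uint64_t new_mask)
{
	return (new_mask & old_mask) != old_mask;
}

/* Mask with the lowest n bits set; n >= 64 is clamped to all ones. */
static uint64_t low_cores_mask(unsigned int n)
{
	return (n >= 64) ? UINT64_MAX : ((1ull << n) - 1);
}
```

The clamp in `low_cores_mask()` is an addition for the sketch: shifting a 64-bit value by 64 or more is undefined in C, so a general helper has to special-case it.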


@@ -0,0 +1,99 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel core availability APIs
*/
#ifndef _KBASE_PM_CA_H_
#define _KBASE_PM_CA_H_
/**
* kbase_pm_ca_init - Initialize core availability framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Must be called before calling any other core availability function
*
* Return: 0 if the core availability framework was successfully initialized,
* -errno otherwise
*/
int kbase_pm_ca_init(struct kbase_device *kbdev);
/**
* kbase_pm_ca_term - Terminate core availability framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*/
void kbase_pm_ca_term(struct kbase_device *kbdev);
/**
* kbase_pm_ca_get_core_mask - Get currently available shaders core mask
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Returns a mask of the currently available shader cores.
* Calls into the core availability policy
*
* Return: The bit mask of available cores
*/
u64 kbase_pm_ca_get_core_mask(struct kbase_device *kbdev);
/**
* kbase_pm_ca_get_debug_core_mask - Get debug core mask.
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Returns a mask of the currently selected shader cores.
*
* Return: The bit mask of user-selected cores
*/
u64 kbase_pm_ca_get_debug_core_mask(struct kbase_device *kbdev);
/**
* kbase_pm_ca_update_core_status - Update core status
*
* @kbdev: The kbase device structure for the device (must be
* a valid pointer)
* @cores_ready: The bit mask of cores ready for job submission
* @cores_transitioning: The bit mask of cores that are transitioning power
* state
*
* Update core availability policy with current core power status
*
* Calls into the core availability policy
*/
void kbase_pm_ca_update_core_status(struct kbase_device *kbdev, u64 cores_ready,
u64 cores_transitioning);
/**
* kbase_pm_ca_get_instr_core_mask - Get the PM state sync-ed shaders core mask
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Returns a mask of the PM state synchronised shader cores for arranging
* HW performance counter dumps
*
* Return: The bit mask of PM state synchronised cores
*/
u64 kbase_pm_ca_get_instr_core_mask(struct kbase_device *kbdev);
#endif /* _KBASE_PM_CA_H_ */


@@ -0,0 +1,58 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2017-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* A core availability policy for use with devfreq, where core masks are
* associated with OPPs.
*/
#ifndef MALI_KBASE_PM_CA_DEVFREQ_H
#define MALI_KBASE_PM_CA_DEVFREQ_H
/**
* struct kbasep_pm_ca_policy_devfreq - Private structure for devfreq ca policy
*
* @cores_desired: Cores that the policy wants to be available
* @cores_enabled: Cores that the policy is currently returning as available
* @cores_used: Cores currently powered or transitioning
*
* This contains data that is private to the devfreq core availability
* policy.
*/
struct kbasep_pm_ca_policy_devfreq {
u64 cores_desired;
u64 cores_enabled;
u64 cores_used;
};
extern const struct kbase_pm_ca_policy kbase_pm_ca_devfreq_policy_ops;
/**
* kbase_devfreq_set_core_mask - Set core mask for policy to use
* @kbdev: Device pointer
* @core_mask: New core mask
*
* The new core mask will have immediate effect if the GPU is powered, or will
* take effect when it is next powered on.
*/
void kbase_devfreq_set_core_mask(struct kbase_device *kbdev, u64 core_mask);
#endif /* MALI_KBASE_PM_CA_DEVFREQ_H */


@@ -0,0 +1,67 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Coarse Demand" power management policy
*/
#include <mali_kbase.h>
#include <mali_kbase_pm.h>
static bool coarse_demand_shaders_needed(struct kbase_device *kbdev)
{
return kbase_pm_is_active(kbdev);
}
static bool coarse_demand_get_core_active(struct kbase_device *kbdev)
{
return kbase_pm_is_active(kbdev);
}
static void coarse_demand_init(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
static void coarse_demand_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
/* The struct kbase_pm_policy structure for the coarse demand power policy.
*
* This is the static structure that defines the coarse demand power policy's
* callbacks and name.
*/
const struct kbase_pm_policy kbase_pm_coarse_demand_policy_ops = {
"coarse_demand", /* name */
coarse_demand_init, /* init */
coarse_demand_term, /* term */
coarse_demand_shaders_needed, /* shaders_needed */
coarse_demand_get_core_active, /* get_core_active */
NULL, /* handle_event */
KBASE_PM_POLICY_ID_COARSE_DEMAND, /* id */
#if MALI_USE_CSF
COARSE_ON_DEMAND_PM_SCHED_FLAGS, /* pm_sched_flags */
#endif
};
KBASE_EXPORT_TEST_API(kbase_pm_coarse_demand_policy_ops);
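Review note: both coarse demand callbacks simply forward to kbase_pm_is_active(), so shader cores follow the overall GPU active state. A small user-space sketch of the same shape follows, written with designated initializers so each callback is tied to its field by name. All `mock_*` names are hypothetical, and modelling "active" as a positive reference count is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types, for illustration only. */
struct mock_device {
	int active_count; /* assumed model of the PM active reference count */
};

struct mock_pm_policy {
	const char *name;
	bool (*shaders_needed)(struct mock_device *dev);
	bool (*get_core_active)(struct mock_device *dev);
	void (*handle_event)(struct mock_device *dev, int event);
};

/* Stand-in for kbase_pm_is_active(): active while a reference is held. */
static bool mock_pm_is_active(struct mock_device *dev)
{
	assert(dev != NULL); /* callbacks require a valid device pointer */
	return dev->active_count > 0;
}

static bool mock_shaders_needed(struct mock_device *dev)
{
	return mock_pm_is_active(dev);
}

static bool mock_get_core_active(struct mock_device *dev)
{
	return mock_pm_is_active(dev);
}

static const struct mock_pm_policy mock_coarse_demand = {
	.name = "coarse_demand",
	.shaders_needed = mock_shaders_needed,
	.get_core_active = mock_get_core_active,
	.handle_event = NULL, /* this policy needs no event notifications */
};
```

With named fields, a conditional member such as pm_sched_flags could be added or removed without silently shifting the remaining initializers.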


@@ -0,0 +1,64 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2012-2015, 2018, 2020-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Coarse Demand" power management policy
*/
#ifndef MALI_KBASE_PM_COARSE_DEMAND_H
#define MALI_KBASE_PM_COARSE_DEMAND_H
/**
* DOC:
* The "Coarse" demand power management policy has the following
* characteristics:
* - When KBase indicates that the GPU will be powered up, but we don't yet
* know which Job Chains are to be run:
* - Shader Cores are powered up, regardless of whether or not they will be
* needed later.
* - When KBase indicates that Shader Cores are needed to submit the currently
* queued Job Chains:
* - Shader Cores are kept powered, regardless of whether or not they will
* be needed
* - When KBase indicates that the GPU need not be powered:
* - The Shader Cores are powered off, and the GPU itself is powered off too.
*
* @note:
* - KBase indicates the GPU will be powered up when it has a User Process that
* has just started to submit Job Chains.
* - KBase indicates the GPU need not be powered when all the Job Chains from
* User Processes have finished, and it is waiting for a User Process to
* submit some more Job Chains.
*/
/**
* struct kbasep_pm_policy_coarse_demand - Private structure for coarse demand
* policy
* @dummy: Dummy member - no state needed
* This contains data that is private to the coarse demand power policy.
*/
struct kbasep_pm_policy_coarse_demand {
int dummy;
};
extern const struct kbase_pm_policy kbase_pm_coarse_demand_policy_ops;
#endif /* MALI_KBASE_PM_COARSE_DEMAND_H */


@@ -0,0 +1,680 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager definitions
*/
#ifndef _KBASE_PM_HWACCESS_DEFS_H_
#define _KBASE_PM_HWACCESS_DEFS_H_
#include "mali_kbase_pm_always_on.h"
#include "mali_kbase_pm_coarse_demand.h"
#include <hw_access/mali_kbase_hw_access_regmap.h>
#if defined(CONFIG_PM_RUNTIME) || defined(CONFIG_PM)
#define KBASE_PM_RUNTIME 1
#endif
/* Forward definition - see mali_kbase.h */
struct kbase_device;
struct kbase_jd_atom;
/**
* enum kbase_pm_core_type - The types of core in a GPU.
*
* @KBASE_PM_CORE_L2: The L2 cache
* @KBASE_PM_CORE_SHADER: Shader cores
* @KBASE_PM_CORE_TILER: Tiler cores
* @KBASE_PM_CORE_STACK: Core stacks
*
* These enumerated values are used in calls to
* - kbase_pm_get_present_cores()
* - kbase_pm_get_active_cores()
* - kbase_pm_get_trans_cores()
* - kbase_pm_get_ready_cores()
* - kbase_pm_get_state()
* - core_type_to_reg()
* - pwr_cmd_constructor()
* - valid_to_power_up()
* - valid_to_power_down()
* - kbase_pm_invoke()
*
* They specify which type of core should be acted on.
*/
enum kbase_pm_core_type {
KBASE_PM_CORE_L2 = GPU_CONTROL_ENUM(L2_PRESENT),
KBASE_PM_CORE_SHADER = GPU_CONTROL_ENUM(SHADER_PRESENT),
KBASE_PM_CORE_TILER = GPU_CONTROL_ENUM(TILER_PRESENT),
KBASE_PM_CORE_STACK = GPU_CONTROL_ENUM(STACK_PRESENT)
};
/*
* enum kbase_l2_core_state - The states used for the L2 cache & tiler power
* state machine.
*/
enum kbase_l2_core_state {
#define KBASEP_L2_STATE(n) KBASE_L2_##n,
#include "mali_kbase_pm_l2_states.h"
#undef KBASEP_L2_STATE
};
#if MALI_USE_CSF
/*
* enum kbase_mcu_state - The states used for the MCU state machine.
*/
enum kbase_mcu_state {
#define KBASEP_MCU_STATE(n) KBASE_MCU_##n,
#include "mali_kbase_pm_mcu_states.h"
#undef KBASEP_MCU_STATE
};
#endif
/*
* enum kbase_shader_core_state - The states used for the shaders' state machine.
*/
enum kbase_shader_core_state {
#define KBASEP_SHADER_STATE(n) KBASE_SHADERS_##n,
#include "mali_kbase_pm_shader_states.h"
#undef KBASEP_SHADER_STATE
};
/**
* enum kbase_pm_runtime_suspend_abort_reason - Reason why runtime suspend was aborted
* after the wake up of MCU.
*
* @ABORT_REASON_NONE: Not aborted
* @ABORT_REASON_DB_MIRROR_IRQ: Runtime suspend was aborted due to DB_MIRROR irq.
* @ABORT_REASON_NON_IDLE_CGS: Runtime suspend was aborted as CSGs were detected as non-idle after
* their suspension.
*/
enum kbase_pm_runtime_suspend_abort_reason {
ABORT_REASON_NONE,
ABORT_REASON_DB_MIRROR_IRQ,
ABORT_REASON_NON_IDLE_CGS
};
/**
* struct kbasep_pm_metrics - Metrics data collected for use by the power
* management framework.
*
* @time_busy: the amount of time the GPU was busy executing jobs since the
* @time_period_start timestamp, in units of 256ns. This also includes
* time_in_protm, the time spent in protected mode, since it's assumed
* the GPU was busy 100% during this period.
* @time_idle: the amount of time the GPU was not executing jobs since the
* time_period_start timestamp, measured in units of 256ns.
* @time_in_protm: The amount of time the GPU has spent in protected mode since
* the time_period_start timestamp, measured in units of 256ns.
* @busy_cl: the amount of time the GPU was busy executing CL jobs. Note that
* if two CL jobs were active for 256ns, this value would be updated
* with 2 (2x256ns).
* @busy_gl: the amount of time the GPU was busy executing GL jobs. Note that
* if two GL jobs were active for 256ns, this value would be updated
* with 2 (2x256ns).
*/
struct kbasep_pm_metrics {
u32 time_busy;
u32 time_idle;
#if MALI_USE_CSF
u32 time_in_protm;
#else
u32 busy_cl[2];
u32 busy_gl;
#endif
};
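Review note: time_busy and time_idle are documented as counters in units of 256 ns, so a utilisation figure can be derived from the difference between two samples. The sketch below shows that arithmetic in user space. `struct metrics_sample`, `ns_to_metric_units()` and `utilisation_percent()` are hypothetical names, not driver functions, and the driver's own DVFS calculation may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the two counters; units are 256 ns as documented. */
struct metrics_sample {
	uint32_t time_busy;
	uint32_t time_idle;
};

/* Convert nanoseconds to 256 ns units (256 == 1 << 8). */
static uint32_t ns_to_metric_units(uint64_t ns)
{
	assert((ns >> 8) <= UINT32_MAX); /* the counters are only 32 bits wide */
	return (uint32_t)(ns >> 8);
}

/* Utilisation in percent over the interval between two samples. */
static unsigned int utilisation_percent(const struct metrics_sample *prev,
					const struct metrics_sample *cur)
{
	/* Unsigned subtraction stays correct across a single counter wrap. */
	uint32_t busy = cur->time_busy - prev->time_busy;
	uint32_t idle = cur->time_idle - prev->time_idle;
	uint64_t total = (uint64_t)busy + idle;

	if (total == 0)
		return 0; /* no time elapsed; avoid dividing by zero */
	return (unsigned int)((100ULL * busy) / total);
}
```

For example, a sample pair that advances time_busy by 300 units and time_idle by 100 units gives 75 percent.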
/**
* struct kbasep_pm_metrics_state - State required to collect the metrics in
* struct kbasep_pm_metrics
* @time_period_start: time at which busy/idle measurements started
* @ipa_control_client: Handle returned on registering DVFS as a
* kbase_ipa_control client
* @skip_gpu_active_sanity_check: Decide whether to skip GPU_ACTIVE sanity
* check in DVFS utilisation calculation
* @gpu_active: true when the GPU is executing jobs. false when
* not. Updated when the job scheduler informs us a job is submitted
* or removed from a GPU slot.
* @active_cl_ctx: number of CL jobs active on the GPU. Array is per-device.
* @active_gl_ctx: number of GL jobs active on the GPU. Array is per-slot.
* @lock: spinlock protecting the kbasep_pm_metrics_state structure
* @platform_data: pointer to data controlled by platform specific code
* @kbdev: pointer to kbase device for which metrics are collected
* @values: The current values of the power management metrics. The
* kbase_pm_get_dvfs_metrics() function is used to compare these
* current values with the saved values from a previous invocation.
* @initialized: tracks whether metrics_state has been initialized or not.
* @timer: timer to regularly make DVFS decisions based on the power
* management metrics.
* @timer_state: atomic indicating current @timer state, on, off, or stopped.
* @dvfs_last: values of the PM metrics from the last DVFS tick
* @dvfs_diff: difference between the current and previous PM metrics.
*/
struct kbasep_pm_metrics_state {
ktime_t time_period_start;
#if MALI_USE_CSF
void *ipa_control_client;
bool skip_gpu_active_sanity_check;
#else
bool gpu_active;
u32 active_cl_ctx[2];
u32 active_gl_ctx[3];
#endif
spinlock_t lock;
void *platform_data;
struct kbase_device *kbdev;
struct kbasep_pm_metrics values;
#ifdef CONFIG_MALI_MIDGARD_DVFS
bool initialized;
struct hrtimer timer;
atomic_t timer_state;
struct kbasep_pm_metrics dvfs_last;
struct kbasep_pm_metrics dvfs_diff;
#endif
};
/**
* struct kbasep_pm_tick_timer_state - State for the shader hysteresis timer
* @wq: Work queue to wait for the timer to stop
* @work: Work item which cancels the timer
* @timer: Timer for powering off the shader cores
* @configured_interval: Period of GPU poweroff timer
* @default_ticks: User-configured number of ticks to wait after the shader
* power down request is received before turning off the cores
* @configured_ticks: Power-policy configured number of ticks to wait after the
* shader power down request is received before turning off
* the cores. For simple power policies, this is equivalent
* to @default_ticks.
* @remaining_ticks: Number of remaining timer ticks until shaders are powered off
* @cancel_queued: True if the cancellation work item has been queued. This is
* required to ensure that it is not queued twice, e.g. after
* a reset, which could cause the timer to be incorrectly
* cancelled later by a delayed workitem.
* @needed: Whether the timer should restart itself
*/
struct kbasep_pm_tick_timer_state {
struct workqueue_struct *wq;
struct work_struct work;
struct hrtimer timer;
ktime_t configured_interval;
unsigned int default_ticks;
unsigned int configured_ticks;
unsigned int remaining_ticks;
bool cancel_queued;
bool needed;
};
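Review note: the tick fields above describe a hysteresis countdown. A power-down request does not switch the shader cores off at once; it waits a configured number of timer ticks, and new work arriving in the meantime keeps the cores on. The following user-space sketch models only that countdown. `struct tick_model` and the `tick_*` functions are hypothetical, and the real driver additionally handles the hrtimer, the work queue and cancellation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the hysteresis countdown; not the driver's code. */
struct tick_model {
	unsigned int configured_ticks; /* ticks to wait before powering off */
	unsigned int remaining_ticks;  /* ticks left in the current countdown */
	bool shaders_on;
};

/* A power-down request arms the countdown instead of powering off at once. */
static void tick_request_power_down(struct tick_model *t)
{
	assert(t != NULL);
	t->remaining_ticks = t->configured_ticks;
	if (t->remaining_ticks == 0)
		t->shaders_on = false; /* zero ticks means no hysteresis */
}

/* New work arrived: cancel any pending countdown and keep the cores on. */
static void tick_request_power_up(struct tick_model *t)
{
	assert(t != NULL);
	t->remaining_ticks = 0;
	t->shaders_on = true;
}

/* Called once per timer period; returns true if the cores were powered off. */
static bool tick_expire(struct tick_model *t)
{
	assert(t != NULL);
	if (!t->shaders_on || t->remaining_ticks == 0)
		return false;
	if (--t->remaining_ticks == 0) {
		t->shaders_on = false;
		return true;
	}
	return false;
}
```

In this model, work arriving before the countdown ends corresponds to the "timer hit" case, and a countdown that runs to zero corresponds to the "timer miss" case.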
union kbase_pm_policy_data {
struct kbasep_pm_policy_always_on always_on;
struct kbasep_pm_policy_coarse_demand coarse_demand;
};
/**
* struct kbase_pm_backend_data - Data stored per device for power management.
*
* @pm_current_policy: The policy that is currently actively controlling the
* power state.
* @pm_policy_data: Private data for current PM policy. This is automatically
* zeroed when a policy change occurs.
* @reset_done: Flag when a reset is complete
* @reset_done_wait: Wait queue to wait for changes to @reset_done
* @gpu_cycle_counter_requests: The reference count of active gpu cycle counter
* users
* @gpu_cycle_counter_requests_lock: Lock to protect @gpu_cycle_counter_requests
* @gpu_in_desired_state_wait: Wait queue set when the GPU is in the desired
* state according to the L2 and shader power state
* machines
* @gpu_powered: Set to true when the GPU is powered and register
* accesses are possible, false otherwise. Access to this
* variable should be protected by: both the hwaccess_lock
* spinlock and the pm.lock mutex for writes; or at least
* one of either lock for reads.
* @gpu_ready: Indicates whether the GPU is in a state in which it is
* safe to perform PM changes. When false, the PM state
* machine needs to wait before making changes to the GPU
* power policy, DevFreq or core_mask, so as to avoid these
* changing while implicit GPU resets are ongoing.
* @pm_shaders_core_mask: Shader PM state synchronised shaders core mask. It
* holds the cores enabled in a hardware counters dump,
* and may differ from @shaders_avail when under different
* states and transitions.
* @cg1_disabled: Set if the policy wants to keep the second core group
* powered off
* @metrics: Structure to hold metrics for the GPU
* @shader_tick_timer: Structure to hold the shader poweroff tick timer state
* @poweroff_wait_in_progress: true if a wait for GPU power off is in progress.
* hwaccess_lock must be held when accessing
* @invoke_poweroff_wait_wq_when_l2_off: flag indicating that the L2 power state
* machine should invoke the poweroff
* worker after the L2 has turned off.
* @poweron_required: true if a GPU power on is required. Should only be set
* when poweroff_wait_in_progress is true, and therefore the
* GPU can not immediately be powered on. pm.lock must be
* held when accessing
* @gpu_poweroff_wait_wq: workqueue for waiting for GPU to power off
* @gpu_poweroff_wait_work: work item for use with @gpu_poweroff_wait_wq
* @poweroff_wait: waitqueue for waiting for @gpu_poweroff_wait_work to complete
* @callback_power_on: Callback when the GPU needs to be turned on. See
* &struct kbase_pm_callback_conf
* @callback_power_off: Callback when the GPU may be turned off. See
* &struct kbase_pm_callback_conf
* @callback_power_suspend: Callback when a suspend occurs and the GPU needs to
* be turned off. See &struct kbase_pm_callback_conf
* @callback_power_resume: Callback when a resume occurs and the GPU needs to
* be turned on. See &struct kbase_pm_callback_conf
* @callback_power_runtime_on: Callback when the GPU needs to be turned on. See
* &struct kbase_pm_callback_conf
* @callback_power_runtime_off: Callback when the GPU may be turned off. See
* &struct kbase_pm_callback_conf
* @callback_power_runtime_idle: Optional callback invoked by runtime PM core
* when the GPU may be idle. See
* &struct kbase_pm_callback_conf
* @callback_soft_reset: Optional callback to software reset the GPU. See
* &struct kbase_pm_callback_conf
* @callback_power_runtime_gpu_idle: Callback invoked by Kbase when GPU has
* become idle.
* See &struct kbase_pm_callback_conf.
* @callback_power_runtime_gpu_active: Callback when GPU has become active and
* @callback_power_runtime_gpu_idle was
* called previously.
* See &struct kbase_pm_callback_conf.
* @ca_cores_enabled: Cores that are currently available
* @apply_hw_issue_TITANHW_2938_wa: Indicates if the workaround for BASE_HW_ISSUE_TITANHW_2938
* needs to be applied when unmapping memory from GPU.
* @mcu_state: The current state of the micro-control unit, only applicable
* to GPUs that have such a component
* @l2_state: The current state of the L2 cache state machine. See
* &enum kbase_l2_core_state
* @l2_desired: True if the L2 cache should be powered on by the L2 cache state
* machine
* @l2_always_on: If true, disable powering down of l2 cache.
* @shaders_state: The current state of the shader state machine.
* @shaders_avail: This is updated by the state machine when it is in a state
* where it can write to the SHADER_PWRON or PWROFF registers
* to have the same set of available cores as specified by
* @shaders_desired_mask. It therefore indicates precisely the
* cores that are currently available. This is internal to the
* shader state machine of JM GPUs and should *not* be modified
* elsewhere.
* @shaders_desired_mask: This is updated by the state machine when it is in
* a state where it can handle changes to the core
* availability (either by DVFS or sysfs). This is
* internal to the shader state machine and should
* *not* be modified elsewhere.
* @shaders_desired: True if the PM active count or power policy requires the
* shader cores to be on. This is used as an input to the
* shader power state machine. The current state of the
* cores may be different, but there should be transitions in
* progress that will eventually achieve this state (assuming
* that the policy doesn't change its mind in the mean time).
* @mcu_desired: True if the micro-control unit should be powered on
* @policy_change_clamp_state_to_off: Signals that the backend is in a PM policy
* change transition and needs the MCU/L2 to be brought back to the
* off state and remain in that state until the flag is cleared.
* @csf_pm_sched_flags: CSF Dynamic PM control flags in accordance with the
* current active PM policy. This field is updated whenever a
* new policy is activated.
* @policy_change_lock: Used to serialize the policy change calls. In CSF case,
* the change of policy may involve the scheduler to
* suspend running CSGs and then reconfigure the MCU.
* @core_idle_wq: Workqueue for executing the @core_idle_work.
* @core_idle_work: Work item used to wait for undesired cores to become inactive.
* The work item is enqueued when Host controls the power for
* shader cores and down scaling of cores is performed.
* @gpu_sleep_supported: Flag to indicate whether the GPU sleep feature can be
* supported by the kernel driver. If this
* flag is not set, then HW state is directly saved
* when GPU idle notification is received.
* @gpu_sleep_mode_active: Flag to indicate that the GPU needs to be in sleep
* mode. It is set when the GPU idle notification is
* received and is cleared when HW state has been
* saved in the runtime suspend callback function or
* when the GPU power down is aborted if GPU became
* active whilst it was in sleep mode. The flag is
* guarded with hwaccess_lock spinlock.
* @exit_gpu_sleep_mode: Flag to indicate the GPU can now exit the sleep
* mode due to the submission of work from Userspace.
* The flag is guarded with hwaccess_lock spinlock.
* The @gpu_sleep_mode_active flag is not immediately
* reset when this flag is set; this ensures that the
* MCU doesn't get disabled undesirably without the
* suspend of CSGs. That could happen when
* scheduler_pm_active() and scheduler_pm_idle() get
* called before the Scheduler gets reactivated.
* @gpu_idled: Flag to ensure that the gpu_idle & gpu_active callbacks are
* always called in pair. The flag is guarded with pm.lock mutex.
* @gpu_wakeup_override: Flag to force the power up of L2 cache & reactivation
* of MCU. This is set during the runtime suspend
* callback function, when GPU needs to exit the sleep
* mode to save the HW state before power down.
* @db_mirror_interrupt_enabled: Flag tracking if the Doorbell mirror interrupt
* is enabled or not.
* @runtime_suspend_abort_reason: Tracks if the runtime suspend was aborted,
* after the wake up of MCU, due to the DB_MIRROR irq
* or non-idle CSGs. Tracking is done to avoid
* redundant transition of MCU to sleep state after the
* abort of runtime suspend and before the resumption
* of scheduling.
* @l2_force_off_after_mcu_halt: Flag to indicate that L2 cache power down is
* mandatory after performing the MCU halt. Flag is set
* immediately after the MCU halt and cleared
* after the L2 cache power down. MCU can't be
* re-enabled whilst the flag is set.
* @in_reset: True if a GPU is resetting and normal power manager operation is
* suspended
* @partial_shaderoff: True if we want to partially power off shader cores.
* A partial shader core power-off needs special
* handling, such as flushing the L2 cache because of
* GPU2017-861
* @protected_entry_transition_override : True if GPU reset is being used
* before entering the protected mode and so
* the reset handling behaviour is being
* overridden.
* @protected_transition_override : True if a protected mode transition is in
* progress and is overriding power manager
* behaviour.
* @protected_l2_override : Non-zero if the L2 cache is required during a
* protected mode transition. Has no effect if not
* transitioning.
* @hwcnt_desired: True if we want GPU hardware counters to be enabled.
* @hwcnt_disabled: True if GPU hardware counters are not enabled.
* @hwcnt_disable_work: Work item to disable GPU hardware counters, used if
* atomic disable is not possible.
* @gpu_clock_suspend_freq: 'opp-mali-errata-1485982' clock in opp table
* for safe L2 power cycle.
* If no opp-mali-errata-1485982 specified,
* the slowest clock will be taken.
* @gpu_clock_slow_down_wa: If true, slow down GPU clock during L2 power cycle.
* @gpu_clock_slow_down_desired: True if we want a lower GPU clock
* for safe L2 power cycle. False if we want the GPU clock
* to return to its normal frequency. This is updated only
* in the L2 state machine, kbase_pm_l2_update_state.
* @gpu_clock_slowed_down: During L2 power cycle,
* True if gpu clock is set at lower frequency
* for safe L2 power down, False if gpu clock gets
* restored to previous speed. This is updated only in
* work function, kbase_pm_gpu_clock_control_worker.
* @gpu_clock_control_work: work item to set GPU clock during L2 power cycle
* using gpu_clock_control
*
* This structure contains data for the power management framework. There is one
* instance of this structure per device in the system.
*
* Note:
* During an IRQ, @pm_current_policy can be NULL when the policy is being
* changed with kbase_pm_set_policy(). The change is protected under
* kbase_device.pm.power_change_lock. Direct access to this from IRQ context
* must therefore check for NULL. If NULL, then kbase_pm_set_policy() will
* re-issue the policy functions that would have been done under IRQ.
*/
struct kbase_pm_backend_data {
const struct kbase_pm_policy *pm_current_policy;
union kbase_pm_policy_data pm_policy_data;
bool reset_done;
wait_queue_head_t reset_done_wait;
int gpu_cycle_counter_requests;
spinlock_t gpu_cycle_counter_requests_lock;
wait_queue_head_t gpu_in_desired_state_wait;
bool gpu_powered;
bool gpu_ready;
u64 pm_shaders_core_mask;
bool cg1_disabled;
struct kbasep_pm_metrics_state metrics;
struct kbasep_pm_tick_timer_state shader_tick_timer;
bool poweroff_wait_in_progress;
bool invoke_poweroff_wait_wq_when_l2_off;
bool poweron_required;
struct workqueue_struct *gpu_poweroff_wait_wq;
struct work_struct gpu_poweroff_wait_work;
wait_queue_head_t poweroff_wait;
int (*callback_power_on)(struct kbase_device *kbdev);
void (*callback_power_off)(struct kbase_device *kbdev);
void (*callback_power_suspend)(struct kbase_device *kbdev);
void (*callback_power_resume)(struct kbase_device *kbdev);
int (*callback_power_runtime_on)(struct kbase_device *kbdev);
void (*callback_power_runtime_off)(struct kbase_device *kbdev);
int (*callback_power_runtime_idle)(struct kbase_device *kbdev);
int (*callback_soft_reset)(struct kbase_device *kbdev);
void (*callback_power_runtime_gpu_idle)(struct kbase_device *kbdev);
void (*callback_power_runtime_gpu_active)(struct kbase_device *kbdev);
u64 ca_cores_enabled;
#if MALI_USE_CSF
bool apply_hw_issue_TITANHW_2938_wa;
enum kbase_mcu_state mcu_state;
#endif
enum kbase_l2_core_state l2_state;
enum kbase_shader_core_state shaders_state;
u64 shaders_avail;
u64 shaders_desired_mask;
#if MALI_USE_CSF
bool mcu_desired;
bool policy_change_clamp_state_to_off;
unsigned int csf_pm_sched_flags;
struct mutex policy_change_lock;
struct workqueue_struct *core_idle_wq;
struct work_struct core_idle_work;
#ifdef KBASE_PM_RUNTIME
bool gpu_sleep_supported;
bool gpu_sleep_mode_active;
bool exit_gpu_sleep_mode;
bool gpu_idled;
bool gpu_wakeup_override;
bool db_mirror_interrupt_enabled;
enum kbase_pm_runtime_suspend_abort_reason runtime_suspend_abort_reason;
#endif
bool l2_force_off_after_mcu_halt;
#endif
bool l2_desired;
bool l2_always_on;
bool shaders_desired;
bool in_reset;
#if !MALI_USE_CSF
bool partial_shaderoff;
bool protected_entry_transition_override;
bool protected_transition_override;
int protected_l2_override;
#endif
bool hwcnt_desired;
bool hwcnt_disabled;
struct work_struct hwcnt_disable_work;
u64 gpu_clock_suspend_freq;
bool gpu_clock_slow_down_wa;
bool gpu_clock_slow_down_desired;
bool gpu_clock_slowed_down;
struct work_struct gpu_clock_control_work;
};
#if MALI_USE_CSF
/* CSF PM flag, signaling that the MCU shader Core should be kept on */
#define CSF_DYNAMIC_PM_CORE_KEEP_ON (1 << 0)
/* CSF PM flag, signaling no scheduler suspension on idle groups */
#define CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE (1 << 1)
/* CSF PM flag, signaling no scheduler suspension on no runnable groups */
#define CSF_DYNAMIC_PM_SCHED_NO_SUSPEND (1 << 2)
/* The following flags correspond to the existing defined PM policies */
#define ALWAYS_ON_PM_SCHED_FLAGS \
(CSF_DYNAMIC_PM_CORE_KEEP_ON | CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE | \
CSF_DYNAMIC_PM_SCHED_NO_SUSPEND)
#define COARSE_ON_DEMAND_PM_SCHED_FLAGS (0)
#if !MALI_CUSTOMER_RELEASE
#define ALWAYS_ON_DEMAND_PM_SCHED_FLAGS (CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE)
#endif
#endif
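Review note: the CSF PM scheduling flags are single-bit values that a policy ORs together, and consumers test individual bits. The sketch below copies the flag definitions from this header and adds two hypothetical helpers, `sched_may_suspend()` and `cores_kept_on()`, which are not part of the driver and only illustrate how such bits are typically tested.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the header above. */
#define CSF_DYNAMIC_PM_CORE_KEEP_ON (1 << 0)
#define CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE (1 << 1)
#define CSF_DYNAMIC_PM_SCHED_NO_SUSPEND (1 << 2)

#define ALWAYS_ON_PM_SCHED_FLAGS \
	(CSF_DYNAMIC_PM_CORE_KEEP_ON | CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE | \
	 CSF_DYNAMIC_PM_SCHED_NO_SUSPEND)
#define COARSE_ON_DEMAND_PM_SCHED_FLAGS (0)

/* Hypothetical helper: may the scheduler suspend with no runnable groups? */
static bool sched_may_suspend(unsigned int pm_sched_flags)
{
	return !(pm_sched_flags & CSF_DYNAMIC_PM_SCHED_NO_SUSPEND);
}

/* Hypothetical helper: does the policy ask for the cores to be kept on? */
static bool cores_kept_on(unsigned int pm_sched_flags)
{
	return (pm_sched_flags & CSF_DYNAMIC_PM_CORE_KEEP_ON) != 0;
}
```

Under these definitions the always_on flag set has all three bits set, while coarse_demand sets none, so the scheduler is free to suspend.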
/* List of policy IDs */
enum kbase_pm_policy_id {
KBASE_PM_POLICY_ID_COARSE_DEMAND,
#if !MALI_CUSTOMER_RELEASE
KBASE_PM_POLICY_ID_ALWAYS_ON_DEMAND,
#endif
KBASE_PM_POLICY_ID_ALWAYS_ON
};
/**
* enum kbase_pm_policy_event - PM Policy event ID
*/
enum kbase_pm_policy_event {
/**
* @KBASE_PM_POLICY_EVENT_IDLE: Indicates that the GPU power state
* model has determined that the GPU has gone idle.
*/
KBASE_PM_POLICY_EVENT_IDLE,
/**
* @KBASE_PM_POLICY_EVENT_POWER_ON: Indicates that the GPU state model
* is preparing to power on the GPU.
*/
KBASE_PM_POLICY_EVENT_POWER_ON,
/**
* @KBASE_PM_POLICY_EVENT_TIMER_HIT: Indicates that the GPU became
* active while the Shader Tick Timer was holding the GPU in a powered
* on state.
*/
KBASE_PM_POLICY_EVENT_TIMER_HIT,
/**
* @KBASE_PM_POLICY_EVENT_TIMER_MISS: Indicates that the GPU did not
* become active before the Shader Tick Timer timeout occurred.
*/
KBASE_PM_POLICY_EVENT_TIMER_MISS
};
/**
* struct kbase_pm_policy - Power policy structure.
*
* @name: The name of this policy
* @init: Function called when the policy is selected
* @term: Function called when the policy is unselected
* @shaders_needed: Function called to find out if shader cores are needed
* @get_core_active: Function called to get the current overall GPU power
* state
* @handle_event: Function called when a PM policy event occurs. Should be
* set to NULL if the power policy doesn't require any
* event notifications.
* @id: Field indicating an ID for this policy. This is not
* necessarily the same as its index in the list returned
* by kbase_pm_list_policies().
* It is used purely for debugging.
* @pm_sched_flags: CSF PM scheduling operational flags associated with the
* policy. Pre-defined required flags exist for each of the
* ARM released policies, such as 'always_on' and
* 'coarse_demand'.
* Each power policy exposes a (static) instance of this structure which
* contains function pointers to the policy's methods.
*/
struct kbase_pm_policy {
char *name;
/*
* Function called when the policy is selected
*
* This should initialize the kbdev->pm.pm_policy_data structure. It
* should not attempt to make any changes to hardware state.
*
* It is undefined what state the cores are in when the function is
* called.
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*/
void (*init)(struct kbase_device *kbdev);
/*
* Function called when the policy is unselected.
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*/
void (*term)(struct kbase_device *kbdev);
/*
* Function called to find out if shader cores are needed
*
* This needs to at least satisfy kbdev->pm.backend.shaders_desired,
* and so must never return false when shaders_desired is true.
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*
* Return: true if shader cores are needed, false otherwise
*/
bool (*shaders_needed)(struct kbase_device *kbdev);
/*
* Function called to get the current overall GPU power state
*
* This function must meet or exceed the requirements for power
* indicated by kbase_pm_is_active().
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*
* Return: true if the GPU should be powered, false otherwise
*/
bool (*get_core_active)(struct kbase_device *kbdev);
/*
* Function called when a power event occurs
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
* @event: The id of the power event that has occurred
*/
void (*handle_event)(struct kbase_device *kbdev, enum kbase_pm_policy_event event);
enum kbase_pm_policy_id id;
#if MALI_USE_CSF
/* CSF PM scheduling operational flags associated with the policy.
* Pre-defined required flags exist for each of the
* ARM released policies, such as 'always_on' and
* 'coarse_demand'.
*/
unsigned int pm_sched_flags;
#endif
};
#endif /* _KBASE_PM_HWACCESS_DEFS_H_ */

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff


@@ -0,0 +1,50 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager level 2 cache state definitions.
* The function-like macro KBASEP_L2_STATE() must be defined before including
* this header file. This header file can be included multiple times in the
* same compilation unit with different definitions of KBASEP_L2_STATE().
*
* @OFF: The L2 cache and tiler are off
* @PEND_ON: The L2 cache and tiler are powering on
* @RESTORE_CLOCKS: The GPU clock is restored. Conditionally used.
* @ON_HWCNT_ENABLE: The L2 cache and tiler are on, and hwcnt is being enabled
* @ON: The L2 cache and tiler are on, and hwcnt is enabled
* @ON_HWCNT_DISABLE: The L2 cache and tiler are on, and hwcnt is being disabled
* @SLOW_DOWN_CLOCKS: The GPU clock is set to appropriate or lowest clock.
* Conditionally used.
* @POWER_DOWN: The L2 cache and tiler are about to be powered off
* @PEND_OFF: The L2 cache and tiler are powering off
* @RESET_WAIT: The GPU is resetting, L2 cache and tiler power state are
* unknown
*/
KBASEP_L2_STATE(OFF)
KBASEP_L2_STATE(PEND_ON)
KBASEP_L2_STATE(RESTORE_CLOCKS)
KBASEP_L2_STATE(ON_HWCNT_ENABLE)
KBASEP_L2_STATE(ON)
KBASEP_L2_STATE(ON_HWCNT_DISABLE)
KBASEP_L2_STATE(SLOW_DOWN_CLOCKS)
KBASEP_L2_STATE(POWER_DOWN)
KBASEP_L2_STATE(PEND_OFF)
KBASEP_L2_STATE(RESET_WAIT)
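Review note: this header is an X-macro list. Each includer defines KBASEP_L2_STATE() before including it, so the same list can expand to enum members in one place and to something else, such as name strings, in another. The sketch below shows the technique in a single file. Because a self-contained example cannot re-include a separate header, it uses a list macro instead, which is a variation on the driver's layout; `L2_STATE_LIST`, `enum l2_state` and `l2_state_name()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* State list copied from the header; in the driver it lives in its own file. */
#define L2_STATE_LIST(X) \
	X(OFF) \
	X(PEND_ON) \
	X(RESTORE_CLOCKS) \
	X(ON_HWCNT_ENABLE) \
	X(ON) \
	X(ON_HWCNT_DISABLE) \
	X(SLOW_DOWN_CLOCKS) \
	X(POWER_DOWN) \
	X(PEND_OFF) \
	X(RESET_WAIT)

/* First expansion: one enum member per state. */
#define AS_ENUM(n) L2_##n,
enum l2_state {
	L2_STATE_LIST(AS_ENUM)
	L2_STATE_COUNT
};
#undef AS_ENUM

/* Second expansion: one name string per state, in the same order. */
#define AS_STRING(n) #n,
static const char *const l2_state_names[] = {
	L2_STATE_LIST(AS_STRING)
};
#undef AS_STRING

/* Map a state to its name; the two expansions cannot drift apart. */
static const char *l2_state_name(enum l2_state s)
{
	assert((size_t)s < L2_STATE_COUNT);
	return l2_state_names[s];
}
```

Adding a state to the list updates both the enum and the name table, which is the main reason to keep the list in one place.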


@@ -0,0 +1,108 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2020-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager MCU state definitions.
* The function-like macro KBASEP_MCU_STATE() must be defined before including
* this header file. This header file can be included multiple times in the
* same compilation unit with different definitions of KBASEP_MCU_STATE().
*
* @OFF: The MCU is powered off.
* @PEND_ON_RELOAD: The warm boot of MCU or cold boot of MCU (with
* firmware reloading) is in progress.
* @ON_GLB_REINIT_PEND: The MCU is enabled and Global configuration
* requests have been sent to the firmware.
* @ON_HWCNT_ENABLE: The Global requests have completed and MCU is now
* ready for use and hwcnt is being enabled.
* @ON: The MCU is active and hwcnt has been enabled.
* @ON_CORE_ATTR_UPDATE_PEND: The MCU is active and mask of enabled shader cores
* is being updated.
* @ON_HWCNT_DISABLE: The MCU is on and hwcnt is being disabled.
* @ON_HALT: The MCU is on and hwcnt has been disabled, MCU
* halt would be triggered.
* @ON_PEND_HALT: MCU halt in progress, confirmation pending.
* @POWER_DOWN: MCU halted operations, pending being disabled.
* @PEND_OFF: MCU is being disabled, pending on powering off.
* @RESET_WAIT: The GPU is resetting, MCU state is unknown.
* @HCTL_SHADERS_PEND_ON: Global configuration requests sent to the firmware
* have completed and shaders have been requested to
* power on.
* @HCTL_CORES_NOTIFY_PEND: Shader cores have powered up and firmware is being
* notified of the mask of enabled shader cores.
* @HCTL_MCU_ON_RECHECK: MCU is on and hwcnt disabling is triggered
* and checks are done to update the number of
* enabled cores.
* @HCTL_SHADERS_READY_OFF: MCU has halted and cores need to be powered down
* @HCTL_SHADERS_PEND_OFF: Cores are transitioning to power down.
* @HCTL_CORES_DOWN_SCALE_NOTIFY_PEND: Firmware has been informed to stop using
* specific cores, due to core_mask change request.
* After the ACK from FW, the wait will be done for
* undesired cores to become inactive.
* @HCTL_CORE_INACTIVE_PEND: Waiting for specific cores to become inactive.
* Once the cores become inactive their power down
* will be initiated.
* @HCTL_SHADERS_CORE_OFF_PEND: Waiting for specific cores to complete the
* transition to power down. Once powered down,
* HW counters will be re-enabled.
* @ON_SLEEP_INITIATE: MCU is on and hwcnt has been disabled and MCU
* is being put to sleep.
* @ON_PEND_SLEEP: MCU sleep is in progress.
* @IN_SLEEP: Sleep request is completed and MCU has halted.
* @ON_PMODE_ENTER_CORESIGHT_DISABLE: The MCU is on, protected mode enter is about to
* be requested, Coresight is being disabled.
* @ON_PMODE_EXIT_CORESIGHT_ENABLE: The MCU is on, protected mode exit has happened,
* Coresight is being enabled.
* @CORESIGHT_DISABLE: The MCU is on and Coresight is being disabled.
* @CORESIGHT_ENABLE: The MCU is on, host does not have control and
* Coresight is being enabled.
*/
KBASEP_MCU_STATE(OFF)
KBASEP_MCU_STATE(PEND_ON_RELOAD)
KBASEP_MCU_STATE(ON_GLB_REINIT_PEND)
KBASEP_MCU_STATE(ON_HWCNT_ENABLE)
KBASEP_MCU_STATE(ON)
KBASEP_MCU_STATE(ON_CORE_ATTR_UPDATE_PEND)
KBASEP_MCU_STATE(ON_HWCNT_DISABLE)
KBASEP_MCU_STATE(ON_HALT)
KBASEP_MCU_STATE(ON_PEND_HALT)
KBASEP_MCU_STATE(POWER_DOWN)
KBASEP_MCU_STATE(PEND_OFF)
KBASEP_MCU_STATE(RESET_WAIT)
/* Additional MCU states with HOST_CONTROL_SHADERS */
KBASEP_MCU_STATE(HCTL_SHADERS_PEND_ON)
KBASEP_MCU_STATE(HCTL_CORES_NOTIFY_PEND)
KBASEP_MCU_STATE(HCTL_MCU_ON_RECHECK)
KBASEP_MCU_STATE(HCTL_SHADERS_READY_OFF)
KBASEP_MCU_STATE(HCTL_SHADERS_PEND_OFF)
KBASEP_MCU_STATE(HCTL_CORES_DOWN_SCALE_NOTIFY_PEND)
KBASEP_MCU_STATE(HCTL_CORE_INACTIVE_PEND)
KBASEP_MCU_STATE(HCTL_SHADERS_CORE_OFF_PEND)
/* Additional MCU states to support GPU sleep feature */
KBASEP_MCU_STATE(ON_SLEEP_INITIATE)
KBASEP_MCU_STATE(ON_PEND_SLEEP)
KBASEP_MCU_STATE(IN_SLEEP)
#if IS_ENABLED(CONFIG_MALI_CORESIGHT)
/* Additional MCU states for Coresight */
KBASEP_MCU_STATE(ON_PMODE_ENTER_CORESIGHT_DISABLE)
KBASEP_MCU_STATE(ON_PMODE_EXIT_CORESIGHT_ENABLE)
KBASEP_MCU_STATE(CORESIGHT_DISABLE)
KBASEP_MCU_STATE(CORESIGHT_ENABLE)
#endif /* IS_ENABLED(CONFIG_MALI_CORESIGHT) */
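
The MCU list above ends with states that exist only when Coresight support is configured. A rough sketch of how such an optional tail can flow through an X-macro expansion is shown below. It assumes a hypothetical DEMO_HAS_CORESIGHT switch in place of IS_ENABLED(CONFIG_MALI_CORESIGHT), an abbreviated state list, and hypothetical demo_/DEMO_ names throughout.

```c
/* Hypothetical switch standing in for IS_ENABLED(CONFIG_MALI_CORESIGHT). */
#define DEMO_HAS_CORESIGHT 1

#if DEMO_HAS_CORESIGHT
#define DEMO_MCU_CORESIGHT_STATES \
	KBASEP_MCU_STATE(CORESIGHT_DISABLE) \
	KBASEP_MCU_STATE(CORESIGHT_ENABLE)
#else
#define DEMO_MCU_CORESIGHT_STATES
#endif

/* Abbreviated stand-in for the header: a few core states plus the
 * optional Coresight tail.
 */
#define DEMO_MCU_STATE_LIST \
	KBASEP_MCU_STATE(OFF) \
	KBASEP_MCU_STATE(PEND_ON_RELOAD) \
	KBASEP_MCU_STATE(ON) \
	DEMO_MCU_CORESIGHT_STATES

/* Expand the list into enumerators. */
#define KBASEP_MCU_STATE(n) DEMO_MCU_##n,
enum demo_mcu_state { DEMO_MCU_STATE_LIST DEMO_MCU_STATE_COUNT };
#undef KBASEP_MCU_STATE

/* Expand the same list into case labels, so optional states are handled
 * only when they are compiled in.
 */
static const char *demo_mcu_state_to_string(enum demo_mcu_state s)
{
	switch (s) {
#define KBASEP_MCU_STATE(n) case DEMO_MCU_##n: return #n;
	DEMO_MCU_STATE_LIST
#undef KBASEP_MCU_STATE
	default:
		return "UNKNOWN";
	}
}
```

With DEMO_HAS_CORESIGHT set to 0 the two Coresight enumerators and their case labels disappear together, which is the property the conditional block in the header relies on.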

View File

@@ -0,0 +1,497 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2011-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Metrics for power management
*/
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include <mali_kbase_pm.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#if MALI_USE_CSF
#include "backend/gpu/mali_kbase_clk_rate_trace_mgr.h"
#include <csf/ipa_control/mali_kbase_csf_ipa_control.h>
#else
#include <backend/gpu/mali_kbase_jm_rb.h>
#endif /* !MALI_USE_CSF */
#include <backend/gpu/mali_kbase_pm_defs.h>
#include <mali_linux_trace.h>
#if defined(CONFIG_MALI_DEVFREQ) || defined(CONFIG_MALI_MIDGARD_DVFS) || !MALI_USE_CSF
/* Shift used for kbasep_pm_metrics_data.time_busy/idle - units of (1 << 8) ns
* This gives a maximum period between samples of 2^(32+8)/100 ns = slightly
* under 11s. Exceeding this will cause overflow.
*/
#define KBASE_PM_TIME_SHIFT 8
#endif
#if MALI_USE_CSF
/* Scaling factor to get the GPU_ACTIVE value in nanoseconds */
#define GPU_ACTIVE_SCALING_FACTOR ((u64)1E9)
#endif
/*
* Possible state transitions
* ON -> ON | OFF | STOPPED
* STOPPED -> ON | OFF
* OFF -> ON
*
*
* +-e-++------------f-------------+
* |   v|                          v
* +---ON --a--> STOPPED --b--> OFF
*     ^^            |             |
*     |+------c-----+             |
*     |                           |
*     +-------------d-------------+
*
* Transition effects:
* a. None
* b. Timer expires without restart
* c. Timer is not stopped, timer period is unaffected
* d. Timer must be restarted
* e. Callback is executed and the timer is restarted
* f. Timer is cancelled, or the callback is waited on if currently executing. This is called during
* tear-down and should not be subject to a race from an OFF->ON transition
*/
enum dvfs_metric_timer_state { TIMER_OFF, TIMER_STOPPED, TIMER_ON };
#ifdef CONFIG_MALI_MIDGARD_DVFS
static enum hrtimer_restart dvfs_callback(struct hrtimer *timer)
{
struct kbasep_pm_metrics_state *metrics;
if (WARN_ON(!timer))
return HRTIMER_NORESTART;
metrics = container_of(timer, struct kbasep_pm_metrics_state, timer);
/* Transition (b) to fully off if the timer was stopped; don't restart the timer in this case */
if (atomic_cmpxchg(&metrics->timer_state, TIMER_STOPPED, TIMER_OFF) != TIMER_ON)
return HRTIMER_NORESTART;
kbase_pm_get_dvfs_action(metrics->kbdev);
/* Set the new expiration time and restart (transition e) */
hrtimer_forward_now(timer, HR_TIMER_DELAY_MSEC(metrics->kbdev->pm.dvfs_period));
return HRTIMER_RESTART;
}
#endif /* CONFIG_MALI_MIDGARD_DVFS */
int kbasep_pm_metrics_init(struct kbase_device *kbdev)
{
#if MALI_USE_CSF
struct kbase_ipa_control_perf_counter perf_counter;
int err;
/* One counter group */
const size_t NUM_PERF_COUNTERS = 1;
KBASE_DEBUG_ASSERT(kbdev != NULL);
kbdev->pm.backend.metrics.kbdev = kbdev;
kbdev->pm.backend.metrics.time_period_start = ktime_get_raw();
perf_counter.scaling_factor = GPU_ACTIVE_SCALING_FACTOR;
/* Normalize values by GPU frequency */
perf_counter.gpu_norm = true;
/* We need the GPU_ACTIVE counter, which is in the CSHW group */
perf_counter.type = KBASE_IPA_CORE_TYPE_CSHW;
/* We need the GPU_ACTIVE counter */
perf_counter.idx = GPU_ACTIVE_CNT_IDX;
err = kbase_ipa_control_register(kbdev, &perf_counter, NUM_PERF_COUNTERS,
&kbdev->pm.backend.metrics.ipa_control_client);
if (err) {
dev_err(kbdev->dev, "Failed to register IPA with kbase_ipa_control: err=%d", err);
return -1;
}
#else
KBASE_DEBUG_ASSERT(kbdev != NULL);
kbdev->pm.backend.metrics.kbdev = kbdev;
kbdev->pm.backend.metrics.time_period_start = ktime_get_raw();
#endif
spin_lock_init(&kbdev->pm.backend.metrics.lock);
#ifdef CONFIG_MALI_MIDGARD_DVFS
hrtimer_init(&kbdev->pm.backend.metrics.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
kbdev->pm.backend.metrics.timer.function = dvfs_callback;
kbdev->pm.backend.metrics.initialized = true;
atomic_set(&kbdev->pm.backend.metrics.timer_state, TIMER_OFF);
kbase_pm_metrics_start(kbdev);
#endif /* CONFIG_MALI_MIDGARD_DVFS */
#if MALI_USE_CSF
/* The sanity check on the GPU_ACTIVE performance counter
* is skipped for Juno platforms that have timing problems.
*/
kbdev->pm.backend.metrics.skip_gpu_active_sanity_check =
of_machine_is_compatible("arm,juno");
#endif
return 0;
}
KBASE_EXPORT_TEST_API(kbasep_pm_metrics_init);
void kbasep_pm_metrics_term(struct kbase_device *kbdev)
{
#ifdef CONFIG_MALI_MIDGARD_DVFS
KBASE_DEBUG_ASSERT(kbdev != NULL);
/* Cancel the timer, and block if the callback is currently executing (transition f) */
kbdev->pm.backend.metrics.initialized = false;
atomic_set(&kbdev->pm.backend.metrics.timer_state, TIMER_OFF);
hrtimer_cancel(&kbdev->pm.backend.metrics.timer);
#endif /* CONFIG_MALI_MIDGARD_DVFS */
#if MALI_USE_CSF
kbase_ipa_control_unregister(kbdev, kbdev->pm.backend.metrics.ipa_control_client);
#else
CSTD_UNUSED(kbdev);
#endif
}
KBASE_EXPORT_TEST_API(kbasep_pm_metrics_term);
/* caller needs to hold kbdev->pm.backend.metrics.lock before calling this
* function
*/
#if MALI_USE_CSF
#if defined(CONFIG_MALI_DEVFREQ) || defined(CONFIG_MALI_MIDGARD_DVFS)
static void kbase_pm_get_dvfs_utilisation_calc(struct kbase_device *kbdev)
{
int err;
u64 gpu_active_counter;
u64 protected_time;
ktime_t now;
lockdep_assert_held(&kbdev->pm.backend.metrics.lock);
/* Query IPA_CONTROL for the latest GPU-active and protected-time
* info.
*/
err = kbase_ipa_control_query(kbdev, kbdev->pm.backend.metrics.ipa_control_client,
&gpu_active_counter, 1, &protected_time);
/* Read the timestamp after reading the GPU_ACTIVE counter value.
* This ensures the time gap between the 2 reads is consistent for
* a meaningful comparison between the increment of GPU_ACTIVE and
* elapsed time. The lock taken inside kbase_ipa_control_query()
* function can cause a lot of variation.
*/
now = ktime_get_raw();
if (err) {
dev_err(kbdev->dev, "Failed to query the increment of GPU_ACTIVE counter: err=%d",
err);
} else {
u64 diff_ns;
s64 diff_ns_signed;
u32 ns_time;
ktime_t diff = ktime_sub(now, kbdev->pm.backend.metrics.time_period_start);
diff_ns_signed = ktime_to_ns(diff);
if (diff_ns_signed < 0)
return;
diff_ns = (u64)diff_ns_signed;
#if !IS_ENABLED(CONFIG_MALI_NO_MALI)
/* The GPU_ACTIVE counter shouldn't clock-up more time than has
* actually elapsed - but still some margin needs to be given
* when doing the comparison. There could be some drift between
* the CPU and GPU clock.
*
* Can do the check only in a real driver build, as an arbitrary
* value for GPU_ACTIVE can be fed into dummy model in no_mali
* configuration which may not correspond to the real elapsed
* time.
*/
if (!kbdev->pm.backend.metrics.skip_gpu_active_sanity_check) {
/* The margin is scaled to allow for the worst-case
* scenario where the samples are maximally separated,
* plus a small offset for sampling errors.
*/
u64 const MARGIN_NS =
IPA_CONTROL_TIMER_DEFAULT_VALUE_MS * NSEC_PER_MSEC * 3 / 2;
if (gpu_active_counter > (diff_ns + MARGIN_NS)) {
dev_info(
kbdev->dev,
"GPU activity takes longer than time interval: %llu ns > %llu ns",
(unsigned long long)gpu_active_counter,
(unsigned long long)diff_ns);
}
}
#endif
/* Calculate time difference in units of 256ns */
ns_time = (u32)(diff_ns >> KBASE_PM_TIME_SHIFT);
/* Add protected_time to gpu_active_counter so that time in
* protected mode is included in the apparent GPU active time,
* then convert it from units of 1ns to units of 256ns, to
* match what JM GPUs use. The assumption is made here that the
* GPU is 100% busy while in protected mode, so we should add
* this since the GPU can't (and thus won't) update these
* counters while it's actually in protected mode.
*
* Perform the add after dividing each value down, to reduce
* the chances of overflows.
*/
protected_time >>= KBASE_PM_TIME_SHIFT;
gpu_active_counter >>= KBASE_PM_TIME_SHIFT;
gpu_active_counter += protected_time;
/* Ensure the following equations don't go wrong if gpu_active_counter
* is slightly larger than ns_time somehow
*/
gpu_active_counter = MIN(gpu_active_counter, ns_time);
kbdev->pm.backend.metrics.values.time_busy += gpu_active_counter;
kbdev->pm.backend.metrics.values.time_idle += ns_time - gpu_active_counter;
/* Also make time in protected mode available explicitly,
* so users of this data have this info, too.
*/
kbdev->pm.backend.metrics.values.time_in_protm += protected_time;
}
kbdev->pm.backend.metrics.time_period_start = now;
}
#endif /* defined(CONFIG_MALI_DEVFREQ) || defined(CONFIG_MALI_MIDGARD_DVFS) */
#else
static void kbase_pm_get_dvfs_utilisation_calc(struct kbase_device *kbdev, ktime_t now)
{
ktime_t diff;
lockdep_assert_held(&kbdev->pm.backend.metrics.lock);
diff = ktime_sub(now, kbdev->pm.backend.metrics.time_period_start);
if (ktime_to_ns(diff) < 0)
return;
if (kbdev->pm.backend.metrics.gpu_active) {
u32 ns_time = (u32)(ktime_to_ns(diff) >> KBASE_PM_TIME_SHIFT);
kbdev->pm.backend.metrics.values.time_busy += ns_time;
if (kbdev->pm.backend.metrics.active_cl_ctx[0])
kbdev->pm.backend.metrics.values.busy_cl[0] += ns_time;
if (kbdev->pm.backend.metrics.active_cl_ctx[1])
kbdev->pm.backend.metrics.values.busy_cl[1] += ns_time;
if (kbdev->pm.backend.metrics.active_gl_ctx[0])
kbdev->pm.backend.metrics.values.busy_gl += ns_time;
if (kbdev->pm.backend.metrics.active_gl_ctx[1])
kbdev->pm.backend.metrics.values.busy_gl += ns_time;
if (kbdev->pm.backend.metrics.active_gl_ctx[2])
kbdev->pm.backend.metrics.values.busy_gl += ns_time;
} else {
kbdev->pm.backend.metrics.values.time_idle +=
(u32)(ktime_to_ns(diff) >> KBASE_PM_TIME_SHIFT);
}
kbdev->pm.backend.metrics.time_period_start = now;
}
#endif /* MALI_USE_CSF */
#if defined(CONFIG_MALI_DEVFREQ) || defined(CONFIG_MALI_MIDGARD_DVFS)
void kbase_pm_get_dvfs_metrics(struct kbase_device *kbdev, struct kbasep_pm_metrics *last,
struct kbasep_pm_metrics *diff)
{
struct kbasep_pm_metrics *cur = &kbdev->pm.backend.metrics.values;
unsigned long flags;
spin_lock_irqsave(&kbdev->pm.backend.metrics.lock, flags);
#if MALI_USE_CSF
kbase_pm_get_dvfs_utilisation_calc(kbdev);
#else
kbase_pm_get_dvfs_utilisation_calc(kbdev, ktime_get_raw());
#endif
memset(diff, 0, sizeof(*diff));
diff->time_busy = cur->time_busy - last->time_busy;
diff->time_idle = cur->time_idle - last->time_idle;
#if MALI_USE_CSF
diff->time_in_protm = cur->time_in_protm - last->time_in_protm;
#else
diff->busy_cl[0] = cur->busy_cl[0] - last->busy_cl[0];
diff->busy_cl[1] = cur->busy_cl[1] - last->busy_cl[1];
diff->busy_gl = cur->busy_gl - last->busy_gl;
#endif
*last = *cur;
spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);
}
KBASE_EXPORT_TEST_API(kbase_pm_get_dvfs_metrics);
#endif
#ifdef CONFIG_MALI_MIDGARD_DVFS
void kbase_pm_get_dvfs_action(struct kbase_device *kbdev)
{
int utilisation;
struct kbasep_pm_metrics *diff;
#if !MALI_USE_CSF
int busy;
int util_gl_share;
int util_cl_share[2];
#endif
KBASE_DEBUG_ASSERT(kbdev != NULL);
diff = &kbdev->pm.backend.metrics.dvfs_diff;
kbase_pm_get_dvfs_metrics(kbdev, &kbdev->pm.backend.metrics.dvfs_last, diff);
utilisation = (100 * diff->time_busy) / max(diff->time_busy + diff->time_idle, 1u);
#if !MALI_USE_CSF
busy = max(diff->busy_gl + diff->busy_cl[0] + diff->busy_cl[1], 1u);
util_gl_share = (100 * diff->busy_gl) / busy;
util_cl_share[0] = (100 * diff->busy_cl[0]) / busy;
util_cl_share[1] = (100 * diff->busy_cl[1]) / busy;
kbase_platform_dvfs_event(kbdev, utilisation, util_gl_share, util_cl_share);
#else
/* Note that, at present, we don't pass protected-mode time to the
* platform here. It's unlikely to be useful, however, as the platform
* probably just cares whether the GPU is busy or not; time in
* protected mode is already added to busy-time at this point, though,
* so we should be good.
*/
kbase_platform_dvfs_event(kbdev, utilisation);
#endif
}
bool kbase_pm_metrics_is_active(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev != NULL);
return atomic_read(&kbdev->pm.backend.metrics.timer_state) == TIMER_ON;
}
KBASE_EXPORT_TEST_API(kbase_pm_metrics_is_active);
void kbase_pm_metrics_start(struct kbase_device *kbdev)
{
struct kbasep_pm_metrics_state *metrics = &kbdev->pm.backend.metrics;
if (unlikely(!metrics->initialized))
return;
/* Transition to ON, from a stopped state (transition c) */
if (atomic_xchg(&metrics->timer_state, TIMER_ON) == TIMER_OFF)
/* Start the timer only if it's been fully stopped (transition d) */
hrtimer_start(&metrics->timer, HR_TIMER_DELAY_MSEC(kbdev->pm.dvfs_period),
HRTIMER_MODE_REL);
}
void kbase_pm_metrics_stop(struct kbase_device *kbdev)
{
if (unlikely(!kbdev->pm.backend.metrics.initialized))
return;
/* Timer is stopped if it's currently on (transition a) */
atomic_cmpxchg(&kbdev->pm.backend.metrics.timer_state, TIMER_ON, TIMER_STOPPED);
}
#endif /* CONFIG_MALI_MIDGARD_DVFS */
#if !MALI_USE_CSF
/**
* kbase_pm_metrics_active_calc - Update PM active counts based on currently
* running atoms
* @kbdev: Device pointer
*
* The caller must hold kbdev->pm.backend.metrics.lock
*/
static void kbase_pm_metrics_active_calc(struct kbase_device *kbdev)
{
unsigned int js;
lockdep_assert_held(&kbdev->pm.backend.metrics.lock);
kbdev->pm.backend.metrics.active_gl_ctx[0] = 0;
kbdev->pm.backend.metrics.active_gl_ctx[1] = 0;
kbdev->pm.backend.metrics.active_gl_ctx[2] = 0;
kbdev->pm.backend.metrics.active_cl_ctx[0] = 0;
kbdev->pm.backend.metrics.active_cl_ctx[1] = 0;
kbdev->pm.backend.metrics.gpu_active = false;
for (js = 0; js < BASE_JM_MAX_NR_SLOTS; js++) {
struct kbase_jd_atom *katom = kbase_gpu_inspect(kbdev, js, 0);
/* Head atom may have just completed, so if it isn't running
* then try the next atom
*/
if (katom && katom->gpu_rb_state != KBASE_ATOM_GPU_RB_SUBMITTED)
katom = kbase_gpu_inspect(kbdev, js, 1);
if (katom && katom->gpu_rb_state == KBASE_ATOM_GPU_RB_SUBMITTED) {
if (katom->core_req & BASE_JD_REQ_ONLY_COMPUTE) {
u32 device_nr =
(katom->core_req & BASE_JD_REQ_SPECIFIC_COHERENT_GROUP) ?
katom->device_nr :
0;
if (!WARN_ON(device_nr >= 2))
kbdev->pm.backend.metrics.active_cl_ctx[device_nr] = 1;
} else {
kbdev->pm.backend.metrics.active_gl_ctx[js] = 1;
trace_sysgraph(SGR_ACTIVE, 0, js);
}
kbdev->pm.backend.metrics.gpu_active = true;
} else {
trace_sysgraph(SGR_INACTIVE, 0, js);
}
}
}
/* called when job is submitted to or removed from a GPU slot */
void kbase_pm_metrics_update(struct kbase_device *kbdev, ktime_t *timestamp)
{
unsigned long flags;
ktime_t now;
lockdep_assert_held(&kbdev->hwaccess_lock);
spin_lock_irqsave(&kbdev->pm.backend.metrics.lock, flags);
if (!timestamp) {
now = ktime_get_raw();
timestamp = &now;
}
/* Track how much time has been spent busy or idle. For JM GPUs,
* this also evaluates how long CL and/or GL jobs have been busy for.
*/
kbase_pm_get_dvfs_utilisation_calc(kbdev, *timestamp);
kbase_pm_metrics_active_calc(kbdev);
spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);
}
#endif /* !MALI_USE_CSF */
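
A user-space sketch of the DVFS timer state machine from the metrics file above, under stated assumptions: C11 stdatomic.h stands in for the kernel's atomic_t helpers, a plain timer_armed flag stands in for the hrtimer being queued, and every demo_ name is hypothetical. It is meant to illustrate transitions a to e from the comment, not to reproduce the driver.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum demo_timer_state { DEMO_TIMER_OFF, DEMO_TIMER_STOPPED, DEMO_TIMER_ON };

struct demo_metrics {
	atomic_int timer_state;
	int timer_armed; /* stand-in for "the hrtimer is queued" */
};

/* Mimics the kernel's atomic_cmpxchg(): returns the value seen before the
 * operation, whether or not the swap happened.
 */
static int demo_cmpxchg(atomic_int *v, int old_val, int new_val)
{
	int expected = old_val;

	atomic_compare_exchange_strong(v, &expected, new_val);
	return expected;
}

static void demo_init(struct demo_metrics *m)
{
	atomic_init(&m->timer_state, DEMO_TIMER_OFF);
	m->timer_armed = 0;
}

/* Transitions c (STOPPED -> ON) and d (OFF -> ON). Only d arms the timer,
 * because in the STOPPED state the timer is still pending.
 */
static void demo_start(struct demo_metrics *m)
{
	if (atomic_exchange(&m->timer_state, DEMO_TIMER_ON) == DEMO_TIMER_OFF)
		m->timer_armed = 1;
}

/* Transition a (ON -> STOPPED): the timer is left to expire on its own. */
static void demo_stop(struct demo_metrics *m)
{
	demo_cmpxchg(&m->timer_state, DEMO_TIMER_ON, DEMO_TIMER_STOPPED);
}

/* Timer expiry. Returns true when the timer should be re-armed
 * (transition e) and false when it lapses (transition b).
 */
static bool demo_callback(struct demo_metrics *m)
{
	if (demo_cmpxchg(&m->timer_state, DEMO_TIMER_STOPPED, DEMO_TIMER_OFF) != DEMO_TIMER_ON) {
		m->timer_armed = 0;
		return false;
	}
	return true;
}
```

The point of the intermediate STOPPED state in this sketch is that a stop followed quickly by a start never has to cancel or re-queue the timer; only an expiry observed while STOPPED lets it lapse.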

View File

@@ -0,0 +1,440 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2010-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Power policy API implementations
*/
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase_pm.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <mali_kbase_reset_gpu.h>
#if MALI_USE_CSF && defined CONFIG_MALI_DEBUG
#include <csf/mali_kbase_csf_firmware.h>
#endif
#include <linux/of.h>
static const struct kbase_pm_policy *const all_policy_list[] = {
#if IS_ENABLED(CONFIG_MALI_NO_MALI)
&kbase_pm_always_on_policy_ops,
&kbase_pm_coarse_demand_policy_ops,
#else /* CONFIG_MALI_NO_MALI */
&kbase_pm_coarse_demand_policy_ops,
&kbase_pm_always_on_policy_ops,
#endif /* CONFIG_MALI_NO_MALI */
};
void kbase_pm_policy_init(struct kbase_device *kbdev)
{
const struct kbase_pm_policy *default_policy = all_policy_list[0];
struct device_node *np = kbdev->dev->of_node;
const char *power_policy_name;
unsigned long flags;
unsigned int i;
/* Read "power-policy" property and fall back to "power_policy" if not found */
if ((of_property_read_string(np, "power-policy", &power_policy_name) == 0) ||
(of_property_read_string(np, "power_policy", &power_policy_name) == 0)) {
for (i = 0; i < ARRAY_SIZE(all_policy_list); i++)
if (sysfs_streq(all_policy_list[i]->name, power_policy_name)) {
default_policy = all_policy_list[i];
break;
}
}
#if MALI_USE_CSF && defined(CONFIG_MALI_DEBUG)
/* Use always_on policy if module param fw_debug=1 is
* passed, to aid firmware debugging.
*/
if (fw_debug)
default_policy = &kbase_pm_always_on_policy_ops;
#endif
default_policy->init(kbdev);
#if MALI_USE_CSF
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbdev->pm.backend.pm_current_policy = default_policy;
kbdev->pm.backend.csf_pm_sched_flags = default_policy->pm_sched_flags;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
#else
CSTD_UNUSED(flags);
kbdev->pm.backend.pm_current_policy = default_policy;
#endif
}
void kbase_pm_policy_term(struct kbase_device *kbdev)
{
kbdev->pm.backend.pm_current_policy->term(kbdev);
}
void kbase_pm_update_active(struct kbase_device *kbdev)
{
struct kbase_pm_device_data *pm = &kbdev->pm;
struct kbase_pm_backend_data *backend = &pm->backend;
unsigned long flags;
bool active;
lockdep_assert_held(&pm->lock);
/* pm_current_policy will never be NULL while pm.lock is held */
KBASE_DEBUG_ASSERT(backend->pm_current_policy);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
active = backend->pm_current_policy->get_core_active(kbdev);
WARN((kbase_pm_is_active(kbdev) && !active),
"GPU is active but policy '%s' is indicating that it can be powered off",
kbdev->pm.backend.pm_current_policy->name);
if (active) {
/* Power on the GPU and any cores requested by the policy */
if (!pm->backend.invoke_poweroff_wait_wq_when_l2_off &&
pm->backend.poweroff_wait_in_progress) {
KBASE_DEBUG_ASSERT(kbdev->pm.backend.gpu_powered);
pm->backend.poweron_required = true;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
} else {
/* Cancel the invocation of
* kbase_pm_gpu_poweroff_wait_wq() from the L2 state
* machine. This is safe - if
* invoke_poweroff_wait_wq_when_l2_off is true, then
* the poweroff work hasn't even been queued yet,
* meaning we can go straight to powering on.
*/
pm->backend.invoke_poweroff_wait_wq_when_l2_off = false;
pm->backend.poweroff_wait_in_progress = false;
pm->backend.l2_desired = true;
#if MALI_USE_CSF
pm->backend.mcu_desired = true;
#endif
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
kbase_pm_do_poweron(kbdev, false);
}
} else {
/* It is an error for the power policy to power off the GPU
* when there are contexts active
*/
KBASE_DEBUG_ASSERT(pm->active_count == 0);
pm->backend.poweron_required = false;
/* Request power off */
if (pm->backend.gpu_powered) {
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
/* Power off the GPU immediately */
kbase_pm_do_poweroff(kbdev);
} else {
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
}
}
void kbase_pm_update_dynamic_cores_onoff(struct kbase_device *kbdev)
{
bool shaders_desired;
lockdep_assert_held(&kbdev->hwaccess_lock);
lockdep_assert_held(&kbdev->pm.lock);
if (kbdev->pm.backend.pm_current_policy == NULL)
return;
if (kbdev->pm.backend.poweroff_wait_in_progress)
return;
#if MALI_USE_CSF
CSTD_UNUSED(shaders_desired);
/* Invoke the MCU state machine to send a request to FW for updating
* the mask of shader cores that can be used for allocation of
* endpoints requested by CSGs.
*/
if (kbase_pm_is_mcu_desired(kbdev))
kbase_pm_update_state(kbdev);
#else
/* During a protected mode transition, don't let an outside shader core
* request affect the transition; return directly.
*/
if (kbdev->pm.backend.protected_transition_override)
return;
shaders_desired = kbdev->pm.backend.pm_current_policy->shaders_needed(kbdev);
if (shaders_desired && kbase_pm_is_l2_desired(kbdev))
kbase_pm_update_state(kbdev);
#endif
}
void kbase_pm_update_cores_state_nolock(struct kbase_device *kbdev)
{
bool shaders_desired = false;
lockdep_assert_held(&kbdev->hwaccess_lock);
if (kbdev->pm.backend.pm_current_policy == NULL)
return;
if (kbdev->pm.backend.poweroff_wait_in_progress)
return;
#if !MALI_USE_CSF
if (kbdev->pm.backend.protected_transition_override)
/* We are trying to change in/out of protected mode - force all
* cores off so that the L2 powers down
*/
shaders_desired = false;
else
shaders_desired = kbdev->pm.backend.pm_current_policy->shaders_needed(kbdev);
#endif
if (kbdev->pm.backend.shaders_desired != shaders_desired) {
KBASE_KTRACE_ADD(kbdev, PM_CORES_CHANGE_DESIRED, NULL,
kbdev->pm.backend.shaders_desired);
kbdev->pm.backend.shaders_desired = shaders_desired;
kbase_pm_update_state(kbdev);
}
}
void kbase_pm_update_cores_state(struct kbase_device *kbdev)
{
unsigned long flags;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbase_pm_update_cores_state_nolock(kbdev);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
size_t kbase_pm_list_policies(struct kbase_device *kbdev,
const struct kbase_pm_policy *const **list)
{
CSTD_UNUSED(kbdev);
if (list)
*list = all_policy_list;
return ARRAY_SIZE(all_policy_list);
}
KBASE_EXPORT_TEST_API(kbase_pm_list_policies);
const struct kbase_pm_policy *kbase_pm_get_policy(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev != NULL);
return kbdev->pm.backend.pm_current_policy;
}
KBASE_EXPORT_TEST_API(kbase_pm_get_policy);
#if MALI_USE_CSF
static int policy_change_wait_for_L2_off(struct kbase_device *kbdev)
{
long remaining;
long timeout = kbase_csf_timeout_in_jiffies(kbase_get_timeout_ms(kbdev, CSF_PM_TIMEOUT));
int err = 0;
/* Wait for the L2 to turn off, at which point the MCU is also implicitly off
* since the L2 state machine would only start its power-down
* sequence when the MCU is in off state. The L2 off is required
* as the tiler may need to be power cycled for MCU reconfiguration
* for host control of shader cores.
*/
#if KERNEL_VERSION(4, 13, 1) <= LINUX_VERSION_CODE
remaining = wait_event_killable_timeout(kbdev->pm.backend.gpu_in_desired_state_wait,
kbdev->pm.backend.l2_state == KBASE_L2_OFF,
timeout);
#else
remaining = wait_event_timeout(kbdev->pm.backend.gpu_in_desired_state_wait,
kbdev->pm.backend.l2_state == KBASE_L2_OFF, timeout);
#endif
if (!remaining) {
err = -ETIMEDOUT;
} else if (remaining < 0) {
dev_info(kbdev->dev, "Wait for L2_off got interrupted");
err = (int)remaining;
}
dev_dbg(kbdev->dev, "%s: err=%d mcu_state=%d, L2_state=%d\n", __func__, err,
kbdev->pm.backend.mcu_state, kbdev->pm.backend.l2_state);
return err;
}
#endif
void kbase_pm_set_policy(struct kbase_device *kbdev, const struct kbase_pm_policy *new_policy)
{
const struct kbase_pm_policy *old_policy;
unsigned long flags;
#if MALI_USE_CSF
unsigned int new_policy_csf_pm_sched_flags;
bool sched_suspend;
bool reset_gpu = false;
bool reset_op_prevented = true;
struct kbase_csf_scheduler *scheduler = NULL;
u64 pwroff_ns;
bool switching_to_always_on;
#endif
KBASE_DEBUG_ASSERT(kbdev != NULL);
KBASE_DEBUG_ASSERT(new_policy != NULL);
KBASE_KTRACE_ADD(kbdev, PM_SET_POLICY, NULL, new_policy->id);
#if MALI_USE_CSF
pwroff_ns = kbase_csf_firmware_get_mcu_core_pwroff_time(kbdev);
switching_to_always_on = new_policy == &kbase_pm_always_on_policy_ops;
if (pwroff_ns == 0 && !switching_to_always_on) {
dev_warn(
kbdev->dev,
"power_policy: cannot switch away from always_on with mcu_shader_pwroff_timeout set to 0\n");
dev_warn(
kbdev->dev,
"power_policy: resetting mcu_shader_pwroff_timeout to default value to switch policy from always_on\n");
kbase_csf_firmware_reset_mcu_core_pwroff_time(kbdev);
}
scheduler = &kbdev->csf.scheduler;
KBASE_DEBUG_ASSERT(scheduler != NULL);
/* Serialize calls on kbase_pm_set_policy() */
mutex_lock(&kbdev->pm.backend.policy_change_lock);
if (kbase_reset_gpu_prevent_and_wait(kbdev)) {
dev_warn(kbdev->dev, "Set PM policy failing to prevent gpu reset");
reset_op_prevented = false;
}
/* In case of CSF, the scheduler may be invoked to suspend. In that
* case, there is a risk that the L2 may be turned on by the time we
* check it here. So we hold the scheduler lock to avoid other operations
* interfering with the policy change and vice versa.
*/
mutex_lock(&scheduler->lock);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
/* policy_change_clamp_state_to_off, when needed, is both set and
* cleared within this function, so it only covers the short window of
* the policy change itself.
*/
WARN_ON(kbdev->pm.backend.policy_change_clamp_state_to_off);
new_policy_csf_pm_sched_flags = new_policy->pm_sched_flags;
/* The scheduler PM suspend operation is required when the change
* involves the always_on policy, as reflected by the
* CSF_DYNAMIC_PM_CORE_KEEP_ON flag bit.
*/
sched_suspend = reset_op_prevented &&
(CSF_DYNAMIC_PM_CORE_KEEP_ON &
(new_policy_csf_pm_sched_flags | kbdev->pm.backend.csf_pm_sched_flags));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (sched_suspend) {
/* Update the suspend flag to reflect whether the suspend was actually done */
sched_suspend = !kbase_csf_scheduler_pm_suspend_no_lock(kbdev);
/* Set the reset recovery flag if the required suspend failed */
reset_gpu = !sched_suspend;
}
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbdev->pm.backend.policy_change_clamp_state_to_off = sched_suspend;
kbase_pm_update_state(kbdev);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (sched_suspend)
reset_gpu = policy_change_wait_for_L2_off(kbdev);
#endif
/* During a policy change we pretend the GPU is active */
/* A suspend won't happen here, because we're in a syscall from a
* userspace thread
*/
kbase_pm_context_active(kbdev);
kbase_pm_lock(kbdev);
/* Remove the policy to prevent IRQ handlers from working on it */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
old_policy = kbdev->pm.backend.pm_current_policy;
kbdev->pm.backend.pm_current_policy = NULL;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
KBASE_KTRACE_ADD(kbdev, PM_CURRENT_POLICY_TERM, NULL, old_policy->id);
if (old_policy->term)
old_policy->term(kbdev);
memset(&kbdev->pm.backend.pm_policy_data, 0, sizeof(union kbase_pm_policy_data));
KBASE_KTRACE_ADD(kbdev, PM_CURRENT_POLICY_INIT, NULL, new_policy->id);
if (new_policy->init)
new_policy->init(kbdev);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbdev->pm.backend.pm_current_policy = new_policy;
#if MALI_USE_CSF
kbdev->pm.backend.csf_pm_sched_flags = new_policy_csf_pm_sched_flags;
/* New policy in place, release the clamping on mcu/L2 off state */
kbdev->pm.backend.policy_change_clamp_state_to_off = false;
kbase_pm_update_state(kbdev);
#endif
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
/* If any core power state changes were previously attempted, but
* couldn't be made because the policy was changing (current_policy was
* NULL), then re-try them here.
*/
kbase_pm_update_active(kbdev);
kbase_pm_update_cores_state(kbdev);
kbase_pm_unlock(kbdev);
/* Now the policy change is finished, we release our fake context active
* reference
*/
kbase_pm_context_idle(kbdev);
#if MALI_USE_CSF
/* Reverse the suspension done */
if (sched_suspend)
kbase_csf_scheduler_pm_resume_no_lock(kbdev);
mutex_unlock(&scheduler->lock);
if (reset_op_prevented)
kbase_reset_gpu_allow(kbdev);
if (reset_gpu) {
dev_warn(kbdev->dev, "Resorting to GPU reset for policy change\n");
if (kbase_prepare_to_reset_gpu(kbdev, RESET_FLAGS_NONE))
kbase_reset_gpu(kbdev);
kbase_reset_gpu_wait(kbdev);
}
mutex_unlock(&kbdev->pm.backend.policy_change_lock);
#endif
}
KBASE_EXPORT_TEST_API(kbase_pm_set_policy);

View File

@@ -0,0 +1,104 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2010-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Power policy API definitions
*/
#ifndef _KBASE_PM_POLICY_H_
#define _KBASE_PM_POLICY_H_
/**
* kbase_pm_policy_init - Initialize power policy framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Must be called before calling any other policy function
*/
void kbase_pm_policy_init(struct kbase_device *kbdev);
/**
* kbase_pm_policy_term - Terminate power policy framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*/
void kbase_pm_policy_term(struct kbase_device *kbdev);
/**
* kbase_pm_update_active - Update the active power state of the GPU
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Calls into the current power policy
*/
void kbase_pm_update_active(struct kbase_device *kbdev);
/**
* kbase_pm_update_cores - Update the desired core state of the GPU
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Calls into the current power policy
*/
void kbase_pm_update_cores(struct kbase_device *kbdev);
/**
* kbase_pm_cores_requested - Check that a power request has been locked into
* the HW.
* @kbdev: Kbase device
* @shader_required: true if shaders are required
*
* Called by the scheduler to check if a power on request has been locked into
* the HW.
*
* Note that there is no guarantee that the cores are actually ready, however
* when the request has been locked into the HW, then it is safe to submit work
* since the HW will wait for the transition to ready.
*
* A reference must first be taken prior to making this call.
*
* Caller must hold the hwaccess_lock.
*
* Return: true if the request to the HW was successfully made, or false if the
* request is still pending.
*/
static inline bool kbase_pm_cores_requested(struct kbase_device *kbdev, bool shader_required)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
/* If the L2 & tiler are not on or pending, then the tiler is not yet
* available, and shaders are definitely not powered.
*/
if (kbdev->pm.backend.l2_state != KBASE_L2_PEND_ON &&
kbdev->pm.backend.l2_state != KBASE_L2_ON &&
kbdev->pm.backend.l2_state != KBASE_L2_ON_HWCNT_ENABLE)
return false;
if (shader_required &&
kbdev->pm.backend.shaders_state != KBASE_SHADERS_PEND_ON_CORESTACK_ON &&
kbdev->pm.backend.shaders_state != KBASE_SHADERS_ON_CORESTACK_ON &&
kbdev->pm.backend.shaders_state != KBASE_SHADERS_ON_CORESTACK_ON_RECHECK)
return false;
return true;
}
#endif /* _KBASE_PM_POLICY_H_ */

View File

@@ -0,0 +1,79 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager shader core state definitions.
* The function-like macro KBASEP_SHADER_STATE() must be defined before
* including this header file. This header file can be included multiple
* times in the same compilation unit with different definitions of
* KBASEP_SHADER_STATE().
*
* @OFF_CORESTACK_OFF: The shaders and core stacks are off
* @OFF_CORESTACK_PEND_ON: The shaders are off, core stacks have been
* requested to power on and hwcnt is being
* disabled
* @PEND_ON_CORESTACK_ON: Core stacks are on, shaders have been
* requested to power on. Or after doing
* partial shader on/off, checking whether
* it's the desired state.
* @ON_CORESTACK_ON: The shaders and core stacks are on, and
* hwcnt already enabled.
* @ON_CORESTACK_ON_RECHECK: The shaders and core stacks are on, hwcnt
* is disabled, and a check is made on whether
* to power down or re-enable hwcnt.
* @WAIT_OFF_CORESTACK_ON: The shaders have been requested to power
* off, but they remain on for the duration
* of the hysteresis timer
* @WAIT_GPU_IDLE: A partial shader power-off must wait
* until all jobs on the GPU have finished,
* including jobs currently running and
* jobs in the GPU queue, because of
* GPU2017-861
* @WAIT_FINISHED_CORESTACK_ON: The hysteresis timer has expired
* @L2_FLUSHING_CORESTACK_ON: The core stacks are on and the level 2
* cache is being flushed.
* @READY_OFF_CORESTACK_ON: The core stacks are on and the shaders are
* ready to be powered off.
* @PEND_OFF_CORESTACK_ON: The core stacks are on, and the shaders
* have been requested to power off
* @OFF_CORESTACK_PEND_OFF: The shaders are off, and the core stacks
* have been requested to power off
* @OFF_CORESTACK_OFF_TIMER_PEND_OFF: Shaders and corestacks are off, but the
* tick timer cancellation is still pending.
* @RESET_WAIT: The GPU is resetting, shader and core
* stack power states are unknown
*/
KBASEP_SHADER_STATE(OFF_CORESTACK_OFF)
KBASEP_SHADER_STATE(OFF_CORESTACK_PEND_ON)
KBASEP_SHADER_STATE(PEND_ON_CORESTACK_ON)
KBASEP_SHADER_STATE(ON_CORESTACK_ON)
KBASEP_SHADER_STATE(ON_CORESTACK_ON_RECHECK)
KBASEP_SHADER_STATE(WAIT_OFF_CORESTACK_ON)
#if !MALI_USE_CSF
KBASEP_SHADER_STATE(WAIT_GPU_IDLE)
#endif /* !MALI_USE_CSF */
KBASEP_SHADER_STATE(WAIT_FINISHED_CORESTACK_ON)
KBASEP_SHADER_STATE(L2_FLUSHING_CORESTACK_ON)
KBASEP_SHADER_STATE(READY_OFF_CORESTACK_ON)
KBASEP_SHADER_STATE(PEND_OFF_CORESTACK_ON)
KBASEP_SHADER_STATE(OFF_CORESTACK_PEND_OFF)
KBASEP_SHADER_STATE(OFF_CORESTACK_OFF_TIMER_PEND_OFF)
KBASEP_SHADER_STATE(RESET_WAIT)
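
Note on the list above: it follows the X-macro pattern, where the same list of names is expanded more than once with different macro definitions. The following is a minimal user-space sketch of that pattern, not the driver's code. The names `DEMO_SHADER_STATES`, the `DEMO_SHADERS_` prefix, and `demo_state_name()` are hypothetical, and the list is shortened. The driver, according to the header comment, re-includes the header with different definitions of KBASEP_SHADER_STATE() rather than using a list macro.

```c
#include <stddef.h>

/* Hypothetical stand-in for the header: a shortened list of states. */
#define DEMO_SHADER_STATES(X) \
	X(OFF_CORESTACK_OFF)      \
	X(PEND_ON_CORESTACK_ON)   \
	X(ON_CORESTACK_ON)        \
	X(RESET_WAIT)

/* First expansion: one enumerator per state, plus a trailing count. */
#define DEMO_AS_ENUM(n) DEMO_SHADERS_##n,
enum demo_shader_state { DEMO_SHADER_STATES(DEMO_AS_ENUM) DEMO_SHADERS_COUNT };
#undef DEMO_AS_ENUM

/* Second expansion: a string table indexed by the enum above. */
#define DEMO_AS_NAME(n) #n,
static const char *const demo_state_names[] = { DEMO_SHADER_STATES(DEMO_AS_NAME) };
#undef DEMO_AS_NAME

/* Map a state to its name; out-of-range values get a fixed fallback. */
static const char *demo_state_name(enum demo_shader_state s)
{
	if ((unsigned int)s >= DEMO_SHADERS_COUNT)
		return "UNKNOWN";
	return demo_state_names[s];
}
```

Because both expansions come from one list, the enum and the name table cannot drift out of step when a state is added or removed, which is the usual reason for choosing this pattern.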

View File

@@ -0,0 +1,367 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <mali_kbase_hwaccess_time.h>
#if MALI_USE_CSF
#include <linux/gcd.h>
#include <csf/mali_kbase_csf_timeout.h>
#endif
#include <device/mali_kbase_device.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <mali_kbase_config_defaults.h>
#include <linux/version_compat_defs.h>
#include <asm/arch_timer.h>
#if !IS_ENABLED(CONFIG_MALI_REAL_HW)
#include <backend/gpu/mali_kbase_model_linux.h>
#endif
struct kbase_timeout_info {
char *selector_str;
u64 timeout_cycles;
};
#if MALI_USE_CSF
static struct kbase_timeout_info timeout_info[KBASE_TIMEOUT_SELECTOR_COUNT] = {
[CSF_FIRMWARE_TIMEOUT] = { "CSF_FIRMWARE_TIMEOUT", MIN(CSF_FIRMWARE_TIMEOUT_CYCLES,
CSF_FIRMWARE_PING_TIMEOUT_CYCLES) },
[CSF_PM_TIMEOUT] = { "CSF_PM_TIMEOUT", CSF_PM_TIMEOUT_CYCLES },
[CSF_GPU_RESET_TIMEOUT] = { "CSF_GPU_RESET_TIMEOUT", CSF_GPU_RESET_TIMEOUT_CYCLES },
[CSF_CSG_SUSPEND_TIMEOUT] = { "CSF_CSG_SUSPEND_TIMEOUT", CSF_CSG_SUSPEND_TIMEOUT_CYCLES },
[CSF_FIRMWARE_BOOT_TIMEOUT] = { "CSF_FIRMWARE_BOOT_TIMEOUT",
CSF_FIRMWARE_BOOT_TIMEOUT_CYCLES },
[CSF_FIRMWARE_PING_TIMEOUT] = { "CSF_FIRMWARE_PING_TIMEOUT",
CSF_FIRMWARE_PING_TIMEOUT_CYCLES },
[CSF_SCHED_PROTM_PROGRESS_TIMEOUT] = { "CSF_SCHED_PROTM_PROGRESS_TIMEOUT",
DEFAULT_PROGRESS_TIMEOUT_CYCLES },
[MMU_AS_INACTIVE_WAIT_TIMEOUT] = { "MMU_AS_INACTIVE_WAIT_TIMEOUT",
MMU_AS_INACTIVE_WAIT_TIMEOUT_CYCLES },
[KCPU_FENCE_SIGNAL_TIMEOUT] = { "KCPU_FENCE_SIGNAL_TIMEOUT",
KCPU_FENCE_SIGNAL_TIMEOUT_CYCLES },
[KBASE_PRFCNT_ACTIVE_TIMEOUT] = { "KBASE_PRFCNT_ACTIVE_TIMEOUT",
KBASE_PRFCNT_ACTIVE_TIMEOUT_CYCLES },
[KBASE_CLEAN_CACHE_TIMEOUT] = { "KBASE_CLEAN_CACHE_TIMEOUT",
KBASE_CLEAN_CACHE_TIMEOUT_CYCLES },
[KBASE_AS_INACTIVE_TIMEOUT] = { "KBASE_AS_INACTIVE_TIMEOUT",
KBASE_AS_INACTIVE_TIMEOUT_CYCLES },
[IPA_INACTIVE_TIMEOUT] = { "IPA_INACTIVE_TIMEOUT", IPA_INACTIVE_TIMEOUT_CYCLES },
[CSF_FIRMWARE_STOP_TIMEOUT] = { "CSF_FIRMWARE_STOP_TIMEOUT",
CSF_FIRMWARE_STOP_TIMEOUT_CYCLES },
};
#else
static struct kbase_timeout_info timeout_info[KBASE_TIMEOUT_SELECTOR_COUNT] = {
[MMU_AS_INACTIVE_WAIT_TIMEOUT] = { "MMU_AS_INACTIVE_WAIT_TIMEOUT",
MMU_AS_INACTIVE_WAIT_TIMEOUT_CYCLES },
[JM_DEFAULT_JS_FREE_TIMEOUT] = { "JM_DEFAULT_JS_FREE_TIMEOUT",
JM_DEFAULT_JS_FREE_TIMEOUT_CYCLES },
[KBASE_PRFCNT_ACTIVE_TIMEOUT] = { "KBASE_PRFCNT_ACTIVE_TIMEOUT",
KBASE_PRFCNT_ACTIVE_TIMEOUT_CYCLES },
[KBASE_CLEAN_CACHE_TIMEOUT] = { "KBASE_CLEAN_CACHE_TIMEOUT",
KBASE_CLEAN_CACHE_TIMEOUT_CYCLES },
[KBASE_AS_INACTIVE_TIMEOUT] = { "KBASE_AS_INACTIVE_TIMEOUT",
KBASE_AS_INACTIVE_TIMEOUT_CYCLES },
};
#endif
void kbase_backend_get_gpu_time_norequest(struct kbase_device *kbdev, u64 *cycle_counter,
u64 *system_time, struct timespec64 *ts)
{
if (cycle_counter)
*cycle_counter = kbase_backend_get_cycle_cnt(kbdev);
if (system_time) {
*system_time = kbase_reg_read64_coherent(kbdev, GPU_CONTROL_ENUM(TIMESTAMP));
}
/* Record the CPU's idea of current time */
if (ts != NULL)
#if (KERNEL_VERSION(4, 17, 0) > LINUX_VERSION_CODE)
*ts = ktime_to_timespec64(ktime_get_raw());
#else
ktime_get_raw_ts64(ts);
#endif
}
#if !MALI_USE_CSF
/**
* timedwait_cycle_count_active() - Timed wait till CYCLE_COUNT_ACTIVE is active
*
* @kbdev: Kbase device
*
* Return: true if CYCLE_COUNT_ACTIVE is active within the timeout.
*/
static bool timedwait_cycle_count_active(struct kbase_device *kbdev)
{
#if IS_ENABLED(CONFIG_MALI_NO_MALI)
return true;
#else
bool success = false;
const unsigned int timeout = 100;
const unsigned long remaining = jiffies + msecs_to_jiffies(timeout);
while (time_is_after_jiffies(remaining)) {
if ((kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(GPU_STATUS)) &
GPU_STATUS_CYCLE_COUNT_ACTIVE)) {
success = true;
break;
}
}
return success;
#endif
}
#endif
void kbase_backend_get_gpu_time(struct kbase_device *kbdev, u64 *cycle_counter, u64 *system_time,
struct timespec64 *ts)
{
#if !MALI_USE_CSF
kbase_pm_request_gpu_cycle_counter(kbdev);
WARN_ONCE(kbdev->pm.backend.l2_state != KBASE_L2_ON, "L2 not powered up");
WARN_ONCE((!timedwait_cycle_count_active(kbdev)), "Timed out on CYCLE_COUNT_ACTIVE");
#endif
kbase_backend_get_gpu_time_norequest(kbdev, cycle_counter, system_time, ts);
#if !MALI_USE_CSF
kbase_pm_release_gpu_cycle_counter(kbdev);
#endif
}
static u64 kbase_device_get_scaling_frequency(struct kbase_device *kbdev)
{
u64 freq_khz = kbdev->lowest_gpu_freq_khz;
if (!freq_khz) {
dev_dbg(kbdev->dev,
"Lowest frequency uninitialized! Using reference frequency for scaling");
return DEFAULT_REF_TIMEOUT_FREQ_KHZ;
}
return freq_khz;
}
void kbase_device_set_timeout_ms(struct kbase_device *kbdev, enum kbase_timeout_selector selector,
unsigned int timeout_ms)
{
char *selector_str;
if (unlikely(selector >= KBASE_TIMEOUT_SELECTOR_COUNT)) {
selector = KBASE_DEFAULT_TIMEOUT;
dev_warn(kbdev->dev,
"Unknown timeout selector passed, falling back to default: %s\n",
timeout_info[selector].selector_str);
}
selector_str = timeout_info[selector].selector_str;
kbdev->backend_time.device_scaled_timeouts[selector] = timeout_ms;
dev_dbg(kbdev->dev, "\t%-35s: %ums\n", selector_str, timeout_ms);
}
void kbase_device_set_timeout(struct kbase_device *kbdev, enum kbase_timeout_selector selector,
u64 timeout_cycles, u32 cycle_multiplier)
{
u64 final_cycles;
u64 timeout;
u64 freq_khz = kbase_device_get_scaling_frequency(kbdev);
if (unlikely(selector >= KBASE_TIMEOUT_SELECTOR_COUNT)) {
selector = KBASE_DEFAULT_TIMEOUT;
dev_warn(kbdev->dev,
"Unknown timeout selector passed, falling back to default: %s\n",
timeout_info[selector].selector_str);
}
/* If the multiplication overflows, we will have unsigned wrap-around, and so might
* end up with a shorter timeout. In those cases, we then want to have the largest
* timeout possible that will not run into these issues. Note that this will not
* wait for U64_MAX/frequency ms, as it will be clamped to a max of UINT_MAX
* milliseconds by subsequent steps.
*/
if (check_mul_overflow(timeout_cycles, (u64)cycle_multiplier, &final_cycles))
final_cycles = U64_MAX;
/* Timeout calculation:
* dividing number of cycles by freq in KHz automatically gives value
* in milliseconds. nr_cycles will have to be multiplied by 1e3 to
* get result in microseconds, and 1e6 to get result in nanoseconds.
*/
timeout = div_u64(final_cycles, freq_khz);
if (unlikely(timeout > UINT_MAX)) {
dev_dbg(kbdev->dev,
"Capping excessive timeout %llums for %s at freq %llukHz to UINT_MAX ms",
timeout, timeout_info[selector].selector_str,
kbase_device_get_scaling_frequency(kbdev));
timeout = UINT_MAX;
}
kbase_device_set_timeout_ms(kbdev, selector, (unsigned int)timeout);
}
/**
* kbase_timeout_scaling_init - Initialize the table of scaled timeout
* values associated with a @kbase_device.
*
* @kbdev: KBase device pointer.
*
* Return: 0 on success, negative error code otherwise.
*/
static int kbase_timeout_scaling_init(struct kbase_device *kbdev)
{
int err;
enum kbase_timeout_selector selector;
/* First, we initialize the minimum and maximum device frequencies, which
* are used to compute the timeouts.
*/
err = kbase_pm_gpu_freq_init(kbdev);
if (unlikely(err < 0)) {
dev_dbg(kbdev->dev, "Could not initialize GPU frequency\n");
return err;
}
dev_dbg(kbdev->dev, "Scaling kbase timeouts:\n");
for (selector = 0; selector < KBASE_TIMEOUT_SELECTOR_COUNT; selector++) {
u32 cycle_multiplier = 1;
u64 nr_cycles = timeout_info[selector].timeout_cycles;
#if MALI_USE_CSF
/* Special case: the scheduler progress timeout can be set manually,
* and does not have a canonical length defined in the headers. Hence,
* we query it once upon startup to get a baseline, and change it upon
* every invocation of the appropriate functions
*/
if (selector == CSF_SCHED_PROTM_PROGRESS_TIMEOUT)
nr_cycles = kbase_csf_timeout_get(kbdev);
#endif
/* Since we are in control of the iteration bounds for the selector,
* we don't have to worry about bounds checking when setting the timeout.
*/
kbase_device_set_timeout(kbdev, selector, nr_cycles, cycle_multiplier);
}
return 0;
}
unsigned int kbase_get_timeout_ms(struct kbase_device *kbdev, enum kbase_timeout_selector selector)
{
if (unlikely(selector >= KBASE_TIMEOUT_SELECTOR_COUNT)) {
dev_warn(kbdev->dev, "Querying wrong selector, falling back to default\n");
selector = KBASE_DEFAULT_TIMEOUT;
}
return kbdev->backend_time.device_scaled_timeouts[selector];
}
KBASE_EXPORT_TEST_API(kbase_get_timeout_ms);
u64 kbase_backend_get_cycle_cnt(struct kbase_device *kbdev)
{
return kbase_reg_read64_coherent(kbdev, GPU_CONTROL_ENUM(CYCLE_COUNT));
}
#if MALI_USE_CSF
u64 __maybe_unused kbase_backend_time_convert_gpu_to_cpu(struct kbase_device *kbdev, u64 gpu_ts)
{
if (WARN_ON(!kbdev))
return 0;
return div64_u64(gpu_ts * kbdev->backend_time.multiplier, kbdev->backend_time.divisor) +
kbdev->backend_time.offset;
}
/**
* get_cpu_gpu_time() - Get current CPU and GPU timestamps.
*
* @kbdev: Kbase device.
* @cpu_ts: Output CPU timestamp.
* @gpu_ts: Output GPU timestamp.
* @gpu_cycle: Output GPU cycle counts.
*/
static void get_cpu_gpu_time(struct kbase_device *kbdev, u64 *cpu_ts, u64 *gpu_ts, u64 *gpu_cycle)
{
struct timespec64 ts;
kbase_backend_get_gpu_time(kbdev, gpu_cycle, gpu_ts, &ts);
if (cpu_ts)
*cpu_ts = (u64)(ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec);
}
#endif
u64 kbase_arch_timer_get_cntfrq(struct kbase_device *kbdev)
{
u64 freq = arch_timer_get_cntfrq();
#if !IS_ENABLED(CONFIG_MALI_REAL_HW)
freq = midgard_model_arch_timer_get_cntfrq(kbdev->model);
#endif
dev_dbg(kbdev->dev, "System Timer Freq = %lluHz", freq);
return freq;
}
int kbase_backend_time_init(struct kbase_device *kbdev)
{
int err = 0;
#if MALI_USE_CSF
u64 cpu_ts = 0;
u64 gpu_ts = 0;
u64 freq;
u64 common_factor;
kbase_pm_register_access_enable(kbdev);
get_cpu_gpu_time(kbdev, &cpu_ts, &gpu_ts, NULL);
freq = kbase_arch_timer_get_cntfrq(kbdev);
if (!freq) {
dev_warn(kbdev->dev, "arch_timer_get_cntfrq() is zero!");
err = -EINVAL;
goto disable_registers;
}
common_factor = gcd(NSEC_PER_SEC, freq);
kbdev->backend_time.multiplier = div64_u64(NSEC_PER_SEC, common_factor);
kbdev->backend_time.divisor = div64_u64(freq, common_factor);
if (!kbdev->backend_time.divisor) {
dev_warn(kbdev->dev, "CPU to GPU divisor is zero!");
err = -EINVAL;
goto disable_registers;
}
kbdev->backend_time.offset =
(s64)(cpu_ts - div64_u64(gpu_ts * kbdev->backend_time.multiplier,
kbdev->backend_time.divisor));
#endif
if (kbase_timeout_scaling_init(kbdev)) {
dev_warn(kbdev->dev, "Could not initialize timeout scaling");
err = -EINVAL;
}
#if MALI_USE_CSF
disable_registers:
kbase_pm_register_access_disable(kbdev);
#endif
return err;
}
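
Note on the timeout scaling in this file: kbase_device_set_timeout() multiplies a cycle budget by a multiplier, saturates on overflow, divides by the frequency in kHz to get milliseconds, and caps the result at UINT_MAX. The sketch below restates that arithmetic in portable user-space C, as an illustration only. `demo_cycles_to_ms()` is a hypothetical name. It uses a plain comparison and division instead of the kernel's check_mul_overflow() and div_u64(), and it returns UINT_MAX for a zero frequency, which is an assumption of the sketch (the driver instead falls back to a reference frequency).

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: convert a cycle budget to milliseconds at freq_khz.
 * cycles / kHz yields ms because 1 kHz is one thousand cycles per second,
 * i.e. one cycle per millisecond.
 */
static unsigned int demo_cycles_to_ms(uint64_t cycles, uint32_t multiplier, uint64_t freq_khz)
{
	uint64_t total;
	uint64_t ms;

	/* Assumption of this sketch only: unknown frequency means "longest wait". */
	if (freq_khz == 0)
		return UINT_MAX;

	/* Saturate instead of wrapping, so an overflow never shortens the timeout. */
	if (multiplier != 0 && cycles > UINT64_MAX / multiplier)
		total = UINT64_MAX;
	else
		total = cycles * multiplier;

	ms = total / freq_khz;

	/* The stored timeout is an unsigned int, so clamp to its range. */
	return (ms > UINT_MAX) ? UINT_MAX : (unsigned int)ms;
}
```

For example, a budget of 100,000,000 cycles at 100,000 kHz (100 MHz) comes out as 1000 ms, while a budget that overflows 64 bits is clamped to UINT_MAX rather than wrapping to a small value.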

View File

@@ -0,0 +1,281 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2017-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/* Kernel-side tests may include mali_kbase's headers. Therefore any config
* options which affect the sizes of any structs (e.g. adding extra members)
* must be included in these defaults, so that the structs are consistent in
* both mali_kbase and the test modules. */
bob_defaults {
name: "mali_kbase_shared_config_defaults",
defaults: [
"kernel_defaults",
],
mali_no_mali: {
kbuild_options: [
"CONFIG_MALI_NO_MALI=y",
"CONFIG_MALI_NO_MALI_DEFAULT_GPU={{.gpu}}",
"CONFIG_GPU_HWVER={{.hwver}}",
],
},
gpu_has_csf: {
kbuild_options: ["CONFIG_MALI_CSF_SUPPORT=y"],
},
mali_devfreq: {
kbuild_options: ["CONFIG_MALI_DEVFREQ=y"],
},
mali_midgard_dvfs: {
kbuild_options: ["CONFIG_MALI_MIDGARD_DVFS=y"],
},
mali_gator_support: {
kbuild_options: ["CONFIG_MALI_GATOR_SUPPORT=y"],
},
mali_midgard_enable_trace: {
kbuild_options: ["CONFIG_MALI_MIDGARD_ENABLE_TRACE=y"],
},
mali_arbiter_support: {
kbuild_options: ["CONFIG_MALI_ARBITER_SUPPORT=y"],
},
mali_dma_buf_map_on_demand: {
kbuild_options: ["CONFIG_MALI_DMA_BUF_MAP_ON_DEMAND=y"],
},
mali_dma_buf_legacy_compat: {
kbuild_options: ["CONFIG_MALI_DMA_BUF_LEGACY_COMPAT=y"],
},
page_migration_support: {
kbuild_options: ["CONFIG_PAGE_MIGRATION_SUPPORT=y"],
},
large_page_support: {
kbuild_options: ["CONFIG_LARGE_PAGE_SUPPORT=y"],
},
mali_corestack: {
kbuild_options: ["CONFIG_MALI_CORESTACK=y"],
},
mali_real_hw: {
kbuild_options: ["CONFIG_MALI_REAL_HW=y"],
},
mali_error_inject_none: {
kbuild_options: ["CONFIG_MALI_ERROR_INJECT_NONE=y"],
},
mali_error_inject_track_list: {
kbuild_options: ["CONFIG_MALI_ERROR_INJECT_TRACK_LIST=y"],
},
mali_error_inject_random: {
kbuild_options: ["CONFIG_MALI_ERROR_INJECT_RANDOM=y"],
},
mali_error_inject: {
kbuild_options: ["CONFIG_MALI_ERROR_INJECT=y"],
},
mali_debug: {
kbuild_options: [
"CONFIG_MALI_DEBUG=y",
"MALI_KERNEL_TEST_API={{.debug}}",
],
},
mali_fence_debug: {
kbuild_options: ["CONFIG_MALI_FENCE_DEBUG=y"],
},
mali_system_trace: {
kbuild_options: ["CONFIG_MALI_SYSTEM_TRACE=y"],
},
cinstr_vector_dump: {
kbuild_options: ["CONFIG_MALI_VECTOR_DUMP=y"],
},
cinstr_gwt: {
kbuild_options: ["CONFIG_MALI_CINSTR_GWT=y"],
},
cinstr_primary_hwc: {
kbuild_options: ["CONFIG_MALI_PRFCNT_SET_PRIMARY=y"],
},
cinstr_secondary_hwc: {
kbuild_options: ["CONFIG_MALI_PRFCNT_SET_SECONDARY=y"],
},
cinstr_tertiary_hwc: {
kbuild_options: ["CONFIG_MALI_PRFCNT_SET_TERTIARY=y"],
},
cinstr_hwc_set_select_via_debug_fs: {
kbuild_options: ["CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS=y"],
},
mali_job_dump: {
kbuild_options: ["CONFIG_MALI_JOB_DUMP=y"],
},
mali_pwrsoft_765: {
kbuild_options: ["CONFIG_MALI_PWRSOFT_765=y"],
},
mali_hw_errata_1485982_not_affected: {
kbuild_options: ["CONFIG_MALI_HW_ERRATA_1485982_NOT_AFFECTED=y"],
},
mali_hw_errata_1485982_use_clock_alternative: {
kbuild_options: ["CONFIG_MALI_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE=y"],
},
platform_is_fpga: {
kbuild_options: ["CONFIG_MALI_IS_FPGA=y"],
},
mali_coresight: {
kbuild_options: ["CONFIG_MALI_CORESIGHT=y"],
},
mali_fw_trace_mode_manual: {
kbuild_options: ["CONFIG_MALI_FW_TRACE_MODE_MANUAL=y"],
},
mali_fw_trace_mode_auto_print: {
kbuild_options: ["CONFIG_MALI_FW_TRACE_MODE_AUTO_PRINT=y"],
},
mali_fw_trace_mode_auto_discard: {
kbuild_options: ["CONFIG_MALI_FW_TRACE_MODE_AUTO_DISCARD=y"],
},
kbuild_options: [
"CONFIG_MALI_PLATFORM_NAME={{.mali_platform_name}}",
"MALI_CUSTOMER_RELEASE={{.release}}",
"MALI_UNIT_TEST={{.unit_test_code}}",
"MALI_USE_CSF={{.gpu_has_csf}}",
"MALI_JIT_PRESSURE_LIMIT_BASE={{.jit_pressure_limit_base}}",
// Start of CS experimental features definitions.
// If there is nothing below, definition should be added as follows:
// "MALI_EXPERIMENTAL_FEATURE={{.experimental_feature}}"
// experimental_feature above comes from Mconfig in
// <ddk_root>/product/base/
// However, in Mconfig, experimental_feature should be looked up (for
// similar explanation to this one) as ALLCAPS, i.e.
// EXPERIMENTAL_FEATURE.
//
// IMPORTANT: MALI_CS_EXPERIMENTAL should NEVER be defined below as it
// is an umbrella feature that would be open for inappropriate use
// (catch-all for experimental CS code without separating it into
// different features).
"MALI_INCREMENTAL_RENDERING_JM={{.incremental_rendering_jm}}",
"MALI_BASE_CSF_PERFORMANCE_TESTS={{.base_csf_performance_tests}}",
],
}
bob_kernel_module {
name: "mali_kbase",
defaults: [
"mali_kbase_shared_config_defaults",
],
srcs: [
"*.c",
"*.h",
"Kbuild",
"backend/gpu/*.c",
"backend/gpu/*.h",
"backend/gpu/Kbuild",
"context/*.c",
"context/*.h",
"context/Kbuild",
"hwcnt/*.c",
"hwcnt/*.h",
"hwcnt/backend/*.h",
"hwcnt/Kbuild",
"ipa/*.c",
"ipa/*.h",
"ipa/Kbuild",
"platform/*.h",
"platform/*/*.c",
"platform/*/*.h",
"platform/*/Kbuild",
"platform/*/*/*.c",
"platform/*/*/*.h",
"platform/*/*/Kbuild",
"platform/*/*/*.c",
"platform/*/*/*.h",
"platform/*/*/Kbuild",
"platform/*/*/*/*.c",
"platform/*/*/*/*.h",
"platform/*/*/*/Kbuild",
"thirdparty/*.c",
"thirdparty/*.h",
"thirdparty/Kbuild",
"debug/*.c",
"debug/*.h",
"debug/Kbuild",
"device/*.c",
"device/*.h",
"device/Kbuild",
"gpu/*.c",
"gpu/*.h",
"gpu/Kbuild",
"hw_access/*.c",
"hw_access/*.h",
"hw_access/*/*.c",
"hw_access/*/*.h",
"hw_access/Kbuild",
"tl/*.c",
"tl/*.h",
"tl/Kbuild",
"mmu/*.c",
"mmu/*.h",
"mmu/Kbuild",
],
gpu_has_job_manager: {
srcs: [
"context/backend/*_jm.c",
"debug/backend/*_jm.c",
"debug/backend/*_jm.h",
"device/backend/*_jm.c",
"gpu/backend/*_jm.c",
"gpu/backend/*_jm.h",
"hwcnt/backend/*_jm.c",
"hwcnt/backend/*_jm.h",
"hwcnt/backend/*_jm_*.c",
"hwcnt/backend/*_jm_*.h",
"jm/*.h",
"tl/backend/*_jm.c",
"mmu/backend/*_jm.c",
"ipa/backend/*_jm.c",
"ipa/backend/*_jm.h",
],
},
gpu_has_csf: {
srcs: [
"context/backend/*_csf.c",
"csf/*.c",
"csf/*.h",
"csf/Kbuild",
"csf/ipa_control/*.c",
"csf/ipa_control/*.h",
"csf/ipa_control/Kbuild",
"debug/backend/*_csf.c",
"debug/backend/*_csf.h",
"device/backend/*_csf.c",
"gpu/backend/*_csf.c",
"gpu/backend/*_csf.h",
"hwcnt/backend/*_csf.c",
"hwcnt/backend/*_csf.h",
"hwcnt/backend/*_csf_*.c",
"hwcnt/backend/*_csf_*.h",
"tl/backend/*_csf.c",
"mmu/backend/*_csf.c",
"ipa/backend/*_csf.c",
"ipa/backend/*_csf.h",
],
},
mali_arbiter_support: {
srcs: [
"arbiter/*.c",
"arbiter/*.h",
"arbiter/Kbuild",
],
},
kbuild_options: [
"CONFIG_MALI_MIDGARD=m",
"CONFIG_MALI_KUTF=n",
],
}

View File

@@ -0,0 +1,27 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2013, 2016-2017, 2020-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
mali_kbase-y += context/mali_kbase_context.o
ifeq ($(CONFIG_MALI_CSF_SUPPORT),y)
mali_kbase-y += context/backend/mali_kbase_context_csf.o
else
mali_kbase-y += context/backend/mali_kbase_context_jm.o
endif

View File

@@ -0,0 +1,212 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel context APIs for CSF GPUs
*/
#include <context/mali_kbase_context_internal.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase.h>
#include <mali_kbase_mem_linux.h>
#include <mali_kbase_mem_pool_group.h>
#include <mmu/mali_kbase_mmu.h>
#include <tl/mali_kbase_timeline.h>
#include <mali_kbase_ctx_sched.h>
#if IS_ENABLED(CONFIG_DEBUG_FS)
#include <csf/mali_kbase_csf_csg_debugfs.h>
#include <csf/mali_kbase_csf_kcpu_debugfs.h>
#include <csf/mali_kbase_csf_sync_debugfs.h>
#include <csf/mali_kbase_csf_tiler_heap_debugfs.h>
#include <csf/mali_kbase_csf_cpu_queue_debugfs.h>
#include <mali_kbase_debug_mem_view.h>
#include <mali_kbase_debug_mem_zones.h>
#include <mali_kbase_debug_mem_allocs.h>
#include <mali_kbase_mem_pool_debugfs.h>
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
kbase_debug_mem_view_init(kctx);
kbase_debug_mem_zones_init(kctx);
kbase_debug_mem_allocs_init(kctx);
kbase_mem_pool_debugfs_init(kctx->kctx_dentry, kctx);
kbase_jit_debugfs_init(kctx);
kbase_csf_queue_group_debugfs_init(kctx);
kbase_csf_kcpu_debugfs_init(kctx);
kbase_csf_sync_debugfs_init(kctx);
kbase_csf_tiler_heap_debugfs_init(kctx);
kbase_csf_tiler_heap_total_debugfs_init(kctx);
kbase_csf_cpu_queue_debugfs_init(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
debugfs_remove_recursive(kctx->kctx_dentry);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#else
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#endif /* CONFIG_DEBUG_FS */
static void kbase_context_free(struct kbase_context *kctx)
{
kbase_timeline_post_kbase_context_destroy(kctx);
vfree(kctx);
}
static const struct kbase_context_init context_init[] = {
{ NULL, kbase_context_free, NULL },
{ kbase_context_common_init, kbase_context_common_term,
"Common context initialization failed" },
{ kbase_context_mem_pool_group_init, kbase_context_mem_pool_group_term,
"Memory pool group initialization failed" },
{ kbase_mem_evictable_init, kbase_mem_evictable_deinit,
"Memory evictable initialization failed" },
{ kbase_ctx_sched_init_ctx, NULL, NULL },
{ kbase_context_mmu_init, kbase_context_mmu_term, "MMU initialization failed" },
{ kbase_context_mem_alloc_page, kbase_context_mem_pool_free, "Memory alloc page failed" },
{ kbase_region_tracker_init, kbase_region_tracker_term,
"Region tracker initialization failed" },
{ kbase_sticky_resource_init, kbase_context_sticky_resource_term,
"Sticky resource initialization failed" },
{ kbase_jit_init, kbase_jit_term, "JIT initialization failed" },
{ kbase_csf_ctx_init, kbase_csf_ctx_term, "CSF context initialization failed" },
{ kbase_context_add_to_dev_list, kbase_context_remove_from_dev_list,
"Adding kctx to device failed" },
};
static void kbase_context_term_partial(struct kbase_context *kctx, unsigned int i)
{
while (i-- > 0) {
if (context_init[i].term)
context_init[i].term(kctx);
}
}
struct kbase_context *kbase_create_context(struct kbase_device *kbdev, bool is_compat,
base_context_create_flags const flags,
unsigned long const api_version,
struct kbase_file *const kfile)
{
struct kbase_context *kctx;
unsigned int i = 0;
if (WARN_ON(!kbdev))
return NULL;
/* Validate flags */
if (WARN_ON(flags != (flags & BASEP_CONTEXT_CREATE_KERNEL_FLAGS)))
return NULL;
/* Zero-initialized, as a lot of code assumes the context is zeroed on creation */
kctx = vzalloc(sizeof(*kctx));
if (WARN_ON(!kctx))
return NULL;
kctx->kbdev = kbdev;
kctx->api_version = api_version;
kctx->kfile = kfile;
kctx->create_flags = flags;
memcpy(kctx->comm, current->comm, sizeof(current->comm));
if (is_compat)
kbase_ctx_flag_set(kctx, KCTX_COMPAT);
#if defined(CONFIG_64BIT)
else
kbase_ctx_flag_set(kctx, KCTX_FORCE_SAME_VA);
#endif /* defined(CONFIG_64BIT) */
for (i = 0; i < ARRAY_SIZE(context_init); i++) {
int err = 0;
if (context_init[i].init)
err = context_init[i].init(kctx);
if (err) {
dev_err(kbdev->dev, "%s error = %d\n", context_init[i].err_mes, err);
/* kctx should be freed by kbase_context_free().
* Otherwise it will result in memory leak.
*/
WARN_ON(i == 0);
kbase_context_term_partial(kctx, i);
return NULL;
}
}
return kctx;
}
KBASE_EXPORT_SYMBOL(kbase_create_context);
void kbase_destroy_context(struct kbase_context *kctx)
{
struct kbase_device *kbdev;
if (WARN_ON(!kctx))
return;
kbdev = kctx->kbdev;
if (WARN_ON(!kbdev))
return;
/* Context termination can happen while the system suspend of the GPU
* device is in progress or has completed. Hangs have been observed in
* the field when context termination is not blocked until the GPU
* device has resumed.
*/
while (kbase_pm_context_active_handle_suspend(kbdev,
KBASE_PM_SUSPEND_HANDLER_DONT_INCREASE)) {
dev_info(kbdev->dev, "Suspend in progress when destroying context");
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_suspending(kbdev));
}
/* Have synchronized against the System suspend and incremented the
* pm.active_count. So any subsequent invocation of System suspend
* callback would get blocked.
* If System suspend callback was already in progress then the above loop
* would have waited till the System resume callback has begun.
* So wait for the System resume callback to also complete as we want to
* avoid context termination during System resume also.
*/
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_resuming(kbdev));
kbase_mem_pool_group_mark_dying(&kctx->mem_pools);
kbase_context_term_partial(kctx, ARRAY_SIZE(context_init));
kbase_pm_context_idle(kbdev);
}
KBASE_EXPORT_SYMBOL(kbase_destroy_context);


@@ -0,0 +1,269 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel context APIs for Job Manager GPUs
*/
#include <context/mali_kbase_context_internal.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase.h>
#include <mali_kbase_ctx_sched.h>
#include <mali_kbase_kinstr_jm.h>
#include <mali_kbase_mem_linux.h>
#include <mali_kbase_mem_pool_group.h>
#include <mmu/mali_kbase_mmu.h>
#include <tl/mali_kbase_timeline.h>
#if IS_ENABLED(CONFIG_DEBUG_FS)
#include <mali_kbase_debug_mem_view.h>
#include <mali_kbase_debug_mem_zones.h>
#include <mali_kbase_debug_mem_allocs.h>
#include <mali_kbase_mem_pool_debugfs.h>
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
kbase_debug_mem_view_init(kctx);
kbase_debug_mem_zones_init(kctx);
kbase_debug_mem_allocs_init(kctx);
kbase_mem_pool_debugfs_init(kctx->kctx_dentry, kctx);
kbase_jit_debugfs_init(kctx);
kbasep_jd_debugfs_ctx_init(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
debugfs_remove_recursive(kctx->kctx_dentry);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#else
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#endif /* CONFIG_DEBUG_FS */
static int kbase_context_kbase_kinstr_jm_init(struct kbase_context *kctx)
{
return kbase_kinstr_jm_init(&kctx->kinstr_jm);
}
static void kbase_context_kbase_kinstr_jm_term(struct kbase_context *kctx)
{
kbase_kinstr_jm_term(kctx->kinstr_jm);
}
static int kbase_context_kbase_timer_setup(struct kbase_context *kctx)
{
kbase_timer_setup(&kctx->soft_job_timeout, kbasep_soft_job_timeout_worker);
return 0;
}
static int kbase_context_submit_check(struct kbase_context *kctx)
{
struct kbasep_js_kctx_info *js_kctx_info = &kctx->jctx.sched_info;
unsigned long irq_flags = 0;
base_context_create_flags const flags = kctx->create_flags;
mutex_lock(&js_kctx_info->ctx.jsctx_mutex);
spin_lock_irqsave(&kctx->kbdev->hwaccess_lock, irq_flags);
/* Translate the flags */
if ((flags & BASE_CONTEXT_SYSTEM_MONITOR_SUBMIT_DISABLED) == 0)
kbase_ctx_flag_clear(kctx, KCTX_SUBMIT_DISABLED);
spin_unlock_irqrestore(&kctx->kbdev->hwaccess_lock, irq_flags);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
return 0;
}
static void kbase_context_flush_jobs(struct kbase_context *kctx)
{
kbase_jd_zap_context(kctx);
flush_workqueue(kctx->jctx.job_done_wq);
}
/**
* kbase_context_free - Free kcontext at its destruction
*
* @kctx: kcontext to be freed
*/
static void kbase_context_free(struct kbase_context *kctx)
{
kbase_timeline_post_kbase_context_destroy(kctx);
vfree(kctx);
}
static const struct kbase_context_init context_init[] = {
{ NULL, kbase_context_free, NULL },
{ kbase_context_common_init, kbase_context_common_term,
"Common context initialization failed" },
{ kbase_context_mem_pool_group_init, kbase_context_mem_pool_group_term,
"Memory pool group initialization failed" },
{ kbase_mem_evictable_init, kbase_mem_evictable_deinit,
"Memory evictable initialization failed" },
{ kbase_ctx_sched_init_ctx, NULL, NULL },
{ kbase_context_mmu_init, kbase_context_mmu_term, "MMU initialization failed" },
{ kbase_context_mem_alloc_page, kbase_context_mem_pool_free, "Memory alloc page failed" },
{ kbase_region_tracker_init, kbase_region_tracker_term,
"Region tracker initialization failed" },
{ kbase_sticky_resource_init, kbase_context_sticky_resource_term,
"Sticky resource initialization failed" },
{ kbase_jit_init, kbase_jit_term, "JIT initialization failed" },
{ kbase_context_kbase_kinstr_jm_init, kbase_context_kbase_kinstr_jm_term,
"JM instrumentation initialization failed" },
{ kbase_context_kbase_timer_setup, NULL, "Timers initialization failed" },
{ kbase_event_init, kbase_event_cleanup, "Event initialization failed" },
{ kbasep_js_kctx_init, kbasep_js_kctx_term, "JS kctx initialization failed" },
{ kbase_jd_init, kbase_jd_exit, "JD initialization failed" },
{ kbase_context_submit_check, NULL, "Enabling job submission failed" },
#if IS_ENABLED(CONFIG_DEBUG_FS)
{ kbase_debug_job_fault_context_init, kbase_debug_job_fault_context_term,
"Job fault context initialization failed" },
#endif
{ kbasep_platform_context_init, kbasep_platform_context_term,
"Platform callback for kctx initialization failed" },
{ NULL, kbase_context_flush_jobs, NULL },
{ kbase_context_add_to_dev_list, kbase_context_remove_from_dev_list,
"Adding kctx to device failed" },
};
static void kbase_context_term_partial(struct kbase_context *kctx, unsigned int i)
{
while (i-- > 0) {
if (context_init[i].term)
context_init[i].term(kctx);
}
}
struct kbase_context *kbase_create_context(struct kbase_device *kbdev, bool is_compat,
base_context_create_flags const flags,
unsigned long const api_version,
struct kbase_file *const kfile)
{
struct kbase_context *kctx;
unsigned int i = 0;
if (WARN_ON(!kbdev))
return NULL;
/* Validate flags */
if (WARN_ON(flags != (flags & BASEP_CONTEXT_CREATE_KERNEL_FLAGS)))
return NULL;
/* Zero-initialised, as a lot of code assumes the context is zeroed on creation. */
kctx = vzalloc(sizeof(*kctx));
if (WARN_ON(!kctx))
return NULL;
kctx->kbdev = kbdev;
kctx->api_version = api_version;
kctx->kfile = kfile;
kctx->create_flags = flags;
if (is_compat)
kbase_ctx_flag_set(kctx, KCTX_COMPAT);
#if defined(CONFIG_64BIT)
else
kbase_ctx_flag_set(kctx, KCTX_FORCE_SAME_VA);
#endif /* defined(CONFIG_64BIT) */
for (i = 0; i < ARRAY_SIZE(context_init); i++) {
int err = 0;
if (context_init[i].init)
err = context_init[i].init(kctx);
if (err) {
dev_err(kbdev->dev, "%s error = %d\n", context_init[i].err_mes, err);
/* kctx should be freed by kbase_context_free().
* Otherwise it will result in memory leak.
*/
WARN_ON(i == 0);
kbase_context_term_partial(kctx, i);
return NULL;
}
}
return kctx;
}
KBASE_EXPORT_SYMBOL(kbase_create_context);
void kbase_destroy_context(struct kbase_context *kctx)
{
struct kbase_device *kbdev;
if (WARN_ON(!kctx))
return;
kbdev = kctx->kbdev;
if (WARN_ON(!kbdev))
return;
/* Context termination can happen while the system suspend of the GPU
* device is in progress or has completed. Hangs have been observed in
* the field when context termination is not blocked until the GPU
* device has resumed.
*/
#ifdef CONFIG_MALI_ARBITER_SUPPORT
atomic_inc(&kbdev->pm.gpu_users_waiting);
#endif /* CONFIG_MALI_ARBITER_SUPPORT */
while (kbase_pm_context_active_handle_suspend(kbdev,
KBASE_PM_SUSPEND_HANDLER_DONT_INCREASE)) {
dev_dbg(kbdev->dev, "Suspend in progress when destroying context");
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_suspending(kbdev));
}
/* Have synchronized against the System suspend and incremented the
* pm.active_count. So any subsequent invocation of System suspend
* callback would get blocked.
* If System suspend callback was already in progress then the above loop
* would have waited till the System resume callback has begun.
* So wait for the System resume callback to also complete as we want to
* avoid context termination during System resume also.
*/
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_resuming(kbdev));
#ifdef CONFIG_MALI_ARBITER_SUPPORT
atomic_dec(&kbdev->pm.gpu_users_waiting);
#endif /* CONFIG_MALI_ARBITER_SUPPORT */
kbase_mem_pool_group_mark_dying(&kctx->mem_pools);
kbase_context_term_partial(kctx, ARRAY_SIZE(context_init));
kbase_pm_context_idle(kbdev);
}
KBASE_EXPORT_SYMBOL(kbase_destroy_context);
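
The context_init[] table above pairs each init step with its term step, and kbase_context_term_partial() unwinds the completed steps in reverse order. A minimal userspace sketch of that pattern follows. Every demo_* name and the fail_at field are hypothetical stand-ins for illustration, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX_STAGES 3

/* Hypothetical stand-in for struct kbase_context: records what ran. */
struct demo_ctx {
	int inited[DEMO_MAX_STAGES];
	int fail_at; /* index of the stage forced to fail, or -1 for none */
	int term_order[DEMO_MAX_STAGES];
	int term_count;
};

typedef int demo_init_method(struct demo_ctx *ctx);
typedef void demo_term_method(struct demo_ctx *ctx);

struct demo_init {
	demo_init_method *init;
	demo_term_method *term;
	const char *err_mes;
};

/* Generate one init/term pair per stage; term records the unwind order. */
#define DEMO_STAGE(n)                                          \
	static int demo_init_##n(struct demo_ctx *ctx)         \
	{                                                      \
		if (ctx->fail_at == (n))                       \
			return -1;                             \
		ctx->inited[n] = 1;                            \
		return 0;                                      \
	}                                                      \
	static void demo_term_##n(struct demo_ctx *ctx)        \
	{                                                      \
		assert(ctx->term_count < DEMO_MAX_STAGES);     \
		ctx->inited[n] = 0;                            \
		ctx->term_order[ctx->term_count++] = (n);      \
	}

DEMO_STAGE(0)
DEMO_STAGE(1)
DEMO_STAGE(2)

static const struct demo_init demo_table[] = {
	{ demo_init_0, demo_term_0, "stage 0 failed" },
	{ demo_init_1, demo_term_1, "stage 1 failed" },
	{ demo_init_2, demo_term_2, "stage 2 failed" },
};

#define DEMO_NUM_STAGES (sizeof(demo_table) / sizeof(demo_table[0]))

/* Terminate stages i-1 .. 0, i.e. only the stages that completed. */
static void demo_term_partial(struct demo_ctx *ctx, size_t i)
{
	while (i-- > 0) {
		if (demo_table[i].term)
			demo_table[i].term(ctx);
	}
}

/* Run every init in order; on the first failure, unwind and report. */
static int demo_create(struct demo_ctx *ctx)
{
	size_t i;

	for (i = 0; i < DEMO_NUM_STAGES; i++) {
		int err = 0;

		if (demo_table[i].init)
			err = demo_table[i].init(ctx);
		if (err) {
			fprintf(stderr, "%s error = %d\n", demo_table[i].err_mes, err);
			demo_term_partial(ctx, i);
			return err;
		}
	}
	return 0;
}
```

Calling demo_create() on a context with fail_at set to 2 runs stages 0 and 1, fails at stage 2, and then terminates stage 1 followed by stage 0. Full teardown is the same helper called with DEMO_NUM_STAGES, which is how kbase_destroy_context() uses ARRAY_SIZE(context_init).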


@@ -0,0 +1,370 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel context APIs
*/
#include <linux/version.h>
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
#include <linux/sched/task.h>
#endif
#if KERNEL_VERSION(4, 19, 0) <= LINUX_VERSION_CODE
#include <linux/sched/signal.h>
#else
#include <linux/sched.h>
#endif
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase_mem_linux.h>
#include <mali_kbase_ctx_sched.h>
#include <mali_kbase_mem_pool_group.h>
#include <tl/mali_kbase_timeline.h>
#include <mmu/mali_kbase_mmu.h>
#include <context/mali_kbase_context_internal.h>
/**
* find_process_node - Search the process rb_tree for a process with the
* given thread group ID.
*
* @node: Pointer to root node to start search.
* @tgid: Thread group PID to search for.
*
* Return: Pointer to the matching kbase_process if it exists, otherwise NULL.
*/
static struct kbase_process *find_process_node(struct rb_node *node, pid_t tgid)
{
struct kbase_process *kprcs = NULL;
/* Check if the kctx creation request is from an existing process. */
while (node) {
struct kbase_process *prcs_node = rb_entry(node, struct kbase_process, kprcs_node);
if (prcs_node->tgid == tgid) {
kprcs = prcs_node;
break;
}
if (tgid < prcs_node->tgid)
node = node->rb_left;
else
node = node->rb_right;
}
return kprcs;
}
/**
* kbase_insert_kctx_to_process - Initialise kbase process context.
*
* @kctx: Pointer to kbase context.
*
* Here we initialise the per-process rb_tree managed by kbase_device.
* We maintain an rb_tree with one node for each unique process,
* and each process maintains a list of its kbase contexts.
* This setup is currently used by the kernel trace functionality
* to trace and visualise GPU memory consumption.
*
* Return: 0 on success and error number on failure.
*/
static int kbase_insert_kctx_to_process(struct kbase_context *kctx)
{
struct rb_root *const prcs_root = &kctx->kbdev->process_root;
const pid_t tgid = kctx->tgid;
struct kbase_process *kprcs = NULL;
lockdep_assert_held(&kctx->kbdev->kctx_list_lock);
kprcs = find_process_node(prcs_root->rb_node, tgid);
/* If the kctx is from a new process, create a new kbase_process
* and add it to the rb_tree rooted at &kbase_device->process_root.
*/
if (!kprcs) {
struct rb_node **new = &prcs_root->rb_node, *parent = NULL;
kprcs = kzalloc(sizeof(*kprcs), GFP_KERNEL);
if (kprcs == NULL)
return -ENOMEM;
kprcs->tgid = tgid;
INIT_LIST_HEAD(&kprcs->kctx_list);
kprcs->dma_buf_root = RB_ROOT;
kprcs->total_gpu_pages = 0;
while (*new) {
struct kbase_process *prcs_node;
parent = *new;
prcs_node = rb_entry(parent, struct kbase_process, kprcs_node);
if (tgid < prcs_node->tgid)
new = &(*new)->rb_left;
else
new = &(*new)->rb_right;
}
rb_link_node(&kprcs->kprcs_node, parent, new);
rb_insert_color(&kprcs->kprcs_node, prcs_root);
}
kctx->kprcs = kprcs;
list_add(&kctx->kprcs_link, &kprcs->kctx_list);
return 0;
}
int kbase_context_common_init(struct kbase_context *kctx)
{
const unsigned long cookies_mask = KBASE_COOKIE_MASK;
int err = 0;
/* creating a context is considered a disjoint event */
kbase_disjoint_event(kctx->kbdev);
kctx->tgid = current->tgid;
kctx->pid = current->pid;
/* Check if this is a Userspace created context */
if (likely(kctx->kfile)) {
struct pid *pid_struct;
rcu_read_lock();
pid_struct = get_pid(task_tgid(current));
if (likely(pid_struct)) {
struct task_struct *task = pid_task(pid_struct, PIDTYPE_PID);
if (likely(task)) {
/* Take a reference on the task to avoid slow lookup
* later on from the page allocation loop.
*/
get_task_struct(task);
kctx->task = task;
} else {
dev_err(kctx->kbdev->dev, "Failed to get task pointer for %s/%d",
current->comm, current->pid);
err = -ESRCH;
}
put_pid(pid_struct);
} else {
dev_err(kctx->kbdev->dev, "Failed to get pid pointer for %s/%d",
current->comm, current->pid);
err = -ESRCH;
}
rcu_read_unlock();
if (unlikely(err))
return err;
kbase_mem_mmgrab();
kctx->process_mm = current->mm;
}
mutex_init(&kctx->reg_lock);
spin_lock_init(&kctx->mem_partials_lock);
INIT_LIST_HEAD(&kctx->mem_partials);
spin_lock_init(&kctx->waiting_soft_jobs_lock);
INIT_LIST_HEAD(&kctx->waiting_soft_jobs);
kbase_gpu_vm_lock(kctx);
bitmap_copy(kctx->cookies, &cookies_mask, BITS_PER_LONG);
kbase_gpu_vm_unlock(kctx);
kctx->id = (u32)atomic_add_return(1, &(kctx->kbdev->ctx_num)) - 1;
mutex_lock(&kctx->kbdev->kctx_list_lock);
err = kbase_insert_kctx_to_process(kctx);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
if (err) {
dev_err(kctx->kbdev->dev, "(err:%d) failed to insert kctx to kbase_process", err);
if (likely(kctx->kfile)) {
mmdrop(kctx->process_mm);
put_task_struct(kctx->task);
}
}
return err;
}
int kbase_context_add_to_dev_list(struct kbase_context *kctx)
{
if (WARN_ON(!kctx))
return -EINVAL;
if (WARN_ON(!kctx->kbdev))
return -EINVAL;
mutex_lock(&kctx->kbdev->kctx_list_lock);
list_add(&kctx->kctx_list_link, &kctx->kbdev->kctx_list);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
kbase_timeline_post_kbase_context_create(kctx);
return 0;
}
void kbase_context_remove_from_dev_list(struct kbase_context *kctx)
{
if (WARN_ON(!kctx))
return;
if (WARN_ON(!kctx->kbdev))
return;
kbase_timeline_pre_kbase_context_destroy(kctx);
mutex_lock(&kctx->kbdev->kctx_list_lock);
list_del_init(&kctx->kctx_list_link);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
}
/**
* kbase_remove_kctx_from_process - remove a terminating context from
* the process list.
*
* @kctx: Pointer to kbase context.
*
* Remove the context from the list of contexts maintained under its
* kbase process. If the list is then empty, there are no outstanding
* contexts and the process node is removed as well.
*/
static void kbase_remove_kctx_from_process(struct kbase_context *kctx)
{
struct kbase_process *kprcs = kctx->kprcs;
lockdep_assert_held(&kctx->kbdev->kctx_list_lock);
list_del(&kctx->kprcs_link);
/* if there are no outstanding contexts in current process node,
* we can remove it from the process rb_tree.
*/
if (list_empty(&kprcs->kctx_list)) {
rb_erase(&kprcs->kprcs_node, &kctx->kbdev->process_root);
/* Check that the terminating process no longer holds
* any GPU memory.
*/
spin_lock(&kctx->kbdev->gpu_mem_usage_lock);
WARN_ON(kprcs->total_gpu_pages);
spin_unlock(&kctx->kbdev->gpu_mem_usage_lock);
WARN_ON(!RB_EMPTY_ROOT(&kprcs->dma_buf_root));
kfree(kprcs);
}
}
void kbase_context_common_term(struct kbase_context *kctx)
{
int pages;
pages = atomic_read(&kctx->used_pages);
if (pages != 0)
dev_warn(kctx->kbdev->dev, "%s: %d pages in use!\n", __func__, pages);
WARN_ON(atomic_read(&kctx->nonmapped_pages) != 0);
mutex_lock(&kctx->kbdev->kctx_list_lock);
kbase_remove_kctx_from_process(kctx);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
if (likely(kctx->kfile)) {
mmdrop(kctx->process_mm);
put_task_struct(kctx->task);
}
KBASE_KTRACE_ADD(kctx->kbdev, CORE_CTX_DESTROY, kctx, 0u);
}
int kbase_context_mem_pool_group_init(struct kbase_context *kctx)
{
return kbase_mem_pool_group_init(&kctx->mem_pools, kctx->kbdev,
&kctx->kbdev->mem_pool_defaults, &kctx->kbdev->mem_pools);
}
void kbase_context_mem_pool_group_term(struct kbase_context *kctx)
{
kbase_mem_pool_group_term(&kctx->mem_pools);
}
int kbase_context_mmu_init(struct kbase_context *kctx)
{
return kbase_mmu_init(kctx->kbdev, &kctx->mmu, kctx,
kbase_context_mmu_group_id_get(kctx->create_flags));
}
void kbase_context_mmu_term(struct kbase_context *kctx)
{
kbase_mmu_term(kctx->kbdev, &kctx->mmu);
}
int kbase_context_mem_alloc_page(struct kbase_context *kctx)
{
struct page *p;
p = kbase_mem_alloc_page(&kctx->mem_pools.small[KBASE_MEM_GROUP_SINK], false);
if (!p)
return -ENOMEM;
kctx->aliasing_sink_page = as_tagged(page_to_phys(p));
return 0;
}
void kbase_context_mem_pool_free(struct kbase_context *kctx)
{
/* drop the aliasing sink page now that it can't be mapped anymore */
kbase_mem_pool_free(&kctx->mem_pools.small[KBASE_MEM_GROUP_SINK],
as_page(kctx->aliasing_sink_page), false);
}
void kbase_context_sticky_resource_term(struct kbase_context *kctx)
{
unsigned long pending_regions_to_clean;
kbase_gpu_vm_lock(kctx);
kbase_sticky_resource_term(kctx);
/* free pending region setups */
pending_regions_to_clean = KBASE_COOKIE_MASK;
bitmap_andnot(&pending_regions_to_clean, &pending_regions_to_clean, kctx->cookies,
BITS_PER_LONG);
while (pending_regions_to_clean) {
unsigned int cookie = find_first_bit(&pending_regions_to_clean, BITS_PER_LONG);
if (!WARN_ON(!kctx->pending_regions[cookie])) {
dev_dbg(kctx->kbdev->dev, "Freeing pending unmapped region\n");
kbase_mem_phy_alloc_put(kctx->pending_regions[cookie]->cpu_alloc);
kbase_mem_phy_alloc_put(kctx->pending_regions[cookie]->gpu_alloc);
kfree(kctx->pending_regions[cookie]);
kctx->pending_regions[cookie] = NULL;
}
bitmap_clear(&pending_regions_to_clean, cookie, 1);
}
kbase_gpu_vm_unlock(kctx);
}
bool kbase_ctx_compat_mode(struct kbase_context *kctx)
{
return !IS_ENABLED(CONFIG_64BIT) ||
(IS_ENABLED(CONFIG_64BIT) && kbase_ctx_flag(kctx, KCTX_COMPAT));
}
KBASE_EXPORT_TEST_API(kbase_ctx_compat_mode);
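
find_process_node() and kbase_insert_kctx_to_process() above look a process up by tgid and, when it is missing, walk down to the empty link where a new node belongs. The sketch below shows the same find-or-insert descent in userspace. It assumes a plain unbalanced binary search tree in place of the kernel rb_tree (so there is no rebalancing step equivalent to rb_insert_color()) and a simple counter in place of the per-process context list. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kbase_process. */
struct demo_process {
	int tgid;
	unsigned int nr_contexts; /* stands in for the kctx_list length */
	struct demo_process *left, *right;
};

/* Lookup only: mirrors the loop in find_process_node(). */
static struct demo_process *demo_find(struct demo_process *node, int tgid)
{
	while (node) {
		if (tgid == node->tgid)
			return node;
		node = (tgid < node->tgid) ? node->left : node->right;
	}
	return NULL;
}

/*
 * Find the process for tgid, or create it at the empty link reached by
 * the descent. Either way, account for one more context on that process.
 */
static struct demo_process *demo_get_or_insert(struct demo_process **root, int tgid)
{
	struct demo_process **link = root;
	struct demo_process *proc;

	assert(root != NULL);

	while (*link) {
		if (tgid == (*link)->tgid) {
			(*link)->nr_contexts++;
			return *link;
		}
		link = (tgid < (*link)->tgid) ? &(*link)->left : &(*link)->right;
	}

	proc = calloc(1, sizeof(*proc));
	if (!proc)
		return NULL; /* the driver returns -ENOMEM at this point */

	proc->tgid = tgid;
	proc->nr_contexts = 1;
	*link = proc;
	return proc;
}
```

Two calls with the same tgid return the same node with nr_contexts equal to 2, which corresponds to two kbase contexts being linked under one kbase_process. The driver performs this work with kctx_list_lock held. The sketch omits locking and node removal.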


@@ -0,0 +1,134 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CONTEXT_H_
#define _KBASE_CONTEXT_H_
#include <linux/atomic.h>
/**
* kbase_context_debugfs_init - Initialize the kctx platform
* specific debugfs
*
* @kctx: kbase context
*
* This initializes some debugfs interfaces specific to the platform the source
* is compiled for.
*/
void kbase_context_debugfs_init(struct kbase_context *const kctx);
/**
* kbase_context_debugfs_term - Terminate the kctx platform
* specific debugfs
*
* @kctx: kbase context
*
* This terminates some debugfs interfaces specific to the platform the source
* is compiled for.
*/
void kbase_context_debugfs_term(struct kbase_context *const kctx);
/**
* kbase_create_context() - Create a kernel base context.
*
* @kbdev: Object representing an instance of GPU platform device,
* allocated from the probe method of the Mali driver.
* @is_compat: Force creation of a 32-bit context
* @flags: Flags to set, which shall be any combination of
* BASEP_CONTEXT_CREATE_KERNEL_FLAGS.
* @api_version: Application program interface version, as encoded in
* a single integer by the KBASE_API_VERSION macro.
* @kfile: Pointer to the object representing the /dev/malixx device
* file instance. Shall be passed as NULL for internally created
* contexts.
*
* Up to one context can be created for each client that opens the device file
* /dev/malixx. Context creation is deferred until a special ioctl() system call
* is made on the device file. Each context has its own GPU address space.
*
* Return: new kbase context or NULL on failure
*/
struct kbase_context *kbase_create_context(struct kbase_device *kbdev, bool is_compat,
base_context_create_flags const flags,
unsigned long api_version,
struct kbase_file *const kfile);
/**
* kbase_destroy_context - Destroy a kernel base context.
* @kctx: Context to destroy
*
* Will release all outstanding regions.
*/
void kbase_destroy_context(struct kbase_context *kctx);
/**
* kbase_ctx_flag - Check if @flag is set on @kctx
* @kctx: Pointer to kbase context to check
* @flag: Flag to check
*
* Return: true if @flag is set on @kctx, false if not.
*/
static inline bool kbase_ctx_flag(struct kbase_context *kctx, enum kbase_context_flags flag)
{
return atomic_read(&kctx->flags) & (int)flag;
}
/**
* kbase_ctx_compat_mode - Indicate whether a kbase context needs to operate
* in compatibility mode for 32-bit userspace.
* @kctx: kbase context
*
* Return: true if the context must operate in compatibility mode, false otherwise.
*/
bool kbase_ctx_compat_mode(struct kbase_context *kctx);
/**
* kbase_ctx_flag_clear - Clear @flag on @kctx
* @kctx: Pointer to kbase context
* @flag: Flag to clear
*
* Clear the @flag on @kctx. This is done atomically, so other flags being
* cleared or set at the same time will be safe.
*
* Some flags have locking requirements, check the documentation for the
* respective flags.
*/
static inline void kbase_ctx_flag_clear(struct kbase_context *kctx, enum kbase_context_flags flag)
{
atomic_andnot(flag, &kctx->flags);
}
/**
* kbase_ctx_flag_set - Set @flag on @kctx
* @kctx: Pointer to kbase context
* @flag: Flag to set
*
* Set the @flag on @kctx. This is done atomically, so other flags being
* cleared or set at the same time will be safe.
*
* Some flags have locking requirements, check the documentation for the
* respective flags.
*/
static inline void kbase_ctx_flag_set(struct kbase_context *kctx, enum kbase_context_flags flag)
{
atomic_or(flag, &kctx->flags);
}
#endif /* _KBASE_CONTEXT_H_ */
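
The three flag helpers in this header keep all context flags in one atomic word, so setting or clearing one flag cannot lose a concurrent update to another. A minimal userspace sketch of the same idea follows, assuming C11 <stdatomic.h> in place of the kernel atomic_t API. The demo_* names and flag values are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; the driver defines its own enum kbase_context_flags. */
enum demo_ctx_flags {
	DEMO_COMPAT = 1u << 0,
	DEMO_SUBMIT_DISABLED = 1u << 1,
	DEMO_FORCE_SAME_VA = 1u << 2,
};

struct demo_flag_ctx {
	atomic_uint flags;
};

/* Test a flag with a single atomic load. */
static inline bool demo_flag(struct demo_flag_ctx *ctx, enum demo_ctx_flags flag)
{
	return (atomic_load(&ctx->flags) & (unsigned int)flag) != 0;
}

/* Set a flag with an atomic OR; other bits are left untouched. */
static inline void demo_flag_set(struct demo_flag_ctx *ctx, enum demo_ctx_flags flag)
{
	assert(flag != 0);
	atomic_fetch_or(&ctx->flags, (unsigned int)flag);
}

/* Clear a flag with an atomic AND of the inverted mask. */
static inline void demo_flag_clear(struct demo_flag_ctx *ctx, enum demo_ctx_flags flag)
{
	assert(flag != 0);
	atomic_fetch_and(&ctx->flags, ~(unsigned int)flag);
}
```

Each helper is a single read-modify-write on the whole word, which is why the kerneldoc above can say that flags being set or cleared at the same time are safe. It does not order the flag change against other data, so flags with extra locking requirements still need the locks their documentation names.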


@@ -0,0 +1,62 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
typedef int kbase_context_init_method(struct kbase_context *kctx);
typedef void kbase_context_term_method(struct kbase_context *kctx);
/**
* struct kbase_context_init - Context init/term methods.
* @init: Function pointer to an initialise method.
* @term: Function pointer to a terminate method.
* @err_mes: Error message to be printed when init method fails.
*/
struct kbase_context_init {
kbase_context_init_method *init;
kbase_context_term_method *term;
char *err_mes;
};
/**
* kbase_context_common_init() - Initialize kbase context
* @kctx: Pointer to the kbase context to be initialized.
*
* This function must be called only when a kbase context is instantiated.
*
* Return: 0 on success.
*/
int kbase_context_common_init(struct kbase_context *kctx);
void kbase_context_common_term(struct kbase_context *kctx);
int kbase_context_mem_pool_group_init(struct kbase_context *kctx);
void kbase_context_mem_pool_group_term(struct kbase_context *kctx);
int kbase_context_mmu_init(struct kbase_context *kctx);
void kbase_context_mmu_term(struct kbase_context *kctx);
int kbase_context_mem_alloc_page(struct kbase_context *kctx);
void kbase_context_mem_pool_free(struct kbase_context *kctx);
void kbase_context_sticky_resource_term(struct kbase_context *kctx);
int kbase_context_add_to_dev_list(struct kbase_context *kctx);
void kbase_context_remove_from_dev_list(struct kbase_context *kctx);


@@ -0,0 +1,63 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2018-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
mali_kbase-y += \
csf/mali_kbase_csf_util.o \
csf/mali_kbase_csf_firmware_cfg.o \
csf/mali_kbase_csf_trace_buffer.o \
csf/mali_kbase_csf.o \
csf/mali_kbase_csf_scheduler.o \
csf/mali_kbase_csf_kcpu.o \
csf/mali_kbase_csf_tiler_heap.o \
csf/mali_kbase_csf_timeout.o \
csf/mali_kbase_csf_tl_reader.o \
csf/mali_kbase_csf_heap_context_alloc.o \
csf/mali_kbase_csf_reset_gpu.o \
csf/mali_kbase_csf_csg.o \
csf/mali_kbase_csf_csg_debugfs.o \
csf/mali_kbase_csf_kcpu_debugfs.o \
csf/mali_kbase_csf_sync.o \
csf/mali_kbase_csf_sync_debugfs.o \
csf/mali_kbase_csf_kcpu_fence_debugfs.o \
csf/mali_kbase_csf_protected_memory.o \
csf/mali_kbase_csf_tiler_heap_debugfs.o \
csf/mali_kbase_csf_cpu_queue.o \
csf/mali_kbase_csf_cpu_queue_debugfs.o \
csf/mali_kbase_csf_event.o \
csf/mali_kbase_csf_firmware_log.o \
csf/mali_kbase_csf_firmware_core_dump.o \
csf/mali_kbase_csf_tiler_heap_reclaim.o \
csf/mali_kbase_csf_mcu_shared_reg.o
ifeq ($(CONFIG_MALI_NO_MALI),y)
mali_kbase-y += csf/mali_kbase_csf_firmware_no_mali.o
else
mali_kbase-y += csf/mali_kbase_csf_firmware.o
endif
mali_kbase-$(CONFIG_DEBUG_FS) += csf/mali_kbase_debug_csf_fault.o
ifeq ($(KBUILD_EXTMOD),)
# in-tree
-include $(src)/csf/ipa_control/Kbuild
else
# out-of-tree
include $(src)/csf/ipa_control/Kbuild
endif


@@ -0,0 +1,22 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
mali_kbase-y += \
csf/ipa_control/mali_kbase_csf_ipa_control.o


@@ -0,0 +1,983 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2020-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include "backend/gpu/mali_kbase_clk_rate_trace_mgr.h"
#include "mali_kbase_csf_ipa_control.h"
/*
* Status flags from the STATUS register of the IPA Control interface.
*/
#define STATUS_COMMAND_ACTIVE ((u32)1 << 0)
#define STATUS_PROTECTED_MODE ((u32)1 << 8)
#define STATUS_RESET ((u32)1 << 9)
#define STATUS_TIMER_ENABLED ((u32)1 << 31)
/*
* Commands for the COMMAND register of the IPA Control interface.
*/
#define COMMAND_APPLY ((u32)1)
#define COMMAND_SAMPLE ((u32)3)
#define COMMAND_PROTECTED_ACK ((u32)4)
#define COMMAND_RESET_ACK ((u32)5)
/*
* Number of timer events per second.
*/
#define TIMER_EVENTS_PER_SECOND ((u32)1000 / IPA_CONTROL_TIMER_DEFAULT_VALUE_MS)
/*
* Number of bits used to configure a performance counter in SELECT registers.
*/
#define IPA_CONTROL_SELECT_BITS_PER_CNT ((u64)8)
/*
* Maximum value of a performance counter.
*/
#define MAX_PRFCNT_VALUE (((u64)1 << 48) - 1)
/**
* struct kbase_ipa_control_listener_data - Data for the GPU clock frequency
* listener
*
* @listener: GPU clock frequency listener.
* @kbdev: Pointer to kbase device.
* @clk_chg_wq: Dedicated workqueue to process the work item corresponding to
* a clock rate notification.
* @clk_chg_work: Work item to process the clock rate change
* @rate: The latest notified rate change, in Hz
*/
struct kbase_ipa_control_listener_data {
struct kbase_clk_rate_listener listener;
struct kbase_device *kbdev;
struct workqueue_struct *clk_chg_wq;
struct work_struct clk_chg_work;
atomic_t rate;
};
static u32 timer_value(u32 gpu_rate)
{
return gpu_rate / TIMER_EVENTS_PER_SECOND;
}
static int wait_status(struct kbase_device *kbdev, u32 flags)
{
u32 val;
const u32 timeout_us = kbase_get_timeout_ms(kbdev, IPA_INACTIVE_TIMEOUT) * USEC_PER_MSEC;
/*
* Wait for the STATUS register to indicate that flags have been
* cleared, in case a transition is pending.
*/
const int err = kbase_reg_poll32_timeout(kbdev, IPA_CONTROL_ENUM(STATUS), val,
!(val & flags), 0, timeout_us, false);
if (err) {
dev_err(kbdev->dev, "IPA_CONTROL STATUS register stuck");
return -EBUSY;
}
return 0;
}
static int apply_select_config(struct kbase_device *kbdev, u64 *select)
{
int ret;
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_CSHW), select[KBASE_IPA_CORE_TYPE_CSHW]);
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_MEMSYS),
select[KBASE_IPA_CORE_TYPE_MEMSYS]);
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_TILER), select[KBASE_IPA_CORE_TYPE_TILER]);
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_SHADER),
select[KBASE_IPA_CORE_TYPE_SHADER]);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
if (!ret) {
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_APPLY);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
} else {
dev_err(kbdev->dev, "Wait for the pending command failed");
}
return ret;
}
static u64 read_value_cnt(struct kbase_device *kbdev, u8 type, u8 select_idx)
{
switch (type) {
case KBASE_IPA_CORE_TYPE_CSHW:
return kbase_reg_read64(kbdev, IPA_VALUE_CSHW_OFFSET(select_idx));
case KBASE_IPA_CORE_TYPE_MEMSYS:
return kbase_reg_read64(kbdev, IPA_VALUE_MEMSYS_OFFSET(select_idx));
case KBASE_IPA_CORE_TYPE_TILER:
return kbase_reg_read64(kbdev, IPA_VALUE_TILER_OFFSET(select_idx));
case KBASE_IPA_CORE_TYPE_SHADER:
return kbase_reg_read64(kbdev, IPA_VALUE_SHADER_OFFSET(select_idx));
default:
WARN(1, "Unknown core type: %u\n", type);
return 0;
}
}
static void build_select_config(struct kbase_ipa_control *ipa_ctrl, u64 *select_config)
{
size_t i;
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++) {
size_t j;
select_config[i] = 0ULL;
for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config =
&ipa_ctrl->blocks[i].select[j];
select_config[i] |=
((u64)prfcnt_config->idx << (IPA_CONTROL_SELECT_BITS_PER_CNT * j));
}
}
}
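/*
* Worked example of the packing done by build_select_config() (illustrative
* only; the counter indices are hypothetical): each slot j occupies
* IPA_CONTROL_SELECT_BITS_PER_CNT (8) bits starting at bit 8 * j. If slot 0
* holds counter index 0x10, slot 1 holds 0x22 and every other slot holds
* index 0, the resulting word is
*
*   (0x10 << 0) | (0x22 << 8) = 0x2210
*/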
static int update_select_registers(struct kbase_device *kbdev)
{
u64 select_config[KBASE_IPA_CORE_TYPE_NUM];
lockdep_assert_held(&kbdev->csf.ipa_control.lock);
build_select_config(&kbdev->csf.ipa_control, select_config);
return apply_select_config(kbdev, select_config);
}
static inline void calc_prfcnt_delta(struct kbase_device *kbdev,
struct kbase_ipa_control_prfcnt *prfcnt, bool gpu_ready)
{
u64 delta_value, raw_value;
if (gpu_ready)
raw_value = read_value_cnt(kbdev, (u8)prfcnt->type, prfcnt->select_idx);
else
raw_value = prfcnt->latest_raw_value;
if (raw_value < prfcnt->latest_raw_value) {
delta_value = (MAX_PRFCNT_VALUE - prfcnt->latest_raw_value) + raw_value;
} else {
delta_value = raw_value - prfcnt->latest_raw_value;
}
delta_value *= prfcnt->scaling_factor;
if (kbdev->csf.ipa_control.cur_gpu_rate == 0) {
static bool warned;
if (!warned) {
dev_warn(kbdev->dev, "%s: GPU freq is unexpectedly 0", __func__);
warned = true;
}
} else if (prfcnt->gpu_norm) {
delta_value = div_u64(delta_value, kbdev->csf.ipa_control.cur_gpu_rate);
}
prfcnt->latest_raw_value = raw_value;
/* Accumulate the difference */
prfcnt->accumulated_diff += delta_value;
}
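/*
* Worked example of the wrap-around branch in calc_prfcnt_delta()
* (illustrative values only): with MAX_PRFCNT_VALUE = 2^48 - 1, a previous
* raw value of MAX_PRFCNT_VALUE - 5 and a new raw value of 10, the counter is
* taken to have wrapped, and the wrap-around expression evaluates to
*
*   (MAX_PRFCNT_VALUE - (MAX_PRFCNT_VALUE - 5)) + 10 = 15
*
* before scaling and, for counters with gpu_norm set, division by the
* current GPU rate.
*/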
/**
* kbase_ipa_control_rate_change_notify - GPU frequency change callback
*
* @listener: Clock frequency change listener.
* @clk_index: Index of the clock for which the change has occurred.
* @clk_rate_hz: Clock frequency (Hz).
*
* This callback notifies kbase_ipa_control about GPU frequency changes.
* Only top-level clock changes are meaningful. GPU frequency updates
* affect all performance counters which require GPU normalization
* in every session.
*/
static void kbase_ipa_control_rate_change_notify(struct kbase_clk_rate_listener *listener,
u32 clk_index, u32 clk_rate_hz)
{
if ((clk_index == KBASE_CLOCK_DOMAIN_TOP) && (clk_rate_hz != 0)) {
struct kbase_ipa_control_listener_data *listener_data =
container_of(listener, struct kbase_ipa_control_listener_data, listener);
/* Save the rate and delegate the job to a work item */
atomic_set(&listener_data->rate, clk_rate_hz);
queue_work(listener_data->clk_chg_wq, &listener_data->clk_chg_work);
}
}
static void kbase_ipa_ctrl_rate_change_worker(struct work_struct *data)
{
struct kbase_ipa_control_listener_data *listener_data =
container_of(data, struct kbase_ipa_control_listener_data, clk_chg_work);
struct kbase_device *kbdev = listener_data->kbdev;
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
unsigned long flags;
u32 rate;
size_t i;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbdev->pm.backend.gpu_ready) {
dev_err(kbdev->dev, "%s: GPU frequency cannot change while GPU is off", __func__);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return;
}
spin_lock(&ipa_ctrl->lock);
/* Picking up the latest notified rate */
rate = (u32)atomic_read(&listener_data->rate);
for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
struct kbase_ipa_control_session *session = &ipa_ctrl->sessions[i];
if (session->active) {
size_t j;
for (j = 0; j < session->num_prfcnts; j++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[j];
if (prfcnt->gpu_norm)
calc_prfcnt_delta(kbdev, prfcnt, true);
}
}
}
ipa_ctrl->cur_gpu_rate = rate;
/* Update the timer for automatic sampling if active sessions
* are present. Counters have already been manually sampled.
*/
if (ipa_ctrl->num_active_sessions > 0)
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER), timer_value(rate));
spin_unlock(&ipa_ctrl->lock);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
void kbase_ipa_control_init(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
struct kbase_ipa_control_listener_data *listener_data;
size_t i;
unsigned long flags;
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++) {
ipa_ctrl->blocks[i].num_available_counters = KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS;
}
spin_lock_init(&ipa_ctrl->lock);
listener_data = kmalloc(sizeof(struct kbase_ipa_control_listener_data), GFP_KERNEL);
if (listener_data) {
listener_data->clk_chg_wq =
alloc_workqueue("ipa_ctrl_wq", WQ_HIGHPRI | WQ_UNBOUND, 1);
if (listener_data->clk_chg_wq) {
INIT_WORK(&listener_data->clk_chg_work, kbase_ipa_ctrl_rate_change_worker);
listener_data->listener.notify = kbase_ipa_control_rate_change_notify;
listener_data->kbdev = kbdev;
ipa_ctrl->rtm_listener_data = listener_data;
/* Initialise to 0, which lies outside the range of normally notified rates */
atomic_set(&listener_data->rate, 0);
} else {
dev_warn(kbdev->dev,
"%s: failed to allocate workqueue, clock rate update disabled",
__func__);
kfree(listener_data);
listener_data = NULL;
}
} else {
dev_warn(kbdev->dev,
"%s: failed to allocate memory, IPA control clock rate update disabled",
__func__);
}
spin_lock_irqsave(&clk_rtm->lock, flags);
if (clk_rtm->clks[KBASE_CLOCK_DOMAIN_TOP])
ipa_ctrl->cur_gpu_rate = clk_rtm->clks[KBASE_CLOCK_DOMAIN_TOP]->clock_val;
if (listener_data)
kbase_clk_rate_trace_manager_subscribe_no_lock(clk_rtm, &listener_data->listener);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_init);
void kbase_ipa_control_term(struct kbase_device *kbdev)
{
unsigned long flags;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
struct kbase_ipa_control_listener_data *listener_data = ipa_ctrl->rtm_listener_data;
WARN_ON(ipa_ctrl->num_active_sessions);
if (listener_data) {
kbase_clk_rate_trace_manager_unsubscribe(clk_rtm, &listener_data->listener);
destroy_workqueue(listener_data->clk_chg_wq);
}
kfree(ipa_ctrl->rtm_listener_data);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (kbdev->pm.backend.gpu_powered)
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER), 0);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_term);
/** session_read_raw_values - Read latest raw values for a session
* @kbdev: Pointer to kbase device.
* @session: Pointer to the session whose performance counters shall be read.
*
* Read and update the latest raw values of all the performance counters
* belonging to a given session.
*/
static void session_read_raw_values(struct kbase_device *kbdev,
struct kbase_ipa_control_session *session)
{
size_t i;
lockdep_assert_held(&kbdev->csf.ipa_control.lock);
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[i];
u64 raw_value = read_value_cnt(kbdev, (u8)prfcnt->type, prfcnt->select_idx);
prfcnt->latest_raw_value = raw_value;
}
}
/** session_gpu_start - Start one or all sessions
* @kbdev: Pointer to kbase device.
* @ipa_ctrl: Pointer to IPA_CONTROL descriptor.
* @session: Pointer to the session to initialize, or NULL to initialize
* all sessions.
*
* This function starts one or all sessions by capturing a manual sample,
* reading the latest raw value of performance counters and possibly enabling
* the timer for automatic sampling if necessary.
*
* If a single session is given, it is assumed to be active, regardless of
* the number of active sessions. The number of performance counters belonging
* to the session shall be set in advance.
*
* If no session is given, the function shall start all sessions.
* The function does nothing if there are no active sessions.
*
* Return: 0 on success, or error code on failure.
*/
static int session_gpu_start(struct kbase_device *kbdev, struct kbase_ipa_control *ipa_ctrl,
struct kbase_ipa_control_session *session)
{
bool first_start = (session != NULL) && (ipa_ctrl->num_active_sessions == 0);
int ret = 0;
lockdep_assert_held(&kbdev->csf.ipa_control.lock);
/*
* Exit immediately if the caller intends to start all sessions
* but there are no active sessions. It's important that no operation
* is done on the IPA_CONTROL interface in that case.
*/
if (!session && ipa_ctrl->num_active_sessions == 0)
return ret;
/*
* Take a manual sample unconditionally if the caller intends
* to start all sessions. Otherwise, only take a manual sample
* if this is the first session to be initialized, because the
* accumulator registers are empty and no timer has been configured
* for automatic sampling.
*/
if (!session || first_start) {
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_SAMPLE);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
if (ret)
dev_err(kbdev->dev, "%s: failed to sample new counters", __func__);
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER),
timer_value(ipa_ctrl->cur_gpu_rate));
}
/*
* Read the current raw value to start the session.
* This lets the first query return a correct value, computed
* as the difference from the beginning of the session. It
* applies regardless of the number of sessions the caller
* intends to start.
*/
if (!ret) {
if (session) {
/* When starting a session, an initial read is required to
* initialise the IPA power model's calculation.
*/
session_read_raw_values(kbdev, session);
} else {
size_t session_idx;
for (session_idx = 0; session_idx < KBASE_IPA_CONTROL_MAX_SESSIONS;
session_idx++) {
struct kbase_ipa_control_session *session_to_check =
&ipa_ctrl->sessions[session_idx];
if (session_to_check->active)
session_read_raw_values(kbdev, session_to_check);
}
}
}
return ret;
}
int kbase_ipa_control_register(struct kbase_device *kbdev,
const struct kbase_ipa_control_perf_counter *perf_counters,
size_t num_counters, void **client)
{
int ret = 0;
size_t i, session_idx, req_counters[KBASE_IPA_CORE_TYPE_NUM];
bool already_configured[KBASE_IPA_CONTROL_MAX_COUNTERS];
bool new_config = false;
struct kbase_ipa_control *ipa_ctrl;
struct kbase_ipa_control_session *session = NULL;
unsigned long flags;
if (WARN_ON(unlikely(kbdev == NULL)))
return -ENODEV;
if (WARN_ON(perf_counters == NULL) || WARN_ON(client == NULL) ||
WARN_ON(num_counters > KBASE_IPA_CONTROL_MAX_COUNTERS)) {
dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
return -EINVAL;
}
kbase_pm_context_active(kbdev);
ipa_ctrl = &kbdev->csf.ipa_control;
spin_lock_irqsave(&ipa_ctrl->lock, flags);
if (ipa_ctrl->num_active_sessions == KBASE_IPA_CONTROL_MAX_SESSIONS) {
dev_err(kbdev->dev, "%s: too many sessions", __func__);
ret = -EBUSY;
goto exit;
}
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++)
req_counters[i] = 0;
/*
* Count how many counters would need to be configured in order to
* satisfy the request. Requested counters which happen to be already
* configured can be skipped.
*/
for (i = 0; i < num_counters; i++) {
size_t j;
enum kbase_ipa_core_type type = perf_counters[i].type;
u8 idx = perf_counters[i].idx;
if ((type >= KBASE_IPA_CORE_TYPE_NUM) || (idx >= KBASE_IPA_CONTROL_CNT_MAX_IDX)) {
dev_err(kbdev->dev, "%s: invalid requested type %u and/or index %u",
__func__, type, idx);
ret = -EINVAL;
goto exit;
}
for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config =
&ipa_ctrl->blocks[type].select[j];
if (prfcnt_config->refcount > 0) {
if (prfcnt_config->idx == idx) {
already_configured[i] = true;
break;
}
}
}
if (j == KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS) {
already_configured[i] = false;
req_counters[type]++;
new_config = true;
}
}
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++)
if (req_counters[i] > ipa_ctrl->blocks[i].num_available_counters) {
dev_err(kbdev->dev,
"%s: more counters (%zu) than available (%zu) have been requested for type %zu",
__func__, req_counters[i],
ipa_ctrl->blocks[i].num_available_counters, i);
ret = -EINVAL;
goto exit;
}
/*
* The request has been validated.
* First, find an available session, then set up its initial state
* and update the configuration of performance counters
* in the internal state of kbase_ipa_control.
*/
for (session_idx = 0; session_idx < KBASE_IPA_CONTROL_MAX_SESSIONS; session_idx++) {
if (!ipa_ctrl->sessions[session_idx].active) {
session = &ipa_ctrl->sessions[session_idx];
break;
}
}
if (!session) {
dev_err(kbdev->dev, "%s: wrong or corrupt session state", __func__);
ret = -EBUSY;
goto exit;
}
for (i = 0; i < num_counters; i++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config;
size_t j;
u8 type = perf_counters[i].type;
u8 idx = perf_counters[i].idx;
for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
prfcnt_config = &ipa_ctrl->blocks[type].select[j];
if (already_configured[i]) {
if ((prfcnt_config->refcount > 0) && (prfcnt_config->idx == idx)) {
break;
}
} else {
if (prfcnt_config->refcount == 0)
break;
}
}
if (WARN_ON((prfcnt_config->refcount > 0 && prfcnt_config->idx != idx) ||
(j == KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS))) {
dev_err(kbdev->dev,
"%s: invalid internal state: counter already configured or no counter available to configure",
__func__);
ret = -EBUSY;
goto exit;
}
if (prfcnt_config->refcount == 0) {
prfcnt_config->idx = idx;
ipa_ctrl->blocks[type].num_available_counters--;
}
session->prfcnts[i].accumulated_diff = 0;
session->prfcnts[i].type = type;
session->prfcnts[i].select_idx = j;
session->prfcnts[i].scaling_factor = perf_counters[i].scaling_factor;
session->prfcnts[i].gpu_norm = perf_counters[i].gpu_norm;
/* Reports to this client for GPU time spent in protected mode
* should begin from the point of registration.
*/
session->last_query_time = ktime_get_raw_ns();
/* Initially, no time has been spent in protected mode */
session->protm_time = 0;
prfcnt_config->refcount++;
}
/*
* Apply new configuration, if necessary.
* As a temporary solution, make sure that the GPU is on
* before applying the new configuration.
*/
if (new_config) {
ret = update_select_registers(kbdev);
if (ret)
dev_err(kbdev->dev, "%s: failed to apply new SELECT configuration",
__func__);
}
if (!ret) {
session->num_prfcnts = num_counters;
ret = session_gpu_start(kbdev, ipa_ctrl, session);
}
if (!ret) {
session->active = true;
ipa_ctrl->num_active_sessions++;
*client = session;
}
exit:
spin_unlock_irqrestore(&ipa_ctrl->lock, flags);
kbase_pm_context_idle(kbdev);
return ret;
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_register);
int kbase_ipa_control_unregister(struct kbase_device *kbdev, const void *client)
{
struct kbase_ipa_control *ipa_ctrl;
struct kbase_ipa_control_session *session;
int ret = 0;
size_t i;
unsigned long flags;
bool new_config = false, valid_session = false;
if (WARN_ON(unlikely(kbdev == NULL)))
return -ENODEV;
if (WARN_ON(client == NULL)) {
dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
return -EINVAL;
}
kbase_pm_context_active(kbdev);
ipa_ctrl = &kbdev->csf.ipa_control;
session = (struct kbase_ipa_control_session *)client;
spin_lock_irqsave(&ipa_ctrl->lock, flags);
for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
if (session == &ipa_ctrl->sessions[i]) {
valid_session = true;
break;
}
}
if (!valid_session) {
dev_err(kbdev->dev, "%s: invalid session handle", __func__);
ret = -EINVAL;
goto exit;
}
if (ipa_ctrl->num_active_sessions == 0) {
dev_err(kbdev->dev, "%s: no active sessions found", __func__);
ret = -EINVAL;
goto exit;
}
if (!session->active) {
dev_err(kbdev->dev, "%s: session is already inactive", __func__);
ret = -EINVAL;
goto exit;
}
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config;
u8 type = session->prfcnts[i].type;
u8 idx = session->prfcnts[i].select_idx;
prfcnt_config = &ipa_ctrl->blocks[type].select[idx];
if (!WARN_ON(prfcnt_config->refcount == 0)) {
prfcnt_config->refcount--;
if (prfcnt_config->refcount == 0) {
new_config = true;
ipa_ctrl->blocks[type].num_available_counters++;
}
}
}
if (new_config) {
ret = update_select_registers(kbdev);
if (ret)
dev_err(kbdev->dev, "%s: failed to apply SELECT configuration", __func__);
}
session->num_prfcnts = 0;
session->active = false;
ipa_ctrl->num_active_sessions--;
exit:
spin_unlock_irqrestore(&ipa_ctrl->lock, flags);
kbase_pm_context_idle(kbdev);
return ret;
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_unregister);
int kbase_ipa_control_query(struct kbase_device *kbdev, const void *client, u64 *values,
size_t num_values, u64 *protected_time)
{
struct kbase_ipa_control *ipa_ctrl;
struct kbase_ipa_control_session *session;
size_t i;
unsigned long flags;
bool gpu_ready;
if (WARN_ON(unlikely(kbdev == NULL)))
return -ENODEV;
if (WARN_ON(client == NULL) || WARN_ON(values == NULL)) {
dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
return -EINVAL;
}
ipa_ctrl = &kbdev->csf.ipa_control;
session = (struct kbase_ipa_control_session *)client;
if (!session->active) {
dev_err(kbdev->dev, "%s: attempt to query inactive session", __func__);
return -EINVAL;
}
if (WARN_ON(num_values < session->num_prfcnts)) {
dev_err(kbdev->dev, "%s: not enough space (%zu) to return all counter values (%zu)",
__func__, num_values, session->num_prfcnts);
return -EINVAL;
}
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
gpu_ready = kbdev->pm.backend.gpu_ready;
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[i];
calc_prfcnt_delta(kbdev, prfcnt, gpu_ready);
/* Return all the accumulated difference */
values[i] = prfcnt->accumulated_diff;
prfcnt->accumulated_diff = 0;
}
if (protected_time) {
u64 time_now = ktime_get_raw_ns();
/* This is the amount of protected-mode time spent prior to
* the current protm period.
*/
*protected_time = session->protm_time;
if (kbdev->protected_mode) {
*protected_time +=
time_now - MAX(session->last_query_time, ipa_ctrl->protm_start);
}
session->last_query_time = time_now;
session->protm_time = 0;
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
for (i = session->num_prfcnts; i < num_values; i++)
values[i] = 0;
return 0;
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_query);
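/*
* Worked example of the protected-time arithmetic in kbase_ipa_control_query()
* (illustrative timestamps in ns, all hypothetical): suppose
* session->protm_time is 30, session->last_query_time is 100,
* ipa_ctrl->protm_start is 150 and the GPU is still in protected mode when
* the query runs at time_now = 200. The value reported is
*
*   30 + (200 - MAX(100, 150)) = 30 + 50 = 80
*
* after which last_query_time becomes 200 and protm_time is reset to 0.
*/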
void kbase_ipa_control_handle_gpu_power_off(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
size_t session_idx;
int ret;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* GPU should still be ready for use when this function gets called */
WARN_ON(!kbdev->pm.backend.gpu_ready);
/* Interrupts are already disabled and interrupt state is also saved */
spin_lock(&ipa_ctrl->lock);
/* First disable the automatic sampling through TIMER */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER), 0);
ret = wait_status(kbdev, STATUS_TIMER_ENABLED);
if (ret) {
dev_err(kbdev->dev, "Wait for disabling of IPA control timer failed: %d", ret);
}
/* Now issue the manual SAMPLE command */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_SAMPLE);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
if (ret) {
dev_err(kbdev->dev, "Wait for the completion of manual sample failed: %d", ret);
}
for (session_idx = 0; session_idx < KBASE_IPA_CONTROL_MAX_SESSIONS; session_idx++) {
struct kbase_ipa_control_session *session = &ipa_ctrl->sessions[session_idx];
if (session->active) {
size_t i;
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[i];
calc_prfcnt_delta(kbdev, prfcnt, true);
}
}
}
spin_unlock(&ipa_ctrl->lock);
}
void kbase_ipa_control_handle_gpu_power_on(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
int ret;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* GPU should have become ready for use when this function gets called */
WARN_ON(!kbdev->pm.backend.gpu_ready);
/* Interrupts are already disabled and interrupt state is also saved */
spin_lock(&ipa_ctrl->lock);
ret = update_select_registers(kbdev);
if (ret) {
dev_err(kbdev->dev, "Failed to reconfigure the select registers: %d", ret);
}
/* Accumulator registers would not contain any sample after GPU power
* cycle if the timer has not been enabled first. Initialize all sessions.
*/
ret = session_gpu_start(kbdev, ipa_ctrl, NULL);
spin_unlock(&ipa_ctrl->lock);
}
void kbase_ipa_control_handle_gpu_reset_pre(struct kbase_device *kbdev)
{
/* A soft reset is treated as a power down */
kbase_ipa_control_handle_gpu_power_off(kbdev);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_reset_pre);
void kbase_ipa_control_handle_gpu_reset_post(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
int ret;
u32 status;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* GPU should have become ready for use when this function gets called */
WARN_ON(!kbdev->pm.backend.gpu_ready);
/* Interrupts are already disabled and interrupt state is also saved */
spin_lock(&ipa_ctrl->lock);
/* Check that the reset bit in STATUS is set before acknowledging it */
status = kbase_reg_read32(kbdev, IPA_CONTROL_ENUM(STATUS));
if (status & STATUS_RESET) {
/* Acknowledge the reset command */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_RESET_ACK);
ret = wait_status(kbdev, STATUS_RESET);
if (ret) {
dev_err(kbdev->dev, "Wait for the reset ack command failed: %d", ret);
}
}
spin_unlock(&ipa_ctrl->lock);
kbase_ipa_control_handle_gpu_power_on(kbdev);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_reset_post);
#ifdef KBASE_PM_RUNTIME
void kbase_ipa_control_handle_gpu_sleep_enter(struct kbase_device *kbdev)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
if (kbdev->pm.backend.mcu_state == KBASE_MCU_IN_SLEEP) {
/* GPU Sleep is treated as a power down */
kbase_ipa_control_handle_gpu_power_off(kbdev);
/* SELECT_CSHW register needs to be cleared to prevent any
* IPA control message from being sent to the top level GPU HWCNT.
*/
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_CSHW), 0);
/* No need to issue the APPLY command here */
}
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_sleep_enter);
void kbase_ipa_control_handle_gpu_sleep_exit(struct kbase_device *kbdev)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
if (kbdev->pm.backend.mcu_state == KBASE_MCU_IN_SLEEP) {
/* To keep things simple, exit from GPU Sleep is currently
* treated as a power on event, in which all 4 SELECT
* registers are reconfigured, even though only the
* SELECT_CSHW register needs reconfiguring on exit
* from sleep.
*/
kbase_ipa_control_handle_gpu_power_on(kbdev);
}
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_sleep_exit);
#endif
#if MALI_UNIT_TEST
void kbase_ipa_control_rate_change_notify_test(struct kbase_device *kbdev, u32 clk_index,
u32 clk_rate_hz)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
struct kbase_ipa_control_listener_data *listener_data = ipa_ctrl->rtm_listener_data;
kbase_ipa_control_rate_change_notify(&listener_data->listener, clk_index, clk_rate_hz);
/* Ensure the callback has taken effect before returning back to the test caller */
flush_work(&listener_data->clk_chg_work);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_rate_change_notify_test);
#endif
void kbase_ipa_control_protm_entered(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
lockdep_assert_held(&kbdev->hwaccess_lock);
ipa_ctrl->protm_start = ktime_get_raw_ns();
}
void kbase_ipa_control_protm_exited(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
size_t i;
u64 time_now = ktime_get_raw_ns();
u32 status;
lockdep_assert_held(&kbdev->hwaccess_lock);
for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
struct kbase_ipa_control_session *session = &ipa_ctrl->sessions[i];
if (session->active) {
u64 protm_time =
time_now - MAX(session->last_query_time, ipa_ctrl->protm_start);
session->protm_time += protm_time;
}
}
/* Acknowledge the protected_mode bit in the IPA_CONTROL STATUS
* register
*/
status = kbase_reg_read32(kbdev, IPA_CONTROL_ENUM(STATUS));
if (status & STATUS_PROTECTED_MODE) {
int ret;
/* Acknowledge the protm command */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_PROTECTED_ACK);
ret = wait_status(kbdev, STATUS_PROTECTED_MODE);
if (ret) {
dev_err(kbdev->dev, "Wait for the protm ack command failed: %d", ret);
}
}
}

Some files were not shown because too many files have changed in this diff