MALI: rockchip: Add separate src directory for Valhall driver from DDK g28p0-00eac0

Previously, Valhall and Bifrost GPUs shared a single driver source directory (drivers/gpu/arm/bifrost).
However, starting from DDK r52 (g27), Bifrost GPUs are no longer supported.
As a result, the Valhall GPU driver from DDK r53 (g28) must use a separate source directory
(drivers/gpu/arm/valhall).

This commit also modifies some header files outside of drivers/gpu/arm/.

In addition, the configs related to Bifrost and Valhall GPUs have been removed
from generic defconfig files such as rockchip_linux_defconfig,
whose names do not identify the target SoC.
These configs have been moved into per-SoC .config files
such as rk3576.config, whose file names do identify the SoC.
As a result, the kernel build command line must be adjusted for some SoCs.
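
As a rough illustration of the adjusted command line, the sketch below only prints a configure command that merges a per-SoC fragment on top of the generic defconfig. The helper name mali_config_cmd is hypothetical, and ARCH=arm64 and the exact defconfig/fragment pairing are assumptions; check the SDK build scripts for the combination your SoC actually uses.

```shell
#!/bin/sh
# Sketch only: print (do not run) a kernel configure command that merges a
# per-SoC config fragment on top of the generic defconfig.
# Assumptions: ARCH=arm64, and <soc>.config sits in a directory that
# Kbuild searches for config fragments (for example arch/arm64/configs/).
mali_config_cmd() {
    soc="$1"    # SoC name, e.g. rk3576
    printf 'make ARCH=arm64 rockchip_linux_defconfig %s.config\n' "$soc"
}

mali_config_cmd rk3576
```

Printing the command instead of running it keeps the sketch safe to try outside a kernel tree.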

Change-Id: I0c4384212b4b679a728401f7f89ae839530f002b
Signed-off-by: Zhen Chen <chenzhen@rock-chips.com>
Zhen Chen
2024-09-24 17:19:46 +08:00
committed by 陈真
parent 74328f507f
commit e54469c723
486 changed files with 187883 additions and 186 deletions

View File

@@ -1,58 +1,55 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2022-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
.. SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
DebugFS interface:
------------------
==================
DebugFS interface
==================
**Copyright:** \(C) 2022-2024 ARM Limited. All rights reserved.
..
This program is free software and is provided to you under the terms of the
GNU General Public License version 2 as published by the Free Software
Foundation, and any use by you of this program is subject to the terms
of such GNU license.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, you can access it online at
http://www.gnu.org/licenses/gpl-2.0.html.
A new per-kbase-context debugfs file called csf_sync has been implemented
which captures the current KCPU & GPU queue state of the not-yet-completed
operations and displayed through the debugfs file.
This file is at:
=======================================================
/sys/kernel/debug/mali0/ctx/<pid>_<context id>/csf_sync
=======================================================
This file is at
Output Format:
----------------
::
/sys/kernel/debug/mali0/ctx/<pid>_<context id>/csf_sync
Output Format
-------------
The csf_sync file contains important data for the currently active queues.
This data is formatted into two segments, which are separated by a
pipe character: the common properties and the operation-specific properties.
Common Properties:
------------------
Common Properties
-----------------
* Queue type: GPU or KCPU.
* kbase context id and the queue id.
* If the queue type is a GPU queue then the group handle is also noted,
in the middle of the other two IDs. The slot value is also dumped.
* If the queue type is a GPU queue then the group handle is also noted,in the middle of the other two IDs. The slot value is also dumped.
* Execution status, which can either be 'P' for pending or 'S' for started.
* Command type is then output which indicates the type of dependency
(i.e. wait or signal).
* Object address which is a pointer to the sync object that the
command operates on.
* The live value, which is the value of the synchronization object
at the time of dumping. This could help to determine why wait
operations might be blocked.
* Command type is then output which indicates the type of dependency (i.e. wait or signal).
* Object address which is a pointer to the sync object that the command operates on.
Operation-Specific Properties:
* The live value, which is the value of the synchronization object at the time of dumping. This could help to determine why wait operations might be blocked.
Operation-Specific Properties
------------------------------
The operation-specific values for KCPU queue fence operations
@@ -67,48 +64,49 @@ which are always shown; the argument value to wait on or set/add to,
and the operation type (set/add) or wait condition (e.g. LE, GT, GE).
Examples
--------
========
GPU Queue Example
------------------
The following output is of a GPU queue, from a process that has a KCTX ID of 52,
is in Queue Group (CSG) 0, and has Queue ID 0. It has started and is waiting on
the object at address 0x0000007f81ffc800. The live value is 0,
the object at address **0x0000007f81ffc800**. The live value is 0,
as is the arg value. However, the operation "op" is GT, indicating it's waiting
for the live value to surpass the arg value:
======================================================================================================================================
queue:GPU-52-0-0 exec:S cmd:SYNC_WAIT slot:4 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000
======================================================================================================================================
::
queue:GPU-52-0-0 exec:S cmd:SYNC_WAIT slot:4 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000
The following is an example of GPU queue dump, where the SYNC SET operation
is blocked by the preceding SYNC WAIT operation. This shows two GPU queues,
with the same KCTX ID of 8, Queue Group (CSG) 0, and Queue ID 0. The SYNC WAIT
operation has started, while the SYNC SET is pending, blocked by the SYNC WAIT.
Both operations are on the same slot, 2 and have live value of 0. The SYNC WAIT
is waiting on the object at address 0x0000007f81ffc800, while the SYNC SET will
set the object at address 0x00000000a3bad4fb when it is unblocked.
is waiting on the object at address **0x0000007f81ffc800**, while the SYNC SET will
set the object at address **0x00000000a3bad4fb** when it is unblocked.
The operation "op" is GT for the SYNC WAIT, indicating it's waiting for the
live value to surpass the arg value, while the operation and arg value for the
SYNC SET is "set" and "1" respectively:
SYNC SET is "set" and "1" respectively.
======================================================================================================================================
queue:GPU-8-0-0 exec:S cmd:SYNC_WAIT slot:2 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000
queue:GPU-8-0-0 exec:P cmd:SYNC_SET slot:2 obj:0x00000000a3bad4fb live_value:0x0000000000000000 | op:set arg_value:0x0000000000000001
======================================================================================================================================
::
queue:GPU-8-0-0 exec:S cmd:SYNC_WAIT slot:2 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000
queue:GPU-8-0-0 exec:P cmd:SYNC_SET slot:2 obj:0x00000000a3bad4fb live_value:0x0000000000000000 | op:set arg_value:0x0000000000000001
KCPU Queue Example
------------------
The following is an example of a KCPU queue, from a process that has
a KCTX ID of 0 and has Queue ID 1. It has started and is waiting on the
object at address 0x0000007fbf6f2ff8. The live value is currently 0 with
object at address **0x0000007fbf6f2ff8**. The live value is currently 0 with
the "op" being GT indicating it is waiting on the live value to
surpass the arg value.
===============================================================================================================================
queue:KCPU-0-1 exec:S cmd:CQS_WAIT_OPERATION obj:0x0000007fbf6f2ff8 live_value:0x0000000000000000 | op:gt arg_value: 0x00000000
===============================================================================================================================
::
queue:KCPU-0-1 exec:S cmd:CQS_WAIT_OPERATION obj:0x0000007fbf6f2ff8 live_value:0x0000000000000000 | op:gt arg_value: 0x00000000
CSF Sync State Dump For Fence Signal Timeouts
---------------------------------------------
@@ -142,18 +140,23 @@ which is written to, to turn this feature on and off.
Example:
--------
when writing to fence_signal_timeout_enable entry:
echo 1 > /sys/kernel/debug/mali0/fence_signal_timeout_enable -> feature is enabled.
echo 0 > /sys/kernel/debug/mali0/fence_signal_timeout_enable -> feature is disabled.
when writing to fence_signal_timeout_enable entry
::
echo 1 > /sys/kernel/debug/mali0/fence_signal_timeout_enable -> feature is enabled.
echo 0 > /sys/kernel/debug/mali0/fence_signal_timeout_enable -> feature is disabled.
It is also possible to read from this file to check if the feature is currently
enabled or not checking the return value of fence_signal_timeout_enable.
Example:
--------
when reading from fence_signal_timeout_enable entry, if:
cat /sys/kernel/debug/mali0/fence_signal_timeout_enable returns 1 -> feature is enabled.
cat /sys/kernel/debug/mali0/fence_signal_timeout_enable returns 0 -> feature is disabled.
when reading from fence_signal_timeout_enable entry, if
::
cat /sys/kernel/debug/mali0/fence_signal_timeout_enable returns 1 -> feature is enabled.
cat /sys/kernel/debug/mali0/fence_signal_timeout_enable returns 0 -> feature is disabled.
Update Timer Duration
---------------------
@@ -163,11 +166,15 @@ milliseconds.
Example:
--------
cat /sys/kernel/debug/mali0/fence_signal_timeout_ms
::
cat /sys/kernel/debug/mali0/fence_signal_timeout_ms
The 'fence_signal_timeout_ms' debugfs entry can also be written to, to update
the time in milliseconds.
Example:
--------
echo 10000 > /sys/kernel/debug/mali0/fence_signal_timeout_ms
::
echo 10000 > /sys/kernel/debug/mali0/fence_signal_timeout_ms
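
The csf_sync line format described in the documentation diff above (key:value pairs, with the common and operation-specific segments separated by a pipe) can be picked apart with ordinary shell tools. The following is a minimal sketch, not part of the driver: the helper name csf_sync_field is hypothetical, and the sample line is copied from the GPU queue example in the document.

```shell
#!/bin/sh
# csf_sync_field KEY LINE
# Print the value stored under KEY in one line of csf_sync output.
# "key: value" (as in the KCPU example) is normalised to "key:value" first.
csf_sync_field() {
    printf '%s\n' "$2" | sed 's/: */:/g' | tr ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# Sample line taken from the GPU queue example in the documentation.
line='queue:GPU-52-0-0 exec:S cmd:SYNC_WAIT slot:4 obj:0x0000007f81ffc800 live_value:0x0000000000000000 | op:gt arg_value:0x0000000000000000'

# Split the line into its two segments at the pipe character.
common=${line%%" | "*}
specific=${line#*" | "}

echo "common:   $common"
echo "specific: $specific"
echo "exec=$(csf_sync_field exec "$line") op=$(csf_sync_field op "$line")"
```

On a live system the same helper could be fed lines read from the csf_sync file of a context, provided debugfs is mounted and the file is readable.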

View File

@@ -0,0 +1,42 @@
.. SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
=====================
dma-buf-test-exporter
=====================
**Copyright:** \(C) 2012-2013, 2020-2022, 2024 ARM Limited. All rights reserved.
..
This program is free software and is provided to you under the terms of the
GNU General Public License version 2 as published by the Free Software
Foundation, and any use by you of this program is subject to the terms
of such GNU license.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, you can access it online at
http://www.gnu.org/licenses/gpl-2.0.html.
Overview
--------
The dma-buf-test-exporter is a simple exporter of dma_buf objects.
It has a private API to allocate and manipulate the buffers which are represented as dma_buf fds.
The private API allows:
* simple allocation of physically non-contiguous buffers
* simple allocation of physically contiguous buffers
* query kernel side API usage stats (number of attachments, number of mappings, mmaps)
* failure mode configuration (fail attach, mapping, mmap)
* kernel side memset of buffers
The buffers support all of the dma_buf API, including mmap.
It supports being compiled as a module both in-tree and out-of-tree.
See **include/uapi/base/arm/dma_buf_test_exporter/dma-buf-test-exporter.h** for the ioctl interface.
See **Documentation/dma-buf-sharing.txt** for details on dma_buf.

View File

@@ -1,42 +0,0 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2013, 2020-2022 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
=====================
dma-buf-test-exporter
=====================
Overview
--------
The dma-buf-test-exporter is a simple exporter of dma_buf objects.
It has a private API to allocate and manipulate the buffers which are represented as dma_buf fds.
The private API allows:
* simple allocation of physically non-contiguous buffers
* simple allocation of physically contiguous buffers
* query kernel side API usage stats (number of attachments, number of mappings, mmaps)
* failure mode configuration (fail attach, mapping, mmap)
* kernel side memset of buffers
The buffers support all of the dma_buf API, including mmap.
It supports being compiled as a module both in-tree and out-of-tree.
See include/uapi/base/arm/dma_buf_test_exporter/dma-buf-test-exporter.h for the ioctl interface.
See Documentation/dma-buf-sharing.txt for details on dma_buf.

View File

@@ -2957,7 +2957,7 @@
};
gpu: gpu@fb000000 {
compatible = "arm,mali-bifrost";
compatible = "arm,mali-valhall";
reg = <0x0 0xfb000000 0x0 0x200000>;
interrupts = <GIC_SPI 94 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 93 IRQ_TYPE_LEVEL_HIGH>,

View File

@@ -0,0 +1,4 @@
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -241,3 +241,7 @@ CONFIG_PREEMPT_RCU=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
# CONFIG_USB_KBD is not set
# CONFIG_USB_MOUSE is not set
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -0,0 +1,4 @@
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -0,0 +1,4 @@
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -4,3 +4,7 @@ CONFIG_HZ=100
CONFIG_HZ_100=y
# CONFIG_HZ_300 is not set
# CONFIG_MALI_MIDGARD is not set
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -1 +1,4 @@
# CONFIG_MALI_CSF_SUPPORT is not set
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -3,3 +3,7 @@ CONFIG_DMABUF_PARTIAL=y
CONFIG_HZ=100
CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -144,3 +144,7 @@ CONFIG_VIDEO_MAXIM_SER_MAX9295=y
CONFIG_VIDEO_MAXIM_SER_MAX96715=y
CONFIG_VIDEO_MAXIM_SER_MAX96717=y
# CONFIG_VIDEO_REVERSE_IMAGE is not set
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y

View File

@@ -0,0 +1,4 @@
CONFIG_MALI_VALHALL=y
CONFIG_MALI_VALHALL_PLATFORM_NAME="rk"
CONFIG_MALI_VALHALL_EXPERT=y
CONFIG_MALI_VALHALL_DEBUG=y

View File

@@ -5,7 +5,10 @@ CONFIG_AP6XXX=y
# CONFIG_WIFI_BUILD_MODULE is not set
# CONFIG_BCMDHD_SDIO is not set
CONFIG_BCMDHD_PCIE=y
CONFIG_MALI_CSF_SUPPORT=y
CONFIG_MALI_VALHALL=y
CONFIG_MALI_VALHALL_PLATFORM_NAME="rk"
CONFIG_MALI_VALHALL_EXPERT=y
CONFIG_MALI_VALHALL_DEBUG=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_USB_CONFIGFS_RNDIS=y
CONFIG_USB_CONFIGFS_F_UAC1=y

View File

@@ -1,4 +1,7 @@
# CONFIG_BCMDHD_SDIO=y is not set
CONFIG_BCMDHD_PCIE=y
CONFIG_MALI_CSF_SUPPORT=y
CONFIG_MALI_VALHALL=y
CONFIG_MALI_VALHALL_PLATFORM_NAME="rk"
CONFIG_MALI_VALHALL_EXPERT=y
CONFIG_MALI_VALHALL_DEBUG=y
CONFIG_ROCKCHIP_RGA_PROC_FS=y

View File

@@ -1,3 +1,2 @@
# CONFIG_BCMDHD_SDIO=y is not set
CONFIG_BCMDHD_PCIE=y
CONFIG_MALI_CSF_SUPPORT=y

View File

@@ -11,7 +11,10 @@ CONFIG_DRM_ITE_IT6161=y
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_MALI400 is not set
CONFIG_MALI_CSF_SUPPORT=y
CONFIG_MALI_VALHALL=y
CONFIG_MALI_VALHALL_PLATFORM_NAME="rk"
CONFIG_MALI_VALHALL_EXPERT=y
CONFIG_MALI_VALHALL_DEBUG=y
# CONFIG_MALI_MIDGARD is not set
# CONFIG_MEDIA_CEC_SUPPORT is not set
# CONFIG_MEDIA_USB_SUPPORT is not set

View File

@@ -144,3 +144,7 @@ CONFIG_VIDEO_MAXIM_SER_MAX9295=y
CONFIG_VIDEO_MAXIM_SER_MAX96715=y
CONFIG_VIDEO_MAXIM_SER_MAX96717=y
# CONFIG_VIDEO_REVERSE_IMAGE is not set
CONFIG_MALI_VALHALL=y
CONFIG_MALI_VALHALL_PLATFORM_NAME="rk"
CONFIG_MALI_VALHALL_EXPERT=y
CONFIG_MALI_VALHALL_DEBUG=y

View File

@@ -363,10 +363,6 @@ CONFIG_MALI_PLATFORM_THIRDPARTY=y
CONFIG_MALI_PLATFORM_THIRDPARTY_NAME="rk"
CONFIG_MALI_DEBUG=y
CONFIG_MALI_PWRSOFT_765=y
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_PWM=y
CONFIG_IEP=y

View File

@@ -384,10 +384,6 @@ CONFIG_MALI_PLATFORM_THIRDPARTY=y
CONFIG_MALI_PLATFORM_THIRDPARTY_NAME="rk"
CONFIG_MALI_DEBUG=y
CONFIG_MALI_PWRSOFT_765=y
CONFIG_MALI_BIFROST=y
CONFIG_MALI_PLATFORM_NAME="rk"
CONFIG_MALI_BIFROST_EXPERT=y
CONFIG_MALI_BIFROST_DEBUG=y
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_PWM=y
CONFIG_ROCKCHIP_MULTI_RGA=y

View File

@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2021-2023 ARM Limited. All rights reserved.
# (C) COPYRIGHT 2021-2024 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
@@ -18,6 +18,10 @@
#
#
ifeq ($(MALI_CSF_SUPPORT),n)
$(error [GPUBUILD-2005] Only CSF builds are supported on this branch)
endif
#
# ccflags
#

View File

@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2021-2023 ARM Limited. All rights reserved.
# (C) COPYRIGHT 2021-2024 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
@@ -127,6 +127,8 @@ CFLAGS_MODULE += -Wno-shift-negative-value
CFLAGS_MODULE += $(call cc-option, -Wno-cast-function-type)
# The following ensures the stack frame does not get larger than a page
CFLAGS_MODULE += -Wframe-larger-than=4096
# This flag was added on v6.6 kernel
CFLAGS_MODULE += $(call cc-option,-Werror=designated-init)
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN1

View File

@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
* (C) COPYRIGHT 2012-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
@@ -375,7 +375,7 @@ static int dma_buf_te_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
if (alloc->fail_mmap)
return -ENOMEM;
vm_flags_set(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP);
__vm_flags_mod(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP, 0);
vma->vm_ops = &dma_buf_te_vm_ops;
vma->vm_private_data = dmabuf;
@@ -476,7 +476,7 @@ static int do_dma_buf_te_ioctl_alloc(struct dma_buf_te_ioctl_alloc __user *buf,
if (copy_from_user(&alloc_req, buf, sizeof(alloc_req))) {
dev_err(te_device.this_device, "%s: couldn't get user data", __func__);
goto no_input;
return -EFAULT;
}
if (!alloc_req.size) {
@@ -604,7 +604,6 @@ free_alloc_object:
kfree(alloc);
no_alloc_object:
invalid_size:
no_input:
return -EFAULT;
}

View File

@@ -446,7 +446,11 @@ static int memory_group_manager_probe(struct platform_device *pdev)
return 0;
}
#if (KERNEL_VERSION(6, 11, 0) > LINUX_VERSION_CODE)
static int memory_group_manager_remove(struct platform_device *pdev)
#else
static void memory_group_manager_remove(struct platform_device *pdev)
#endif
{
struct memory_group_manager_device *mgm_dev = platform_get_drvdata(pdev);
struct mgm_groups *mgm_data = mgm_dev->data;
@@ -458,7 +462,9 @@ static int memory_group_manager_remove(struct platform_device *pdev)
dev_info(&pdev->dev, "Memory group manager removed successfully\n");
#if (KERNEL_VERSION(6, 11, 0) > LINUX_VERSION_CODE)
return 0;
#endif
}
static const struct of_device_id memory_group_manager_dt_ids[] = {

View File

@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
@@ -22,6 +22,7 @@
#include <linux/version.h>
#include <linux/of.h>
#include <linux/of_reserved_mem.h>
#include <linux/of_address.h>
#include <linux/platform_device.h>
#include <linux/module.h>
#include <linux/slab.h>
@@ -419,6 +420,7 @@ static int protected_memory_allocator_probe(struct platform_device *pdev)
phys_addr_t rmem_base;
size_t rmem_size;
size_t alloc_bitmap_pages_arr_size;
struct resource *mem_res;
#if (KERNEL_VERSION(4, 15, 0) <= LINUX_VERSION_CODE)
struct reserved_mem *rmem;
#endif
@@ -430,6 +432,14 @@ static int protected_memory_allocator_probe(struct platform_device *pdev)
return -ENODEV;
}
/* Try to get reserved memory from IO resource memory */
mem_res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (mem_res) {
rmem_base = mem_res->start;
rmem_size = resource_size(mem_res) >> PAGE_SHIFT;
goto skip_reserved_lookup;
}
np = of_parse_phandle(np, "memory-region", 0);
if (!np) {
dev_err(&pdev->dev, "memory-region node not set\n");
@@ -449,6 +459,8 @@ static int protected_memory_allocator_probe(struct platform_device *pdev)
return -ENODEV;
}
skip_reserved_lookup:
of_node_put(np);
epma_dev = devm_kzalloc(&pdev->dev, sizeof(*epma_dev), GFP_KERNEL);
if (!epma_dev)
@@ -495,14 +507,18 @@ static int protected_memory_allocator_probe(struct platform_device *pdev)
return 0;
}
#if (KERNEL_VERSION(6, 11, 0) > LINUX_VERSION_CODE)
static int protected_memory_allocator_remove(struct platform_device *pdev)
#else
static void protected_memory_allocator_remove(struct platform_device *pdev)
#endif
{
struct protected_memory_allocator_device *pma_dev = platform_get_drvdata(pdev);
struct simple_pma_device *epma_dev;
struct device *dev;
if (!pma_dev)
return -EINVAL;
if (unlikely(!pma_dev))
goto out_err;
epma_dev = container_of(pma_dev, struct simple_pma_device, pma_dev);
dev = epma_dev->dev;
@@ -518,7 +534,12 @@ static int protected_memory_allocator_remove(struct platform_device *pdev)
dev_info(&pdev->dev, "Protected memory allocator removed successfully\n");
out_err:
#if (KERNEL_VERSION(6, 11, 0) > LINUX_VERSION_CODE)
return 0;
#else
return;
#endif
}
static const struct of_device_id protected_memory_allocator_dt_ids[] = {

View File

@@ -23,3 +23,5 @@ obj-$(CONFIG_MALI_MIDGARD) += midgard/
obj-$(CONFIG_MALI400) += mali400/
obj-$(CONFIG_MALI_BIFROST) += bifrost/
obj-$(CONFIG_MALI_VALHALL) += valhall/

View File

@@ -23,3 +23,5 @@ source "drivers/gpu/arm/mali400/mali/Kconfig"
source "drivers/gpu/arm/midgard/Kconfig"
source "drivers/gpu/arm/bifrost/Kconfig"
source "drivers/gpu/arm/valhall/Kconfig"

View File

@@ -0,0 +1,235 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2024 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
# make $(src) as absolute path if it is not already, by prefixing $(srctree)
# This is to prevent any build issue due to wrong path.
src:=$(if $(patsubst /%,,$(src)),$(srctree)/$(src),$(src))
#
# Prevent misuse when Kernel configurations are not present by default
# in out-of-tree builds
#
ifeq ($(CONFIG_DMA_SHARED_BUFFER),n)
$(error CONFIG_DMA_SHARED_BUFFER must be set in Kernel configuration)
endif
ifeq ($(CONFIG_PM_DEVFREQ),n)
$(error CONFIG_PM_DEVFREQ must be set in Kernel configuration)
endif
ifeq ($(CONFIG_DEVFREQ_THERMAL),n)
$(error CONFIG_DEVFREQ_THERMAL must be set in Kernel configuration)
endif
ifeq ($(CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND),n)
$(error CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND must be set in Kernel configuration)
endif
ifeq ($(CONFIG_FW_LOADER), n)
$(error CONFIG_FW_LOADER must be set in Kernel configuration)
endif
ifeq ($(CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS), y)
ifneq ($(CONFIG_DEBUG_FS), y)
$(error CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS depends on CONFIG_DEBUG_FS to be set in Kernel configuration)
endif
endif
ifeq ($(CONFIG_MALI_VALHALL_FENCE_DEBUG), y)
ifneq ($(CONFIG_SYNC_FILE), y)
$(error CONFIG_MALI_VALHALL_FENCE_DEBUG depends on CONFIG_SYNC_FILE to be set in Kernel configuration)
endif
endif
#
# Configurations
#
# Driver version string which is returned to userspace via an ioctl
MALI_RELEASE_NAME ?= '"g28p0-00eac0"'
# Set up defaults if not defined by build system
ifeq ($(CONFIG_MALI_VALHALL_DEBUG), y)
MALI_UNIT_TEST = 1
MALI_CUSTOMER_RELEASE ?= 0
else
MALI_UNIT_TEST ?= 0
MALI_CUSTOMER_RELEASE ?= 1
endif
MALI_COVERAGE ?= 0
# Kconfig passes in the name with quotes for in-tree builds - remove them.
MALI_PLATFORM_DIR := $(shell echo $(CONFIG_MALI_VALHALL_PLATFORM_NAME))
ifeq ($(CONFIG_MALI_VALHALL_CSF_SUPPORT),y)
MALI_JIT_PRESSURE_LIMIT_BASE = 0
MALI_USE_CSF = 1
else
MALI_JIT_PRESSURE_LIMIT_BASE ?= 1
MALI_USE_CSF ?= 0
endif
ifneq ($(CONFIG_MALI_VALHALL_KUTF), n)
MALI_KERNEL_TEST_API ?= 1
else
MALI_KERNEL_TEST_API ?= 0
endif
# Experimental features (corresponding -D definition should be appended to
# ccflags-y below, e.g. for MALI_EXPERIMENTAL_FEATURE,
# -DMALI_EXPERIMENTAL_FEATURE=$(MALI_EXPERIMENTAL_FEATURE) should be appended)
#
# Experimental features must default to disabled, e.g.:
# MALI_EXPERIMENTAL_FEATURE ?= 0
#
# ccflags
#
ccflags-y = \
-DMALI_CUSTOMER_RELEASE=$(MALI_CUSTOMER_RELEASE) \
-DMALI_USE_CSF=$(MALI_USE_CSF) \
-DMALI_KERNEL_TEST_API=$(MALI_KERNEL_TEST_API) \
-DMALI_UNIT_TEST=$(MALI_UNIT_TEST) \
-DMALI_COVERAGE=$(MALI_COVERAGE) \
-DMALI_RELEASE_NAME=$(MALI_RELEASE_NAME) \
-DMALI_JIT_PRESSURE_LIMIT_BASE=$(MALI_JIT_PRESSURE_LIMIT_BASE) \
-DMALI_PLATFORM_DIR=$(MALI_PLATFORM_DIR)
ifeq ($(KBUILD_EXTMOD),)
# in-tree
ccflags-y +=-DMALI_KBASE_PLATFORM_PATH=../../$(src)/platform/$(CONFIG_MALI_VALHALL_PLATFORM_NAME)
else
# out-of-tree
ccflags-y +=-DMALI_KBASE_PLATFORM_PATH=$(src)/platform/$(CONFIG_MALI_VALHALL_PLATFORM_NAME)
endif
ccflags-y += \
-I$(srctree)/include/linux \
-I$(srctree)/drivers/staging/android \
-I$(src) \
-I$(src)/platform/$(MALI_PLATFORM_DIR) \
-I$(src)/../../../base \
-I$(src)/../../../../include
subdir-ccflags-y += $(ccflags-y)
#
# Kernel Modules
#
obj-$(CONFIG_MALI_VALHALL) += valhall_kbase.o
obj-$(CONFIG_MALI_VALHALL_KUTF) += tests/
valhall_kbase-y := \
mali_kbase_cache_policy.o \
mali_kbase_ccswe.o \
mali_kbase_mem.o \
mali_kbase_reg_track.o \
mali_kbase_mem_migrate.o \
mali_kbase_mem_pool_group.o \
mali_kbase_native_mgm.o \
mali_kbase_ctx_sched.o \
mali_kbase_gpuprops.o \
mali_kbase_pm.o \
mali_kbase_config.o \
mali_kbase_kinstr_prfcnt.o \
mali_kbase_softjobs.o \
mali_kbase_hw.o \
mali_kbase_debug.o \
mali_kbase_gpu_memory_debugfs.o \
mali_kbase_mem_linux.o \
mali_kbase_core_linux.o \
mali_kbase_mem_profile_debugfs.o \
mali_kbase_disjoint_events.o \
mali_kbase_debug_mem_view.o \
mali_kbase_debug_mem_zones.o \
mali_kbase_debug_mem_allocs.o \
mali_kbase_smc.o \
mali_kbase_mem_pool.o \
mali_kbase_mem_pool_debugfs.o \
mali_kbase_debugfs_helper.o \
mali_kbase_as_fault_debugfs.o \
mali_kbase_dvfs_debugfs.o \
mali_power_gpu_frequency_trace.o \
mali_kbase_trace_gpu_mem.o \
mali_kbase_pbha.o \
mali_kbase_io.o
valhall_kbase-$(CONFIG_DEBUG_FS) += mali_kbase_pbha_debugfs.o
valhall_kbase-$(CONFIG_SYNC_FILE) += \
mali_kbase_fence_ops.o \
mali_kbase_sync_file.o \
mali_kbase_sync_common.o
valhall_kbase-$(CONFIG_MALI_VALHALL_TRACE_POWER_GPU_WORK_PERIOD) += \
mali_power_gpu_work_period_trace.o \
mali_kbase_gpu_metrics.o
ifneq ($(CONFIG_MALI_VALHALL_CSF_SUPPORT),y)
valhall_kbase-y += \
mali_kbase_jm.o \
mali_kbase_dummy_job_wa.o \
mali_kbase_debug_job_fault.o \
mali_kbase_event.o \
mali_kbase_jd.o \
mali_kbase_jd_debugfs.o \
mali_kbase_js.o \
mali_kbase_js_ctx_attr.o \
mali_kbase_kinstr_jm.o
valhall_kbase-$(CONFIG_SYNC_FILE) += \
mali_kbase_fence_ops.o \
mali_kbase_fence.o
endif
INCLUDE_SUBDIR = \
$(src)/arbiter/Kbuild \
$(src)/context/Kbuild \
$(src)/debug/Kbuild \
$(src)/device/Kbuild \
$(src)/backend/gpu/Kbuild \
$(src)/mmu/Kbuild \
$(src)/tl/Kbuild \
$(src)/hwcnt/Kbuild \
$(src)/gpu/Kbuild \
$(src)/hw_access/Kbuild \
$(src)/thirdparty/Kbuild \
$(src)/platform/$(MALI_PLATFORM_DIR)/Kbuild
ifeq ($(CONFIG_MALI_VALHALL_CSF_SUPPORT),y)
INCLUDE_SUBDIR += $(src)/csf/Kbuild
endif
ifeq ($(CONFIG_MALI_VALHALL_DEVFREQ),y)
ifeq ($(CONFIG_DEVFREQ_THERMAL),y)
INCLUDE_SUBDIR += $(src)/ipa/Kbuild
endif
endif
ifeq ($(KBUILD_EXTMOD),)
# in-tree
-include $(INCLUDE_SUBDIR)
else
# out-of-tree
include $(INCLUDE_SUBDIR)
endif
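
The Kbuild file above aborts the build when certain kernel options are explicitly set to n. A pre-flight check along the same lines can be sketched in shell. The function name check_mali_prereqs is hypothetical, and the check is stricter than the Kbuild one: it treats any option that is not =y or =m in the given config file as missing.

```shell
#!/bin/sh
# check_mali_prereqs CONFIG_FILE
# Report which of the kernel options named in the Kbuild checks are not
# enabled (=y or =m) in CONFIG_FILE. Prints "ok" and returns 0 when all
# are enabled; otherwise prints the missing options and returns 1.
check_mali_prereqs() {
    cfg="$1"
    missing=""
    for opt in CONFIG_DMA_SHARED_BUFFER CONFIG_PM_DEVFREQ \
               CONFIG_DEVFREQ_THERMAL CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND \
               CONFIG_FW_LOADER; do
        grep -Eq "^${opt}=(y|m)\$" "$cfg" || missing="$missing $opt"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Usage (the path is an example): check_mali_prereqs /path/to/kernel/.config
```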

View File

@@ -0,0 +1,352 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2024 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
menuconfig MALI_VALHALL
tristate "Mali Valhall series support"
select DMA_SHARED_BUFFER
select FW_LOADER
default n
help
Enable this option to build support for a ARM Mali Valhall GPU.
To compile this driver as a module, choose M here:
this will generate a single module, called mali_kbase.
if MALI_VALHALL
config MALI_VALHALL_PLATFORM_NAME
depends on MALI_VALHALL
string "Platform name"
default "devicetree"
help
Enter the name of the desired platform configuration directory to
include in the build. 'platform/$(MALI_VALHALL_PLATFORM_NAME)/Kbuild' must
exist.
choice
prompt "Mali HW backend"
depends on MALI_VALHALL
default MALI_VALHALL_REAL_HW
config MALI_VALHALL_REAL_HW
bool "Enable build of Mali kernel driver for real HW"
depends on MALI_VALHALL
help
This is the default HW backend.
config MALI_VALHALL_NO_MALI
bool "Enable build of Mali kernel driver for No Mali"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
help
This can be used to test the driver in a simulated environment
whereby the hardware is not physically present. If the hardware is physically
present it will not be used. This can be used to test the majority of the
driver without needing actual hardware or for software benchmarking.
All calls to the simulated hardware will complete immediately as if the hardware
completed the task.
endchoice
config MALI_VALHALL_NO_MALI_DEFAULT_GPU
string "Default GPU for No Mali"
depends on MALI_VALHALL_NO_MALI
default "tMIx"
help
This option sets the default GPU to identify as for No Mali builds.
menu "Platform specific options"
source "$(MALI_KCONFIG_EXT_PREFIX)drivers/gpu/arm/valhall/platform/Kconfig"
endmenu
config MALI_VALHALL_CSF_SUPPORT
bool "Enable Mali CSF based GPU support"
default y
help
Enables support for CSF based GPUs.
config MALI_VALHALL_DEVFREQ
bool "Enable devfreq support for Mali"
depends on MALI_VALHALL && PM_DEVFREQ
select DEVFREQ_GOV_SIMPLE_ONDEMAND
default y
help
Support devfreq for Mali.
Using the devfreq framework and, by default, the simple on-demand
governor, the frequency of Mali will be dynamically selected from the
available OPPs.
config MALI_VALHALL_DVFS
bool "Enable legacy DVFS"
depends on MALI_VALHALL && !MALI_VALHALL_DEVFREQ
default n
help
Choose this option to enable legacy DVFS in the Mali Midgard DDK.
config MALI_VALHALL_GATOR_SUPPORT
bool "Enable Streamline tracing support"
depends on MALI_VALHALL
default y
help
Enables kbase tracing used by the Arm Streamline Performance Analyzer.
The tracepoints are used to derive GPU activity charts in Streamline.
config MALI_VALHALL_ENABLE_TRACE
bool "Enable kbase tracing"
depends on MALI_VALHALL
default y if MALI_VALHALL_DEBUG
default n
help
Enables tracing in kbase. The trace log is available through
the "mali_trace" debugfs file when CONFIG_DEBUG_FS is enabled.
config MALI_VALHALL_DMA_BUF_MAP_ON_DEMAND
bool "Enable map imported dma-bufs on demand"
depends on MALI_VALHALL
default n
help
This option will cause kbase to set up the GPU mapping of imported
dma-buf when needed to run atoms. This is the legacy behavior.
This is intended for testing and the option will get removed in the
future.
config MALI_VALHALL_DMA_BUF_LEGACY_COMPAT
bool "Enable legacy compatibility cache flush on dma-buf map"
depends on MALI_VALHALL && !MALI_VALHALL_DMA_BUF_MAP_ON_DEMAND
default n
help
This option enables compatibility with legacy dma-buf mapping
behavior, where the dma-buf is mapped on import, by adding cache
maintenance where MALI_VALHALL_DMA_BUF_MAP_ON_DEMAND would do the mapping,
including a cache flush.
This option might work around issues related to missing cache
flushes in other drivers. This only has an effect for clients using
UK 11.18 or older. For later UK versions it is not possible.
config MALI_VALHALL_CORESIGHT
depends on MALI_VALHALL && MALI_VALHALL_CSF_SUPPORT && !MALI_VALHALL_NO_MALI
bool "Enable Kbase CoreSight tracing support"
default n
menuconfig MALI_VALHALL_EXPERT
depends on MALI_VALHALL
bool "Enable Expert Settings"
default n
help
Enabling this option and modifying the default settings may produce
a driver with performance or other limitations.
if MALI_VALHALL_EXPERT
config VALHALL_LARGE_PAGE_SUPPORT
bool "Support for 2MB page allocations"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
default y
help
Rather than allocating all GPU memory page-by-page, allow the system
to decide whether to attempt to allocate 2MB pages from the kernel.
This reduces TLB pressure.
Note that this option only enables the support for the module parameter
and does not necessarily mean that 2MB pages will be used automatically.
This depends on GPU support.
If in doubt, say Y.
config MALI_VALHALL_CORESTACK
bool "Enable support of GPU core stack power control"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
default n
help
Enabling this feature on supported GPUs will let the driver power
on/off the GPU core stack independently without involving the Power
Domain Controller. This should only be enabled on platforms where
integration of the PDC with the Mali GPU is known to be problematic.
This feature is currently only supported on t-Six and t-HEx GPUs.
If unsure, say N.
comment "Debug options"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
config MALI_VALHALL_DEBUG
bool "Enable debug build"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
default n
help
Select this option for increased checking and reporting of errors.
config MALI_VALHALL_FENCE_DEBUG
bool "Enable debug sync fence usage"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT && SYNC_FILE
default y if MALI_VALHALL_DEBUG
help
Select this option to enable additional checking and reporting on the
use of sync fences in the Mali driver.
This will add a 3s timeout to all sync fence waits in the Mali
driver, so that when work for Mali has been waiting on a sync fence
for a long time a debug message will be printed, detailing what fence
is causing the block, and which dependent Mali atoms are blocked as a
result of this.
The timeout can be changed at runtime through the js_soft_timeout
device attribute, where the timeout is specified in milliseconds.
config MALI_VALHALL_SYSTEM_TRACE
bool "Enable system event tracing support"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
default y if MALI_VALHALL_DEBUG
default n
help
Choose this option to enable system trace events for each
kbase event. This is typically used for debugging but has
minimal overhead when not in use. Enable only if you know what
you are doing.
comment "Instrumentation options"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
choice
prompt "Select Performance counters set"
default MALI_VALHALL_PRFCNT_SET_PRIMARY
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
config MALI_VALHALL_PRFCNT_SET_PRIMARY
bool "Primary"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
help
Select this option to use primary set of performance counters.
config MALI_VALHALL_PRFCNT_SET_SECONDARY
bool "Secondary"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
help
Select this option to use secondary set of performance counters. Kernel
features that depend on an access to the primary set of counters may
become unavailable. Enabling this option will prevent power management
from working optimally and may cause instrumentation tools to return
bogus results.
If unsure, use MALI_VALHALL_PRFCNT_SET_PRIMARY.
config MALI_VALHALL_PRFCNT_SET_TERTIARY
bool "Tertiary"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
help
Select this option to use tertiary set of performance counters. Kernel
features that depend on an access to the primary set of counters may
become unavailable. Enabling this option will prevent power management
from working optimally and may cause instrumentation tools to return
bogus results.
If unsure, use MALI_VALHALL_PRFCNT_SET_PRIMARY.
endchoice
config MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS
bool "Enable runtime selection of performance counters set via debugfs"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT && DEBUG_FS && !MALI_VALHALL_CSF_SUPPORT
default n
help
Select this option to make the secondary set of performance counters
available at runtime via debugfs. Kernel features that depend on an
access to the primary set of counters may become unavailable.
If no runtime debugfs option is set, the build time counter set
choice will be used.
This feature is unsupported and unstable, and may break at any time.
Enabling this option will prevent power management from working
optimally and may cause instrumentation tools to return bogus results.
No validation is done on the debugfs input. Invalid input could cause
performance counter errors. Valid inputs are the values accepted by
the SET_SELECT bits of the PRFCNT_CONFIG register as defined in the
architecture specification.
If unsure, say N.
config MALI_VALHALL_JOB_DUMP
bool "Enable system level support needed for job dumping"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
default n
help
Choose this option to enable system level support needed for
job dumping. This is typically used for instrumentation but has
minimal overhead when not in use. Enable only if you know what
you are doing.
comment "Workarounds"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
config MALI_VALHALL_HW_ERRATA_1485982_NOT_AFFECTED
bool "Disable workaround for KBASE_HW_ISSUE_GPU2017_1336"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT
default n
help
This option disables the default workaround for GPU2017-1336. The
workaround keeps the L2 cache powered up except for powerdown and reset.
The workaround introduces a limitation that will prevent the running of
protected mode content on fully coherent platforms, as the switch to IO
coherency mode requires the L2 to be turned off.
config MALI_VALHALL_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE
bool "Use alternative workaround for KBASE_HW_ISSUE_GPU2017_1336"
depends on MALI_VALHALL && MALI_VALHALL_EXPERT && !MALI_VALHALL_HW_ERRATA_1485982_NOT_AFFECTED
default n
help
This option uses an alternative workaround for GPU2017-1336: lowering
the GPU clock to a platform-specific, known-good frequency before
powering down the L2 cache. The clock can be specified in the device
tree using the opp-mali-errata-1485982 property. Otherwise the
slowest clock will be selected.
endif
config MALI_VALHALL_ARBITRATION
tristate "Enable Virtualization reference code"
depends on MALI_VALHALL
default n
help
Enables the build of several reference modules used in the reference
virtualization setup for Mali.
If unsure, say N.
config MALI_VALHALL_TRACE_POWER_GPU_WORK_PERIOD
bool "Enable per-application GPU metrics tracepoints"
depends on MALI_VALHALL
default y
help
This option enables per-application GPU metrics tracepoints.
If unsure, say N.
config MALI_CSF_INCLUDE_FW
depends on MALI_VALHALL && MALI_VALHALL_CSF_SUPPORT
bool "Whether to include CSF firmware into driver"
default y
# source "$(MALI_KCONFIG_EXT_PREFIX)drivers/gpu/arm/valhall/tests/Kconfig"
endif

@@ -0,0 +1,299 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2010-2024 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
KERNEL_SRC ?= /lib/modules/$(shell uname -r)/build
KDIR ?= $(KERNEL_SRC)
M ?= $(shell pwd)
ifeq ($(KDIR),)
$(error Must specify KDIR to point to the kernel to target)
endif
#
# Default configuration values
#
# Dependency resolution is done through statements as Kconfig
# is not supported for out-of-tree builds.
#
CONFIGS :=
ifeq ($(MALI_KCONFIG_EXT_PREFIX),)
CONFIG_MALI_BIFROST ?= m
ifeq ($(CONFIG_MALI_BIFROST),m)
CONFIG_MALI_VALHALL_PLATFORM_NAME ?= "devicetree"
CONFIG_MALI_VALHALL_TRACE_POWER_GPU_WORK_PERIOD ?= y
CONFIG_MALI_BIFROST_GATOR_SUPPORT ?= y
CONFIG_MALI_VALHALL_ARBITRATION ?= n
CONFIG_MALI_KUTF_PTM_TESTS ?= n
ifneq ($(CONFIG_MALI_BIFROST_NO_MALI),y)
# Prevent misuse when CONFIG_MALI_BIFROST_NO_MALI!=y
CONFIG_MALI_VALHALL_REAL_HW ?= y
else
CONFIG_MALI_VALHALL_CORESIGHT = n
endif
ifeq ($(CONFIG_MALI_BIFROST_DVFS),y)
# Prevent misuse when CONFIG_MALI_BIFROST_DVFS=y
CONFIG_MALI_BIFROST_DEVFREQ ?= n
else
CONFIG_MALI_BIFROST_DEVFREQ ?= y
endif
ifeq ($(CONFIG_MALI_VALHALL_DMA_BUF_MAP_ON_DEMAND), y)
# Prevent misuse when CONFIG_MALI_VALHALL_DMA_BUF_MAP_ON_DEMAND=y
CONFIG_MALI_VALHALL_DMA_BUF_LEGACY_COMPAT = n
endif
ifeq ($(CONFIG_MALI_VALHALL_CSF_SUPPORT), y)
CONFIG_MALI_VALHALL_CORESIGHT ?= n
endif
#
# Expert/Debug/Test released configurations
#
ifeq ($(CONFIG_MALI_BIFROST_EXPERT), y)
ifeq ($(CONFIG_MALI_BIFROST_NO_MALI), y)
CONFIG_MALI_VALHALL_REAL_HW = n
CONFIG_MALI_VALHALL_NO_MALI_DEFAULT_GPU ?= "tMIx"
else
# Prevent misuse when CONFIG_MALI_BIFROST_NO_MALI=n
CONFIG_MALI_VALHALL_REAL_HW = y
endif
ifeq ($(CONFIG_MALI_VALHALL_HW_ERRATA_1485982_NOT_AFFECTED), y)
# Prevent misuse when CONFIG_MALI_VALHALL_HW_ERRATA_1485982_NOT_AFFECTED=y
CONFIG_MALI_VALHALL_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE = n
endif
ifeq ($(CONFIG_MALI_BIFROST_DEBUG), y)
CONFIG_MALI_BIFROST_ENABLE_TRACE ?= y
CONFIG_MALI_BIFROST_SYSTEM_TRACE ?= y
ifeq ($(CONFIG_SYNC_FILE), y)
CONFIG_MALI_BIFROST_FENCE_DEBUG ?= y
else
CONFIG_MALI_BIFROST_FENCE_DEBUG = n
endif
else
# Prevent misuse when CONFIG_MALI_BIFROST_DEBUG=n
CONFIG_MALI_BIFROST_ENABLE_TRACE = n
CONFIG_MALI_BIFROST_SYSTEM_TRACE = n
CONFIG_MALI_BIFROST_FENCE_DEBUG = n
endif
else
# Prevent misuse when CONFIG_MALI_BIFROST_EXPERT=n
CONFIG_MALI_VALHALL_CORESTACK = n
CONFIG_VALHALL_LARGE_PAGE_SUPPORT = y
CONFIG_MALI_VALHALL_JOB_DUMP = n
CONFIG_MALI_BIFROST_NO_MALI = n
CONFIG_MALI_VALHALL_REAL_HW = y
CONFIG_MALI_VALHALL_HW_ERRATA_1485982_NOT_AFFECTED = n
CONFIG_MALI_VALHALL_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE = n
CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS = n
CONFIG_MALI_BIFROST_DEBUG = n
CONFIG_MALI_BIFROST_ENABLE_TRACE = n
CONFIG_MALI_BIFROST_SYSTEM_TRACE = n
CONFIG_MALI_BIFROST_FENCE_DEBUG = n
endif
ifeq ($(CONFIG_MALI_BIFROST_DEBUG), y)
CONFIG_MALI_VALHALL_KUTF ?= y
ifeq ($(CONFIG_MALI_VALHALL_KUTF), y)
CONFIG_MALI_VALHALL_KUTF_IRQ_TEST ?= y
CONFIG_MALI_VALHALL_KUTF_CLK_RATE_TRACE ?= y
CONFIG_MALI_VALHALL_KUTF_MGM_INTEGRATION_TEST ?= y
ifeq ($(CONFIG_MALI_BIFROST_DEVFREQ), y)
ifeq ($(CONFIG_MALI_BIFROST_NO_MALI), y)
CONFIG_MALI_KUTF_IPA_UNIT_TEST ?= y
endif
endif
else
# Prevent misuse when CONFIG_MALI_VALHALL_KUTF=n
CONFIG_MALI_VALHALL_KUTF_IRQ_TEST = n
CONFIG_MALI_VALHALL_KUTF_CLK_RATE_TRACE = n
CONFIG_MALI_VALHALL_KUTF_MGM_INTEGRATION_TEST = n
endif
else
# Prevent misuse when CONFIG_MALI_BIFROST_DEBUG=n
CONFIG_MALI_VALHALL_KUTF = n
CONFIG_MALI_VALHALL_KUTF_IRQ_TEST = n
CONFIG_MALI_VALHALL_KUTF_CLK_RATE_TRACE = n
CONFIG_MALI_VALHALL_KUTF_MGM_INTEGRATION_TEST = n
endif
else
# Prevent misuse when CONFIG_MALI_BIFROST=n
CONFIG_MALI_VALHALL_ARBITRATION = n
CONFIG_MALI_VALHALL_KUTF = n
CONFIG_MALI_VALHALL_KUTF_IRQ_TEST = n
CONFIG_MALI_VALHALL_KUTF_CLK_RATE_TRACE = n
CONFIG_MALI_VALHALL_KUTF_MGM_INTEGRATION_TEST = n
endif
# All Mali CONFIG should be listed here
CONFIGS += \
CONFIG_MALI_BIFROST \
CONFIG_MALI_VALHALL_CSF_SUPPORT \
CONFIG_MALI_BIFROST_GATOR_SUPPORT \
CONFIG_MALI_VALHALL_ARBITRATION \
CONFIG_MALI_KUTF_PTM_TESTS \
CONFIG_MALI_VALHALL_REAL_HW \
CONFIG_MALI_BIFROST_DEVFREQ \
CONFIG_MALI_BIFROST_DVFS \
CONFIG_MALI_VALHALL_DMA_BUF_MAP_ON_DEMAND \
CONFIG_MALI_VALHALL_DMA_BUF_LEGACY_COMPAT \
CONFIG_MALI_BIFROST_EXPERT \
CONFIG_MALI_VALHALL_CORESTACK \
CONFIG_VALHALL_LARGE_PAGE_SUPPORT \
CONFIG_MALI_VALHALL_JOB_DUMP \
CONFIG_MALI_BIFROST_NO_MALI \
CONFIG_MALI_VALHALL_HW_ERRATA_1485982_NOT_AFFECTED \
CONFIG_MALI_VALHALL_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE \
CONFIG_MALI_VALHALL_PRFCNT_SET_PRIMARY \
CONFIG_MALI_BIFROST_PRFCNT_SET_SECONDARY \
CONFIG_MALI_VALHALL_PRFCNT_SET_TERTIARY \
CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS \
CONFIG_MALI_BIFROST_DEBUG \
CONFIG_MALI_BIFROST_ENABLE_TRACE \
CONFIG_MALI_BIFROST_SYSTEM_TRACE \
CONFIG_MALI_BIFROST_FENCE_DEBUG \
CONFIG_MALI_VALHALL_KUTF \
CONFIG_MALI_VALHALL_KUTF_IRQ_TEST \
CONFIG_MALI_VALHALL_KUTF_CLK_RATE_TRACE \
CONFIG_MALI_VALHALL_KUTF_MGM_INTEGRATION_TEST \
CONFIG_MALI_XEN \
CONFIG_MALI_VALHALL_CORESIGHT \
CONFIG_MALI_VALHALL_TRACE_POWER_GPU_WORK_PERIOD
endif
THIS_DIR := $(dir $(lastword $(MAKEFILE_LIST)))
-include $(THIS_DIR)/../arbitration/Makefile
# MAKE_ARGS to pass the custom CONFIGs on out-of-tree build
#
# Generate the list of CONFIGs and values.
# $(value config) is the name of the CONFIG option.
# $(value $(value config)) is its value (y, m).
# When the CONFIG is not set to y or m, it defaults to n.
MAKE_ARGS := $(foreach config,$(CONFIGS), \
$(if $(filter y m,$(value $(value config))), \
$(value config)=$(value $(value config)), \
$(value config)=n))
ifeq ($(MALI_KCONFIG_EXT_PREFIX),)
MAKE_ARGS += CONFIG_MALI_VALHALL_PLATFORM_NAME=$(CONFIG_MALI_VALHALL_PLATFORM_NAME)
endif
#
# EXTRA_CFLAGS to define the custom CONFIGs on out-of-tree build
#
# Generate the list of CONFIGs defines with values from CONFIGS.
# $(value config) is the name of the CONFIG option.
# When set to y or m, the CONFIG gets defined to 1.
EXTRA_CFLAGS := $(foreach config,$(CONFIGS), \
$(if $(filter y m,$(value $(value config))), \
-D$(value config)=1))
ifeq ($(MALI_KCONFIG_EXT_PREFIX),)
EXTRA_CFLAGS += -DCONFIG_MALI_PLATFORM_NAME='\"$(CONFIG_MALI_VALHALL_PLATFORM_NAME)\"'
EXTRA_CFLAGS += -DCONFIG_MALI_NO_MALI_DEFAULT_GPU='\"$(CONFIG_MALI_VALHALL_NO_MALI_DEFAULT_GPU)\"'
endif
#
# KBUILD_EXTRA_SYMBOLS to prevent warnings about unknown functions
#
BASE_SYMBOLS =
EXTRA_SYMBOLS += \
$(BASE_SYMBOLS)
CFLAGS_MODULE += -Wall -Werror
# The following were added to align with W=1 in scripts/Makefile.extrawarn
# from the Linux source tree (v5.18.14)
CFLAGS_MODULE += -Wextra -Wunused -Wno-unused-parameter
CFLAGS_MODULE += -Wmissing-declarations
CFLAGS_MODULE += -Wmissing-format-attribute
CFLAGS_MODULE += -Wmissing-prototypes
CFLAGS_MODULE += -Wold-style-definition
# The -Wmissing-include-dirs cannot be enabled as the path to some of the
# included directories change depending on whether it is an in-tree or
# out-of-tree build.
CFLAGS_MODULE += $(call cc-option, -Wunused-but-set-variable)
CFLAGS_MODULE += $(call cc-option, -Wunused-const-variable)
CFLAGS_MODULE += $(call cc-option, -Wpacked-not-aligned)
CFLAGS_MODULE += $(call cc-option, -Wstringop-truncation)
# The following turn off the warnings enabled by -Wextra
CFLAGS_MODULE += -Wno-sign-compare
CFLAGS_MODULE += -Wno-shift-negative-value
# This flag is needed to avoid build errors on older kernels
CFLAGS_MODULE += $(call cc-option, -Wno-cast-function-type)
# This flag was added on v6.6 kernel
CFLAGS_MODULE += $(call cc-option,-Werror=designated-init)
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN1
# The following were added to align with W=2 in scripts/Makefile.extrawarn
# from the Linux source tree (v5.18.14)
CFLAGS_MODULE += -Wdisabled-optimization
# The -Wshadow flag cannot be enabled unless upstream kernels are
# patched to fix redefinitions of certain built-in functions and
# global variables.
CFLAGS_MODULE += $(call cc-option, -Wlogical-op)
CFLAGS_MODULE += -Wmissing-field-initializers
# -Wtype-limits must be disabled due to build failures on kernel 5.x
CFLAGS_MODULE += -Wno-type-limits
CFLAGS_MODULE += $(call cc-option, -Wmaybe-uninitialized)
CFLAGS_MODULE += $(call cc-option, -Wunused-macros)
# The following ensures the stack frame does not get larger than a page
CFLAGS_MODULE += -Wframe-larger-than=4096
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN2
# This warning is disabled to avoid build failures in some kernel versions
CFLAGS_MODULE += -Wno-ignored-qualifiers
ifeq ($(CONFIG_GCOV_KERNEL),y)
CFLAGS_MODULE += $(call cc-option, -ftest-coverage)
CFLAGS_MODULE += $(call cc-option, -fprofile-arcs)
EXTRA_CFLAGS += -DGCOV_PROFILE=1
endif
ifeq ($(CONFIG_MALI_KCOV),y)
CFLAGS_MODULE += $(call cc-option, -fsanitize-coverage=trace-cmp)
EXTRA_CFLAGS += -DKCOV=1
EXTRA_CFLAGS += -DKCOV_ENABLE_COMPARISONS=1
endif
all:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) EXTRA_CFLAGS="$(EXTRA_CFLAGS)" KBUILD_EXTRA_SYMBOLS="$(EXTRA_SYMBOLS)" modules
modules_install:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) modules_install
clean:
	$(MAKE) -C $(KDIR) M=$(M) $(MAKE_ARGS) clean
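
The EXTRA_CFLAGS rule in the Makefile above defines every enabled option as exactly 1 rather than merely defining it. A plausible reason is that C code usually tests options with a preprocessor idiom that only recognizes the value 1, as the kernel's IS_ENABLED() does. The following is a minimal user-space sketch of that idiom, modelled on the kernel's kconfig.h. CONFIG_DEMO_FEATURE, CONFIG_DEMO_OTHER, CONFIG_IS_SET and the helper macro names are hypothetical and are not part of this driver.

```c
#include <assert.h>

/* Pretend the build passed -DCONFIG_DEMO_FEATURE=1 (hypothetical option). */
#define CONFIG_DEMO_FEATURE 1
/* CONFIG_DEMO_OTHER is deliberately left undefined. */

/*
 * CONFIG_IS_SET(option) evaluates to 1 if the option macro is defined to 1,
 * and to 0 otherwise. If the option expands to 1, ARG_PLACEHOLDER_1 inserts
 * an extra "0," so the second argument seen by TAKE_SECOND_ARG becomes 1.
 * The trailing 0 only keeps the variadic part non-empty.
 */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_1(x) IS_DEFINED_2(x)
#define CONFIG_IS_SET(option) IS_DEFINED_1(option)

static int demo_feature_enabled(void)
{
	return CONFIG_IS_SET(CONFIG_DEMO_FEATURE);
}

static int demo_other_enabled(void)
{
	return CONFIG_IS_SET(CONFIG_DEMO_OTHER);
}

/* Small self-check showing the expected behaviour of the macro. */
static void demo_config_self_check(void)
{
	assert(demo_feature_enabled() == 1);
	assert(demo_other_enabled() == 0);
}
```

Because the result is an ordinary integer constant, it can be used both in `#if` directives and in plain `if` statements, which lets the compiler still type-check the disabled branch.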

@@ -0,0 +1,24 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
valhall_kbase-y += \
arbiter/mali_kbase_arbif.o \
arbiter/mali_kbase_arbiter_pm.o

@@ -0,0 +1,395 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali arbiter interface APIs to share GPU between Virtual Machines
*/
#include <mali_kbase.h>
#include "mali_kbase_arbif.h"
#include <tl/mali_kbase_tracepoints.h>
#include <linux/of.h>
#include <linux/of_platform.h>
#include "linux/mali_arbiter_interface.h"
/* Arbiter interface version against which this module was implemented */
#define MALI_REQUIRED_KBASE_ARBITER_INTERFACE_VERSION 5
#if MALI_REQUIRED_KBASE_ARBITER_INTERFACE_VERSION != MALI_ARBITER_INTERFACE_VERSION
#error "Unsupported Mali Arbiter interface version."
#endif
static void on_max_config(struct device *dev, uint32_t max_l2_slices, uint32_t max_core_mask)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
if (!max_l2_slices || !max_core_mask) {
dev_dbg(dev, "%s(): max_config ignored as one of the fields is zero", __func__);
return;
}
/* set the max config info in the kbase device */
kbase_arbiter_set_max_config(kbdev, max_l2_slices, max_core_mask);
}
/**
* on_update_freq() - Updates GPU clock frequency
* @dev: arbiter interface device handle
* @freq: GPU clock frequency value reported from arbiter
*
* call back function to update GPU clock frequency with
* new value from arbiter
*/
static void on_update_freq(struct device *dev, uint32_t freq)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
kbase_arbiter_pm_update_gpu_freq(&kbdev->arb.arb_freq, freq);
}
/**
* on_gpu_stop() - sends KBASE_VM_GPU_STOP_EVT event on VM stop
* @dev: arbiter interface device handle
*
* call back function to signal a GPU STOP event from arbiter interface
*/
static void on_gpu_stop(struct device *dev)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
KBASE_TLSTREAM_TL_ARBITER_STOP_REQUESTED(kbdev, kbdev);
KBASE_KTRACE_ADD(kbdev, ARB_GPU_STOP_REQUESTED, NULL, 0);
kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_STOP_EVT);
}
/**
* on_gpu_granted() - sends KBASE_VM_GPU_GRANTED_EVT event on GPU granted
* @dev: arbiter interface device handle
*
* call back function to signal a GPU GRANT event from arbiter interface
*/
static void on_gpu_granted(struct device *dev)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
KBASE_TLSTREAM_TL_ARBITER_GRANTED(kbdev, kbdev);
KBASE_KTRACE_ADD(kbdev, ARB_GPU_GRANTED, NULL, 0);
kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_GRANTED_EVT);
}
/**
* on_gpu_lost() - sends KBASE_VM_GPU_LOST_EVT event on GPU lost
* @dev: arbiter interface device handle
*
* call back function to signal a GPU LOST event from arbiter interface
*/
static void on_gpu_lost(struct device *dev)
{
struct kbase_device *kbdev;
if (!dev) {
pr_err("%s(): dev is NULL", __func__);
return;
}
kbdev = dev_get_drvdata(dev);
if (!kbdev) {
dev_err(dev, "%s(): kbdev is NULL", __func__);
return;
}
KBASE_TLSTREAM_TL_ARBITER_LOST(kbdev, kbdev);
KBASE_KTRACE_ADD(kbdev, ARB_GPU_LOST, NULL, 0);
kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_LOST_EVT);
}
static int kbase_arbif_of_init(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if;
struct device_node *arbiter_if_node;
struct platform_device *pdev;
if (!IS_ENABLED(CONFIG_OF)) {
/*
* Return -ENODEV in the event CONFIG_OF is not available and let the
* internal AW check for suitability for arbitration.
*/
return -ENODEV;
}
arbiter_if_node = of_parse_phandle(kbdev->dev->of_node, "arbiter-if", 0);
if (!arbiter_if_node)
arbiter_if_node = of_parse_phandle(kbdev->dev->of_node, "arbiter_if", 0);
if (!arbiter_if_node) {
dev_dbg(kbdev->dev, "No arbiter_if in Device Tree");
/* no arbiter interface defined in device tree */
kbdev->arb.arb_dev = NULL;
kbdev->arb.arb_if = NULL;
return -ENODEV;
}
pdev = of_find_device_by_node(arbiter_if_node);
if (!pdev) {
dev_err(kbdev->dev, "Failed to find arbiter_if device");
return -EPROBE_DEFER;
}
if (!pdev->dev.driver || !try_module_get(pdev->dev.driver->owner)) {
dev_err(kbdev->dev, "arbiter_if driver not available");
put_device(&pdev->dev);
return -EPROBE_DEFER;
}
kbdev->arb.arb_dev = &pdev->dev;
arb_if = platform_get_drvdata(pdev);
if (!arb_if) {
dev_err(kbdev->dev, "arbiter_if driver not ready");
module_put(pdev->dev.driver->owner);
put_device(&pdev->dev);
return -EPROBE_DEFER;
}
kbdev->arb.arb_if = arb_if;
return 0;
}
static void kbase_arbif_of_term(struct kbase_device *kbdev)
{
if (!IS_ENABLED(CONFIG_OF))
return;
if (kbdev->arb.arb_dev) {
module_put(kbdev->arb.arb_dev->driver->owner);
put_device(kbdev->arb.arb_dev);
}
kbdev->arb.arb_dev = NULL;
}
/**
* kbase_arbif_init() - Kbase Arbiter interface initialisation.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Initialise Kbase Arbiter interface and assign callback functions.
*
* Return:
* * 0 - the interface was initialized.
* * -EPERM - arbitration is not available (no arbiter interface in the
* * device tree, or CONFIG_OF is disabled).
* * -EFAULT - the interface was specified but failed to initialize.
* * -EPROBE_DEFER - module dependencies are not yet available.
*/
int kbase_arbif_init(struct kbase_device *kbdev)
{
struct arbiter_if_arb_vm_ops ops;
struct arbiter_if_dev *arb_if;
int err = 0;
/* Tries to init with 'arbiter-if' if present in devicetree */
err = kbase_arbif_of_init(kbdev);
if (err == -ENODEV) {
/* devicetree does not support arbitration */
return -EPERM;
}
if (err)
return err;
ops.arb_vm_gpu_stop = on_gpu_stop;
ops.arb_vm_gpu_granted = on_gpu_granted;
ops.arb_vm_gpu_lost = on_gpu_lost;
ops.arb_vm_max_config = on_max_config;
ops.arb_vm_update_freq = on_update_freq;
kbdev->arb.arb_freq.arb_freq = 0;
kbdev->arb.arb_freq.freq_updated = false;
mutex_init(&kbdev->arb.arb_freq.arb_freq_lock);
arb_if = kbdev->arb.arb_if;
if (arb_if == NULL) {
dev_err(kbdev->dev, "No arbiter interface present");
goto failure_term;
}
if (!arb_if->vm_ops.vm_arb_register_dev) {
dev_err(kbdev->dev, "arbiter_if registration callback not present");
goto failure_term;
}
/* register kbase arbiter_if callbacks */
err = arb_if->vm_ops.vm_arb_register_dev(arb_if, kbdev->dev, &ops);
if (err) {
dev_err(kbdev->dev, "Failed to register with arbiter. (err = %d)", err);
goto failure_term;
}
return 0;
failure_term:
{
kbase_arbif_of_term(kbdev);
}
if (err != -EPROBE_DEFER)
err = -EFAULT;
return err;
}
/**
* kbase_arbif_destroy() - De-init Kbase arbiter interface
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* De-initialise Kbase arbiter interface
*/
void kbase_arbif_destroy(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_unregister_dev)
arb_if->vm_ops.vm_arb_unregister_dev(kbdev->arb.arb_if);
{
kbase_arbif_of_term(kbdev);
}
kbdev->arb.arb_if = NULL;
}
/**
* kbase_arbif_get_max_config() - Request max config info
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* call back function from arb interface to arbiter requesting max config info
*/
void kbase_arbif_get_max_config(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_get_max_config)
arb_if->vm_ops.vm_arb_get_max_config(arb_if);
}
/**
* kbase_arbif_gpu_request() - Request GPU from the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* call back function from arb interface to arbiter requesting GPU for VM
*/
void kbase_arbif_gpu_request(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_request) {
KBASE_TLSTREAM_TL_ARBITER_REQUESTED(kbdev, kbdev);
KBASE_KTRACE_ADD(kbdev, ARB_GPU_REQUESTED, NULL, 0);
arb_if->vm_ops.vm_arb_gpu_request(arb_if);
}
}
/**
* kbase_arbif_gpu_stopped() - send GPU stopped message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @gpu_required: GPU request flag
*
*/
void kbase_arbif_gpu_stopped(struct kbase_device *kbdev, u8 gpu_required)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_stopped) {
KBASE_TLSTREAM_TL_ARBITER_STOPPED(kbdev, kbdev);
KBASE_KTRACE_ADD(kbdev, ARB_GPU_STOPPED, NULL, 0);
if (gpu_required) {
KBASE_TLSTREAM_TL_ARBITER_REQUESTED(kbdev, kbdev);
KBASE_KTRACE_ADD(kbdev, ARB_GPU_REQUESTED, NULL, 0);
}
arb_if->vm_ops.vm_arb_gpu_stopped(arb_if, gpu_required);
}
}
/**
* kbase_arbif_gpu_active() - Sends a GPU_ACTIVE message to the Arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Informs the arbiter VM is active
*/
void kbase_arbif_gpu_active(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_active)
arb_if->vm_ops.vm_arb_gpu_active(arb_if);
}
/**
* kbase_arbif_gpu_idle() - Inform the arbiter that the VM has gone idle
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Informs the arbiter VM is idle
*/
void kbase_arbif_gpu_idle(struct kbase_device *kbdev)
{
struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
if (arb_if && arb_if->vm_ops.vm_arb_gpu_idle)
arb_if->vm_ops.vm_arb_gpu_idle(arb_if);
}
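
The registration path in kbase_arbif_init() above can be modelled in user space. This is a sketch under stated assumptions, not driver code: struct vm_ops_model, register_with_arbiter(), normalize_err() and the stub callbacks are hypothetical names. EPROBE_DEFER is defined locally because it is a kernel-internal error code. The sketch shows two things the function relies on: an optional callback is checked for NULL before it is called, and every failure except -EPROBE_DEFER is collapsed to -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* EPROBE_DEFER is kernel-internal; define it for this user-space model. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for the arbiter's VM-facing operations table. */
struct vm_ops_model {
	int (*register_dev)(void *ctx); /* may be NULL if not provided */
};

/* Keep -EPROBE_DEFER so the probe is retried; report anything else as -EFAULT. */
static int normalize_err(int err)
{
	return (err == -EPROBE_DEFER) ? err : -EFAULT;
}

/* Returns 0 on success, -EPROBE_DEFER or -EFAULT otherwise. */
static int register_with_arbiter(const struct vm_ops_model *ops, void *ctx)
{
	int err;

	/* A missing interface or missing callback is a hard failure. */
	if (ops == NULL || ops->register_dev == NULL)
		return normalize_err(0);

	err = ops->register_dev(ctx);
	return err ? normalize_err(err) : 0;
}

/* Stub callbacks standing in for a real arbiter implementation. */
static int stub_ok(void *ctx) { (void)ctx; return 0; }
static int stub_defer(void *ctx) { (void)ctx; return -EPROBE_DEFER; }
static int stub_fail(void *ctx) { (void)ctx; return -EINVAL; }

/* Small self-check covering each outcome. */
static void arbif_model_self_check(void)
{
	struct vm_ops_model ok = { stub_ok };
	struct vm_ops_model defer = { stub_defer };
	struct vm_ops_model fail = { stub_fail };
	struct vm_ops_model empty = { NULL };

	assert(register_with_arbiter(&ok, NULL) == 0);
	assert(register_with_arbiter(&defer, NULL) == -EPROBE_DEFER);
	assert(register_with_arbiter(&fail, NULL) == -EFAULT);
	assert(register_with_arbiter(&empty, NULL) == -EFAULT);
	assert(register_with_arbiter(NULL, NULL) == -EFAULT);
}
```

Preserving -EPROBE_DEFER matters because the driver core uses it to retry the probe later, once the arbiter interface module has loaded. Other error values carry no such meaning for the caller, so a single -EFAULT is enough.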

@@ -0,0 +1,122 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali arbiter interface APIs to share GPU between Virtual Machines
*/
#ifndef _MALI_KBASE_ARBIF_H_
#define _MALI_KBASE_ARBIF_H_
/**
* enum kbase_arbif_evt - Internal Arbiter event.
*
* @KBASE_VM_GPU_INITIALIZED_EVT: KBase has finished initializing
* and can be stopped
* @KBASE_VM_GPU_STOP_EVT: Stop message received from Arbiter
* @KBASE_VM_GPU_GRANTED_EVT: Grant message received from Arbiter
* @KBASE_VM_GPU_LOST_EVT: Lost message received from Arbiter
* @KBASE_VM_GPU_IDLE_EVENT: KBase has transitioned into an inactive state.
* @KBASE_VM_REF_EVENT: KBase has transitioned into an active state.
* @KBASE_VM_OS_SUSPEND_EVENT: KBase is suspending
* @KBASE_VM_OS_RESUME_EVENT: Kbase is resuming
*/
enum kbase_arbif_evt {
KBASE_VM_GPU_INITIALIZED_EVT = 1,
KBASE_VM_GPU_STOP_EVT,
KBASE_VM_GPU_GRANTED_EVT,
KBASE_VM_GPU_LOST_EVT,
KBASE_VM_GPU_IDLE_EVENT,
KBASE_VM_REF_EVENT,
KBASE_VM_OS_SUSPEND_EVENT,
KBASE_VM_OS_RESUME_EVENT,
};
/**
* kbase_arbif_init() - Initialize the arbiter interface functionality.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Initializes the arbiter interface and determines whether Arbiter
* functionality is required.
*
* Return:
* * 0 - the interface was initialized or was not specified
*   in the device tree.
* * -EFAULT - the interface was specified but failed to initialize.
* * -EPROBE_DEFER - module dependencies are not yet available.
*/
int kbase_arbif_init(struct kbase_device *kbdev);
/**
* kbase_arbif_destroy() - Clean up the arbiter interface functionality.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Cleans up the arbiter interface functionality and resets the reference count
* of the arbif module used
*/
void kbase_arbif_destroy(struct kbase_device *kbdev);
/**
* kbase_arbif_get_max_config() - Request max config info
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Callback from the arbiter interface to the arbiter requesting max config info.
*/
void kbase_arbif_get_max_config(struct kbase_device *kbdev);
/**
* kbase_arbif_gpu_request() - Send GPU request message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a message to Arbiter to request GPU access.
*/
void kbase_arbif_gpu_request(struct kbase_device *kbdev);
/**
* kbase_arbif_gpu_stopped() - Send GPU stopped message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @gpu_required: true if GPU access is still required
* (Arbiter will automatically send another grant message)
*
* Sends a message to Arbiter to notify that the GPU has stopped.
* @note Once this call has been made, KBase must not attempt to access the GPU
* until the #KBASE_VM_GPU_GRANTED_EVT event has been received.
*/
void kbase_arbif_gpu_stopped(struct kbase_device *kbdev, u8 gpu_required);
/**
* kbase_arbif_gpu_active() - Send a GPU active message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a message to Arbiter to report that KBase has gone active.
*/
void kbase_arbif_gpu_active(struct kbase_device *kbdev);
/**
* kbase_arbif_gpu_idle() - Send a GPU idle message to the arbiter
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Sends a message to Arbiter to report that KBase has gone idle.
*/
void kbase_arbif_gpu_idle(struct kbase_device *kbdev);
#endif /* _MALI_KBASE_ARBIF_H_ */


@@ -0,0 +1,76 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali structure definitions to support the arbitration feature
*/
#ifndef _MALI_KBASE_ARBITER_DEFS_H_
#define _MALI_KBASE_ARBITER_DEFS_H_
#include "mali_kbase_arbiter_pm.h"
/**
* struct kbase_arbiter_vm_state - Struct representing the state and containing the
* data of pm work
* @kbdev: Pointer to kbase device structure (must be a valid pointer)
* @vm_state_lock: The lock protecting the VM state when arbiter is used.
* This lock must also be held whenever the VM state is being
* transitioned
* @vm_state_wait: Wait queue set when GPU is granted
* @vm_state: Current state of VM
* @vm_arb_wq: Work queue for resuming or stopping work on the GPU for use
* with the Arbiter
* @vm_suspend_work: Work item for vm_arb_wq to stop current work on GPU
* @vm_resume_work: Work item for vm_arb_wq to resume current work on GPU
* @vm_arb_starting: Work queue resume in progress
* @vm_arb_stopping: Work queue suspend in progress
* @interrupts_installed: Flag set when interrupts are installed
* @vm_request_timer: Timer to monitor GPU request
*/
struct kbase_arbiter_vm_state {
struct kbase_device *kbdev;
struct mutex vm_state_lock;
wait_queue_head_t vm_state_wait;
enum kbase_vm_state vm_state;
struct workqueue_struct *vm_arb_wq;
struct work_struct vm_suspend_work;
struct work_struct vm_resume_work;
bool vm_arb_starting;
bool vm_arb_stopping;
bool interrupts_installed;
struct hrtimer vm_request_timer;
};
/**
* struct kbase_arbiter_device - Representing an instance of arbiter device,
* allocated from the probe method of Mali driver
* @arb_if: Pointer to the arbiter interface device
* @arb_dev: Pointer to the arbiter device
* @arb_freq: GPU clock frequency retrieved from arbiter.
*/
struct kbase_arbiter_device {
struct arbiter_if_dev *arb_if;
struct device *arb_dev;
struct kbase_arbiter_freq arb_freq;
};
#endif /* _MALI_KBASE_ARBITER_DEFS_H_ */

File diff suppressed because it is too large


@@ -0,0 +1,195 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/**
* DOC: Mali arbiter power manager state machine and APIs
*/
#ifndef _MALI_KBASE_ARBITER_PM_H_
#define _MALI_KBASE_ARBITER_PM_H_
#include "mali_kbase_arbif.h"
/**
* enum kbase_vm_state - Current PM Arbitration state.
*
* @KBASE_VM_STATE_INITIALIZING: Special state before arbiter is initialized.
* @KBASE_VM_STATE_INITIALIZING_WITH_GPU: Initialization after GPU
* has been granted.
* @KBASE_VM_STATE_SUSPENDED: KBase is suspended by OS and GPU is not assigned.
* @KBASE_VM_STATE_STOPPED: GPU is not assigned to KBase and is not required.
* @KBASE_VM_STATE_STOPPED_GPU_REQUESTED: GPU is not assigned to KBase
* but a request has been made.
* @KBASE_VM_STATE_STARTING: GPU is assigned and KBase is getting ready to run.
* @KBASE_VM_STATE_IDLE: GPU is assigned but KBase has no work to do
* @KBASE_VM_STATE_ACTIVE: GPU is assigned and KBase is busy using it
* @KBASE_VM_STATE_SUSPEND_PENDING: OS is going into suspend mode.
* @KBASE_VM_STATE_SUSPEND_WAIT_FOR_GRANT: OS is going into suspend mode but GPU
* has already been requested.
* In this situation we must wait for
* the Arbiter to send a GRANTED message
* and respond immediately with
* a STOPPED message before entering
* the suspend mode.
* @KBASE_VM_STATE_STOPPING_IDLE: Arbiter has sent a stop message and there
* is currently no work to do on the GPU.
* @KBASE_VM_STATE_STOPPING_ACTIVE: Arbiter has sent a stop message when
* KBase has work to do.
*/
enum kbase_vm_state {
KBASE_VM_STATE_INITIALIZING,
KBASE_VM_STATE_INITIALIZING_WITH_GPU,
KBASE_VM_STATE_SUSPENDED,
KBASE_VM_STATE_STOPPED,
KBASE_VM_STATE_STOPPED_GPU_REQUESTED,
KBASE_VM_STATE_STARTING,
KBASE_VM_STATE_IDLE,
KBASE_VM_STATE_ACTIVE,
KBASE_VM_STATE_SUSPEND_PENDING,
KBASE_VM_STATE_SUSPEND_WAIT_FOR_GRANT,
KBASE_VM_STATE_STOPPING_IDLE,
KBASE_VM_STATE_STOPPING_ACTIVE
};
/**
* kbase_arbiter_pm_early_init() - Initialize arbiter for VM Paravirtualized use
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Initialize the arbiter and other required resources during the runtime
* and request the GPU for the VM for the first time.
*
* Return: 0 if successful, otherwise a standard Linux error code
*/
int kbase_arbiter_pm_early_init(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_early_term() - Shutdown arbiter and free resources.
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Clean up all the resources
*/
void kbase_arbiter_pm_early_term(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_release_interrupts() - Release the GPU interrupts
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Releases interrupts and sets the interrupts_installed flag to false.
*/
void kbase_arbiter_pm_release_interrupts(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_install_interrupts() - Install the GPU interrupts
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Installs interrupts and sets the interrupts_installed flag to true.
*
* Return: 0 if success or already installed. Otherwise a Linux error code
*/
int kbase_arbiter_pm_install_interrupts(struct kbase_device *kbdev);
/**
* kbase_arbiter_pm_vm_event() - Dispatch VM event to the state machine
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @event: The event to dispatch
*
* The state machine function. Receives events and transitions states
* according to the event received and the current state.
*/
void kbase_arbiter_pm_vm_event(struct kbase_device *kbdev, enum kbase_arbif_evt event);
/**
* kbase_arbiter_pm_ctx_active_handle_suspend() - Handle suspend operation for
* arbitration mode
* @kbdev: The kbase device structure for the device (must be a valid pointer)
* @suspend_handler: The handler code for how to handle a suspend
* that might occur
* @sched_lock_held: Flag variable that tells whether the caller grabs the
* scheduler lock or not
*
* This function handles a suspend event from the driver,
* communicating with the arbiter and waiting synchronously for the GPU
* to be granted again depending on the VM state.
*
* Return: 0 if success, 1 if failure due to system suspending/suspended
*/
int kbase_arbiter_pm_ctx_active_handle_suspend(struct kbase_device *kbdev,
enum kbase_pm_suspend_handler suspend_handler,
bool sched_lock_held);
/**
* kbase_arbiter_pm_vm_stopped() - Handle stop event for the VM
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* This function handles a stop event for the VM.
* It will update the VM state and forward the stop event to the driver.
*/
void kbase_arbiter_pm_vm_stopped(struct kbase_device *kbdev);
/**
* kbase_arbiter_set_max_config() - Set the max config data in kbase device.
* @kbdev: The kbase device structure for the device (must be a valid pointer).
* @max_l2_slices: The maximum number of L2 slices.
* @max_core_mask: The largest core mask.
*
* This function stores the maximum configuration data (number of L2 slices
* and core mask) in the kbase device.
*/
void kbase_arbiter_set_max_config(struct kbase_device *kbdev, uint32_t max_l2_slices,
uint32_t max_core_mask);
/**
* kbase_arbiter_pm_gpu_assigned() - Determine if this VM has access to the GPU
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Return: 0 if the VM does not have access, 1 if it does, and a negative number
* if an error occurred
*/
int kbase_arbiter_pm_gpu_assigned(struct kbase_device *kbdev);
extern struct kbase_clk_rate_trace_op_conf arb_clk_rate_trace_ops;
/**
* struct kbase_arbiter_freq - Holding the GPU clock frequency data retrieved
* from arbiter
* @arb_freq: GPU clock frequency value
* @arb_freq_lock: Mutex protecting access to the arb_freq value
* @nb: Notifier block to receive rate change callbacks
* @freq_updated: Flag to indicate whether a frequency change has just been
* communicated to avoid "GPU_GRANTED when not expected" warning
*/
struct kbase_arbiter_freq {
uint32_t arb_freq;
struct mutex arb_freq_lock;
struct notifier_block *nb;
bool freq_updated;
};
/**
* kbase_arbiter_pm_update_gpu_freq() - Update GPU frequency
* @arb_freq: Pointer to GPU clock frequency data
* @freq: The new frequency
*
* Updates the GPU frequency and triggers any notifications
*/
void kbase_arbiter_pm_update_gpu_freq(struct kbase_arbiter_freq *arb_freq, uint32_t freq);
#endif /*_MALI_KBASE_ARBITER_PM_H_ */
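
The state descriptions in `enum kbase_vm_state` imply a small event-driven state machine. The sketch below is a simplified userspace model of a few transitions inferred from those descriptions only. The real dispatcher is in the source file whose diff is suppressed on this page and handles more states, events and side effects. All `DEMO_*` names are hypothetical.

```c
#include <assert.h>

/* Reduced, hypothetical copy of the VM states described above. */
enum demo_state {
	DEMO_STOPPED,
	DEMO_STOPPED_GPU_REQUESTED,
	DEMO_STARTING,
	DEMO_IDLE,
	DEMO_ACTIVE,
	DEMO_STOPPING_IDLE,
	DEMO_STOPPING_ACTIVE
};

/* Reduced, hypothetical copy of the arbiter events. */
enum demo_event {
	DEMO_EVT_STOP,    /* arbiter asks the VM to stop using the GPU */
	DEMO_EVT_GRANTED, /* arbiter grants the GPU */
	DEMO_EVT_IDLE,    /* driver ran out of work */
	DEMO_EVT_REF      /* driver has work again */
};

/* Pure transition function: returns the next state and leaves the state
 * unchanged for any (state, event) pair this sketch does not model.
 */
static enum demo_state demo_vm_event(enum demo_state state, enum demo_event event)
{
	assert(state <= DEMO_STOPPING_ACTIVE);

	switch (event) {
	case DEMO_EVT_GRANTED:
		if (state == DEMO_STOPPED_GPU_REQUESTED)
			return DEMO_STARTING;
		break;
	case DEMO_EVT_STOP:
		if (state == DEMO_IDLE)
			return DEMO_STOPPING_IDLE;
		if (state == DEMO_ACTIVE)
			return DEMO_STOPPING_ACTIVE;
		break;
	case DEMO_EVT_IDLE:
		if (state == DEMO_ACTIVE)
			return DEMO_IDLE;
		if (state == DEMO_STOPPING_ACTIVE)
			return DEMO_STOPPING_IDLE;
		break;
	case DEMO_EVT_REF:
		if (state == DEMO_IDLE)
			return DEMO_ACTIVE;
		if (state == DEMO_STOPPING_IDLE)
			return DEMO_STOPPING_ACTIVE;
		break;
	}
	return state;
}
```

Keeping the transition function free of side effects makes each transition easy to check in isolation. The driver version additionally takes `vm_state_lock` and sends messages to the arbiter.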


@@ -0,0 +1,53 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
valhall_kbase-y += \
backend/gpu/mali_kbase_cache_policy_backend.o \
backend/gpu/mali_kbase_gpuprops_backend.o \
backend/gpu/mali_kbase_irq_linux.o \
backend/gpu/mali_kbase_pm_backend.o \
backend/gpu/mali_kbase_pm_driver.o \
backend/gpu/mali_kbase_pm_metrics.o \
backend/gpu/mali_kbase_pm_ca.o \
backend/gpu/mali_kbase_pm_always_on.o \
backend/gpu/mali_kbase_pm_coarse_demand.o \
backend/gpu/mali_kbase_pm_policy.o \
backend/gpu/mali_kbase_time.o \
backend/gpu/mali_kbase_l2_mmu_config.o \
backend/gpu/mali_kbase_clk_rate_trace_mgr.o
ifeq ($(MALI_USE_CSF),0)
valhall_kbase-y += \
backend/gpu/mali_kbase_instr_backend.o \
backend/gpu/mali_kbase_jm_as.o \
backend/gpu/mali_kbase_debug_job_fault_backend.o \
backend/gpu/mali_kbase_jm_hw.o \
backend/gpu/mali_kbase_jm_rb.o \
backend/gpu/mali_kbase_js_backend.o
endif
valhall_kbase-$(CONFIG_MALI_VALHALL_DEVFREQ) += \
backend/gpu/mali_kbase_devfreq.o
valhall_kbase-$(CONFIG_MALI_VALHALL_NO_MALI) += backend/gpu/mali_kbase_model_linux.o
# NO_MALI Dummy model interface
valhall_kbase-$(CONFIG_MALI_VALHALL_NO_MALI) += backend/gpu/mali_kbase_model_dummy.o


@@ -0,0 +1,64 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include "backend/gpu/mali_kbase_cache_policy_backend.h"
#include <device/mali_kbase_device.h>
void kbase_cache_set_coherency_mode(struct kbase_device *kbdev, u32 mode)
{
kbdev->current_gpu_coherency_mode = mode;
#if MALI_USE_CSF
if (kbdev->gpu_props.gpu_id.arch_id >= GPU_ID_ARCH_MAKE(12, 0, 1)) {
/* AMBA_ENABLE present from 12.0.1 */
u32 val = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE));
val = AMBA_ENABLE_COHERENCY_PROTOCOL_SET(val, mode);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE), val);
} else {
/* Fallback to COHERENCY_ENABLE for older versions */
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(COHERENCY_ENABLE), mode);
}
#else /* MALI_USE_CSF */
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(COHERENCY_ENABLE), mode);
#endif /* MALI_USE_CSF */
}
void kbase_amba_set_shareable_cache_support(struct kbase_device *kbdev)
{
#if MALI_USE_CSF
/* AMBA registers only present from 12.0.1 */
if (kbdev->gpu_props.gpu_id.arch_id < GPU_ID_ARCH_MAKE(12, 0, 1))
return;
if (kbdev->system_coherency != COHERENCY_NONE) {
u32 val = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AMBA_FEATURES));
if (AMBA_FEATURES_SHAREABLE_CACHE_SUPPORT_GET(val)) {
val = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE));
val = AMBA_ENABLE_SHAREABLE_CACHE_SUPPORT_SET(val, 1);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(AMBA_ENABLE), val);
}
}
#endif /* MALI_USE_CSF */
}
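
The coherency setter above either does a read-modify-write of one field in `AMBA_ENABLE` (architecture 12.0.1 and later) or writes `COHERENCY_ENABLE` whole. The userspace sketch below models that choice with two simulated registers. The field positions, the version packing and all `demo_*` / `DEMO_*` names are assumptions for illustration, not the real register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 5-bit coherency protocol field at bits [4:0],
 * 1-bit shareable-cache flag at bit 5. The real AMBA_ENABLE layout may differ.
 */
#define DEMO_COHERENCY_SHIFT 0u
#define DEMO_COHERENCY_MASK (0x1Fu << DEMO_COHERENCY_SHIFT)
#define DEMO_SHAREABLE_SHIFT 5u
#define DEMO_SHAREABLE_MASK (0x1u << DEMO_SHAREABLE_SHIFT)

/* Simulated registers standing in for AMBA_ENABLE and COHERENCY_ENABLE. */
static uint32_t demo_amba_enable;
static uint32_t demo_coherency_enable;

/* Replace one field of a register value and keep all other bits. */
static uint32_t demo_field_set(uint32_t reg, uint32_t mask, unsigned int shift, uint32_t value)
{
	assert(((value << shift) & ~mask) == 0); /* value must fit in the field */
	return (reg & ~mask) | ((value << shift) & mask);
}

/* Pack major/minor/revision so versions compare numerically (hypothetical packing). */
static uint32_t demo_arch_make(uint32_t major, uint32_t minor, uint32_t rev)
{
	return (major << 16) | (minor << 8) | rev;
}

/* Pick the register according to the architecture version. */
static void demo_set_coherency_mode(uint32_t arch_id, uint32_t mode)
{
	if (arch_id >= demo_arch_make(12, 0, 1)) {
		/* Newer parts: update only the protocol field. */
		demo_amba_enable = demo_field_set(demo_amba_enable, DEMO_COHERENCY_MASK,
						  DEMO_COHERENCY_SHIFT, mode);
	} else {
		/* Older parts: the dedicated register is written whole. */
		demo_coherency_enable = mode;
	}
}
```

The read-modify-write matters because the same register also carries the shareable-cache flag, which a plain write would clear.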


@@ -0,0 +1,45 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CACHE_POLICY_BACKEND_H_
#define _KBASE_CACHE_POLICY_BACKEND_H_
#include <linux/types.h>
struct kbase_device;
/**
* kbase_cache_set_coherency_mode() - Sets the system coherency mode
* in the GPU.
* @kbdev: Device pointer
* @mode: Coherency mode. COHERENCY_ACE/ACE_LITE
*/
void kbase_cache_set_coherency_mode(struct kbase_device *kbdev, u32 mode);
/**
* kbase_amba_set_shareable_cache_support() - Sets AMBA shareable cache support
* in the GPU.
* @kbdev: Device pointer
*
* Note: Only for arch version 12.x.1 onwards.
*/
void kbase_amba_set_shareable_cache_support(struct kbase_device *kbdev);
#endif /* _KBASE_CACHE_POLICY_BACKEND_H_ */


@@ -0,0 +1,325 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2020-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Implementation of the GPU clock rate trace manager.
*/
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include <linux/clk.h>
#include <linux/pm_opp.h>
#include <asm/div64.h>
#include "backend/gpu/mali_kbase_clk_rate_trace_mgr.h"
#ifdef CONFIG_TRACE_POWER_GPU_FREQUENCY
#include <trace/events/power_gpu_frequency.h>
#else
#include "mali_power_gpu_frequency_trace.h"
#endif
#ifndef CLK_RATE_TRACE_OPS
#define CLK_RATE_TRACE_OPS (NULL)
#endif
/**
* get_clk_rate_trace_callbacks() - Returns pointer to clk trace ops.
* @kbdev: Pointer to kbase device, used to check if arbitration is enabled
* when compiled with arbiter support.
* Return: Pointer to clk trace ops if supported or NULL.
*/
static struct kbase_clk_rate_trace_op_conf *
get_clk_rate_trace_callbacks(__maybe_unused struct kbase_device *kbdev)
{
/* base case */
const void *arbiter_if_node;
struct kbase_clk_rate_trace_op_conf *callbacks =
(struct kbase_clk_rate_trace_op_conf *)CLK_RATE_TRACE_OPS;
/* Nothing left to do here if there is no Arbiter/virtualization or if
* CONFIG_OF is not enabled.
*/
if (!IS_ENABLED(CONFIG_OF))
return callbacks;
if (WARN_ON(!kbdev) || WARN_ON(!kbdev->dev))
return callbacks;
if (!kbase_has_arbiter(kbdev))
return callbacks;
arbiter_if_node = of_get_property(kbdev->dev->of_node, "arbiter-if", NULL);
if (!arbiter_if_node)
arbiter_if_node = of_get_property(kbdev->dev->of_node, "arbiter_if", NULL);
/* Arbitration enabled, override the callback pointer.*/
if (arbiter_if_node)
callbacks = &arb_clk_rate_trace_ops;
else
dev_dbg(kbdev->dev,
"Arbitration supported but disabled by platform. Leaving clk rate callbacks as default.\n");
return callbacks;
}
static int gpu_clk_rate_change_notifier(struct notifier_block *nb, unsigned long event, void *data)
{
struct kbase_gpu_clk_notifier_data *ndata = data;
struct kbase_clk_data *clk_data =
container_of(nb, struct kbase_clk_data, clk_rate_change_nb);
struct kbase_clk_rate_trace_manager *clk_rtm = clk_data->clk_rtm;
unsigned long flags;
if (WARN_ON_ONCE(clk_data->gpu_clk_handle != ndata->gpu_clk_handle))
return NOTIFY_BAD;
spin_lock_irqsave(&clk_rtm->lock, flags);
if (event == POST_RATE_CHANGE) {
if (!clk_rtm->gpu_idle && (clk_data->clock_val != ndata->new_rate)) {
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index,
ndata->new_rate);
}
clk_data->clock_val = ndata->new_rate;
}
spin_unlock_irqrestore(&clk_rtm->lock, flags);
return NOTIFY_DONE;
}
static int gpu_clk_data_init(struct kbase_device *kbdev, void *gpu_clk_handle, unsigned int index)
{
struct kbase_clk_rate_trace_op_conf *callbacks;
struct kbase_clk_data *clk_data;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
int ret = 0;
callbacks = get_clk_rate_trace_callbacks(kbdev);
if (WARN_ON(!callbacks) || WARN_ON(!gpu_clk_handle) ||
WARN_ON(index >= BASE_MAX_NR_CLOCKS_REGULATORS))
return -EINVAL;
clk_data = kzalloc(sizeof(*clk_data), GFP_KERNEL);
if (!clk_data) {
dev_err(kbdev->dev, "Failed to allocate data for clock enumerated at index %u",
index);
return -ENOMEM;
}
clk_data->index = (u8)index;
clk_data->gpu_clk_handle = gpu_clk_handle;
/* Store the initial value of clock */
clk_data->clock_val = callbacks->get_gpu_clk_rate(kbdev, gpu_clk_handle);
{
/* At the initialization time, GPU is powered off. */
unsigned long flags;
spin_lock_irqsave(&clk_rtm->lock, flags);
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index, 0);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
clk_data->clk_rtm = clk_rtm;
clk_rtm->clks[index] = clk_data;
clk_data->clk_rate_change_nb.notifier_call = gpu_clk_rate_change_notifier;
if (callbacks->gpu_clk_notifier_register)
ret = callbacks->gpu_clk_notifier_register(kbdev, gpu_clk_handle,
&clk_data->clk_rate_change_nb);
if (ret) {
dev_err(kbdev->dev, "Failed to register notifier for clock enumerated at index %u",
index);
/* Do not leave a dangling pointer in the clock table. */
clk_rtm->clks[index] = NULL;
kfree(clk_data);
}
return ret;
}
int kbase_clk_rate_trace_manager_init(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_op_conf *callbacks;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
int ret = 0;
callbacks = get_clk_rate_trace_callbacks(kbdev);
spin_lock_init(&clk_rtm->lock);
INIT_LIST_HEAD(&clk_rtm->listeners);
/* Return early if no callbacks provided for clock rate tracing */
if (!callbacks) {
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
return 0;
}
clk_rtm->gpu_idle = true;
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
void *gpu_clk_handle = callbacks->enumerate_gpu_clk(kbdev, i);
if (!gpu_clk_handle)
break;
ret = gpu_clk_data_init(kbdev, gpu_clk_handle, i);
if (ret)
goto error;
}
/* Activate clock rate trace manager if at least one GPU clock was
* enumerated.
*/
if (i) {
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, callbacks);
} else {
dev_info(kbdev->dev, "No clock(s) available for rate tracing");
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
}
return 0;
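error:
while (i--) {
/* clk_rtm->clk_rate_trace_ops has not been set on this path, so use
 * the local callbacks pointer and check the optional unregister hook.
 */
if (callbacks->gpu_clk_notifier_unregister)
callbacks->gpu_clk_notifier_unregister(
kbdev, clk_rtm->clks[i]->gpu_clk_handle,
&clk_rtm->clks[i]->clk_rate_change_nb);
kfree(clk_rtm->clks[i]);
clk_rtm->clks[i] = NULL;
}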
return ret;
}
void kbase_clk_rate_trace_manager_term(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
WARN_ON(!list_empty(&clk_rtm->listeners));
if (!clk_rtm->clk_rate_trace_ops)
return;
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
if (!clk_rtm->clks[i])
break;
if (clk_rtm->clk_rate_trace_ops->gpu_clk_notifier_unregister)
clk_rtm->clk_rate_trace_ops->gpu_clk_notifier_unregister(
kbdev, clk_rtm->clks[i]->gpu_clk_handle,
&clk_rtm->clks[i]->clk_rate_change_nb);
kfree(clk_rtm->clks[i]);
}
WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
}
void kbase_clk_rate_trace_manager_gpu_active(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
unsigned long flags;
if (!clk_rtm->clk_rate_trace_ops)
return;
spin_lock_irqsave(&clk_rtm->lock, flags);
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
struct kbase_clk_data *clk_data = clk_rtm->clks[i];
if (!clk_data)
break;
if (unlikely(!clk_data->clock_val))
continue;
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index,
clk_data->clock_val);
}
clk_rtm->gpu_idle = false;
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
void kbase_clk_rate_trace_manager_gpu_idle(struct kbase_device *kbdev)
{
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
unsigned int i;
unsigned long flags;
if (!clk_rtm->clk_rate_trace_ops)
return;
spin_lock_irqsave(&clk_rtm->lock, flags);
for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
struct kbase_clk_data *clk_data = clk_rtm->clks[i];
if (!clk_data)
break;
if (unlikely(!clk_data->clock_val))
continue;
kbase_clk_rate_trace_manager_notify_all(clk_rtm, clk_data->index, 0);
}
clk_rtm->gpu_idle = true;
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
void kbase_clk_rate_trace_manager_notify_all(struct kbase_clk_rate_trace_manager *clk_rtm,
u32 clk_index, unsigned long new_rate)
{
struct kbase_clk_rate_listener *pos;
struct kbase_device *kbdev;
lockdep_assert_held(&clk_rtm->lock);
kbdev = container_of(clk_rtm, struct kbase_device, pm.clk_rtm);
dev_dbg(kbdev->dev, "%s - GPU clock %u rate changed to %lu, pid: %d", __func__, clk_index,
new_rate, current->pid);
/* Raise standard `power/gpu_frequency` ftrace event */
{
unsigned long new_rate_khz = new_rate;
#if BITS_PER_LONG == 64
do_div(new_rate_khz, 1000);
#elif BITS_PER_LONG == 32
new_rate_khz /= 1000;
#else
#error "unsigned long division is not supported for this architecture"
#endif
trace_gpu_frequency(new_rate_khz, clk_index);
}
/* Notify the listeners. */
list_for_each_entry(pos, &clk_rtm->listeners, node) {
pos->notify(pos, clk_index, new_rate);
}
}
KBASE_EXPORT_TEST_API(kbase_clk_rate_trace_manager_notify_all);
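
The manager above remembers every rate change but notifies listeners only while the GPU is active, reports rate 0 when the GPU goes idle, and traces the rate in kHz. The sketch below models that behaviour for a single clock in userspace, without locking or kernel lists. All `demo_*` names and the `last_traced_khz` field are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical listener: a callback plus bookkeeping for the example. */
struct demo_listener {
	void (*notify)(struct demo_listener *self, unsigned int clk_index,
		       unsigned long rate_hz);
	struct demo_listener *next;
	unsigned long last_rate_hz;
	unsigned int calls;
};

/* Hypothetical single-clock manager. */
struct demo_clk_mgr {
	struct demo_listener *listeners; /* singly linked, newest first */
	int gpu_idle;                    /* non-zero while the GPU is idle */
	unsigned long clock_val;         /* last known rate in Hz */
	unsigned long last_traced_khz;   /* stands in for the ftrace event */
};

/* Example callback that records what it was told. */
static void demo_record(struct demo_listener *self, unsigned int clk_index,
			unsigned long rate_hz)
{
	(void)clk_index;
	self->last_rate_hz = rate_hz;
	self->calls++;
}

static void demo_subscribe(struct demo_clk_mgr *mgr, struct demo_listener *l)
{
	assert(l != NULL && l->notify != NULL);
	l->next = mgr->listeners;
	mgr->listeners = l;
}

/* Trace the rate in kHz, then tell every listener the rate in Hz. */
static void demo_notify_all(struct demo_clk_mgr *mgr, unsigned int clk_index,
			    unsigned long rate_hz)
{
	struct demo_listener *pos;

	mgr->last_traced_khz = rate_hz / 1000;
	for (pos = mgr->listeners; pos; pos = pos->next)
		pos->notify(pos, clk_index, rate_hz);
}

/* Rate change: always remembered, announced only if active and different. */
static void demo_rate_changed(struct demo_clk_mgr *mgr, unsigned long new_rate_hz)
{
	if (!mgr->gpu_idle && mgr->clock_val != new_rate_hz)
		demo_notify_all(mgr, 0, new_rate_hz);
	mgr->clock_val = new_rate_hz;
}

/* GPU became active: announce the remembered rate. */
static void demo_clk_gpu_active(struct demo_clk_mgr *mgr)
{
	if (mgr->clock_val)
		demo_notify_all(mgr, 0, mgr->clock_val);
	mgr->gpu_idle = 0;
}

/* GPU became idle: announce rate 0. */
static void demo_clk_gpu_idle(struct demo_clk_mgr *mgr)
{
	if (mgr->clock_val)
		demo_notify_all(mgr, 0, 0);
	mgr->gpu_idle = 1;
}
```

A listener that subscribes while the GPU is idle therefore hears nothing until the next active transition, which then delivers the most recent rate.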


@@ -0,0 +1,150 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2020-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CLK_RATE_TRACE_MGR_
#define _KBASE_CLK_RATE_TRACE_MGR_
/* The index of top clock domain in kbase_clk_rate_trace_manager:clks. */
#define KBASE_CLOCK_DOMAIN_TOP (0)
/* The index of shader-cores clock domain in
* kbase_clk_rate_trace_manager:clks.
*/
#define KBASE_CLOCK_DOMAIN_SHADER_CORES (1)
/**
* struct kbase_clk_data - Data stored per enumerated GPU clock.
*
* @clk_rtm: Pointer to clock rate trace manager object.
* @gpu_clk_handle: Handle unique to the enumerated GPU clock.
* @plat_private: Private data for the platform to store into
* @clk_rate_change_nb: notifier block containing the pointer to callback
* function that is invoked whenever the rate of
* enumerated GPU clock changes.
* @clock_val: Current rate of the enumerated GPU clock.
* @index: Index at which the GPU clock was enumerated.
*/
struct kbase_clk_data {
struct kbase_clk_rate_trace_manager *clk_rtm;
void *gpu_clk_handle;
void *plat_private;
struct notifier_block clk_rate_change_nb;
unsigned long clock_val;
u8 index;
};
/**
* kbase_clk_rate_trace_manager_init - Initialize GPU clock rate trace manager.
*
* @kbdev: Device pointer
*
* Return: 0 if success, or an error code on failure.
*/
int kbase_clk_rate_trace_manager_init(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_term - Terminate GPU clock rate trace manager.
*
* @kbdev: Device pointer
*/
void kbase_clk_rate_trace_manager_term(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_gpu_active - Inform GPU clock rate trace
* manager of GPU becoming active.
*
* @kbdev: Device pointer
*/
void kbase_clk_rate_trace_manager_gpu_active(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_gpu_idle - Inform GPU clock rate trace
* manager of GPU becoming idle.
* @kbdev: Device pointer
*/
void kbase_clk_rate_trace_manager_gpu_idle(struct kbase_device *kbdev);
/**
* kbase_clk_rate_trace_manager_subscribe_no_lock() - Add freq change listener.
*
* @clk_rtm: Clock rate manager instance.
* @listener: Listener handle
*
* kbase_clk_rate_trace_manager:lock must be held by the caller.
*/
static inline void
kbase_clk_rate_trace_manager_subscribe_no_lock(struct kbase_clk_rate_trace_manager *clk_rtm,
struct kbase_clk_rate_listener *listener)
{
lockdep_assert_held(&clk_rtm->lock);
list_add(&listener->node, &clk_rtm->listeners);
}
/**
* kbase_clk_rate_trace_manager_subscribe() - Add freq change listener.
*
* @clk_rtm: Clock rate manager instance.
* @listener: Listener handle
*/
static inline void
kbase_clk_rate_trace_manager_subscribe(struct kbase_clk_rate_trace_manager *clk_rtm,
struct kbase_clk_rate_listener *listener)
{
unsigned long flags;
spin_lock_irqsave(&clk_rtm->lock, flags);
kbase_clk_rate_trace_manager_subscribe_no_lock(clk_rtm, listener);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
/**
* kbase_clk_rate_trace_manager_unsubscribe() - Remove freq change listener.
*
* @clk_rtm: Clock rate manager instance.
* @listener: Listener handle
*/
static inline void
kbase_clk_rate_trace_manager_unsubscribe(struct kbase_clk_rate_trace_manager *clk_rtm,
struct kbase_clk_rate_listener *listener)
{
unsigned long flags;
spin_lock_irqsave(&clk_rtm->lock, flags);
list_del(&listener->node);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
/**
* kbase_clk_rate_trace_manager_notify_all() - Notify all clock
* rate listeners.
*
* @clk_rtm: Clock rate manager instance.
* @clock_index: Clock index.
* @new_rate: New clock frequency(Hz)
*
* kbase_clk_rate_trace_manager:lock must be locked.
* This function is exported to be used by clock rate trace test
* portal.
*/
void kbase_clk_rate_trace_manager_notify_all(struct kbase_clk_rate_trace_manager *clk_rtm,
u32 clock_index, unsigned long new_rate);
#endif /* _KBASE_CLK_RATE_TRACE_MGR_ */


@@ -0,0 +1,167 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <device/mali_kbase_device.h>
#include <hw_access/mali_kbase_hw_access.h>
#include "mali_kbase_debug_job_fault.h"
#if IS_ENABLED(CONFIG_DEBUG_FS)
/* GPU_CONTROL_REG(r) */
static unsigned int gpu_control_reg_snapshot[] = { GPU_CONTROL_ENUM(GPU_ID),
GPU_CONTROL_ENUM(SHADER_READY),
GPU_CONTROL_ENUM(TILER_READY),
GPU_CONTROL_ENUM(L2_READY) };
/* JOB_CONTROL_REG(r) */
static unsigned int job_control_reg_snapshot[] = { JOB_CONTROL_ENUM(JOB_IRQ_MASK),
JOB_CONTROL_ENUM(JOB_IRQ_STATUS) };
/* JOB_SLOT_REG(n,r) */
static unsigned int job_slot_reg_snapshot[] = {
JOB_SLOT_ENUM(0, HEAD) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, TAIL) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, AFFINITY) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, CONFIG) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, STATUS) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, HEAD_NEXT) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, AFFINITY_NEXT) - JOB_SLOT_BASE_ENUM(0),
JOB_SLOT_ENUM(0, CONFIG_NEXT) - JOB_SLOT_BASE_ENUM(0)
};
/* MMU_CONTROL_REG(r) */
static unsigned int mmu_reg_snapshot[] = { MMU_CONTROL_ENUM(IRQ_MASK),
MMU_CONTROL_ENUM(IRQ_STATUS) };
/* MMU_AS_REG(n,r) */
static unsigned int as_reg_snapshot[] = { MMU_AS_ENUM(0, TRANSTAB) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, TRANSCFG) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, MEMATTR) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, FAULTSTATUS) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, FAULTADDRESS) - MMU_AS_BASE_ENUM(0),
MMU_AS_ENUM(0, STATUS) - MMU_AS_BASE_ENUM(0) };
bool kbase_debug_job_fault_reg_snapshot_init(struct kbase_context *kctx, int reg_range)
{
uint i, j;
int offset = 0;
uint slot_number;
uint as_number;
if (kctx->reg_dump == NULL)
return false;
slot_number = kctx->kbdev->gpu_props.num_job_slots;
as_number = kctx->kbdev->gpu_props.num_address_spaces;
/* get the GPU control registers*/
for (i = 0; i < ARRAY_SIZE(gpu_control_reg_snapshot); i++) {
kctx->reg_dump[offset] = gpu_control_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
/* get the Job control registers*/
for (i = 0; i < ARRAY_SIZE(job_control_reg_snapshot); i++) {
kctx->reg_dump[offset] = job_control_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
/* get the Job Slot registers*/
for (j = 0; j < slot_number; j++) {
for (i = 0; i < ARRAY_SIZE(job_slot_reg_snapshot); i++) {
kctx->reg_dump[offset] = JOB_SLOT_BASE_OFFSET(j) + job_slot_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
}
/* get the MMU registers*/
for (i = 0; i < ARRAY_SIZE(mmu_reg_snapshot); i++) {
kctx->reg_dump[offset] = mmu_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
/* get the Address space registers*/
for (j = 0; j < as_number; j++) {
for (i = 0; i < ARRAY_SIZE(as_reg_snapshot); i++) {
kctx->reg_dump[offset] = MMU_AS_BASE_OFFSET(j) + as_reg_snapshot[i];
if (kbase_reg_is_size64(kctx->kbdev, kctx->reg_dump[offset]))
offset += 4;
else
offset += 2;
}
}
WARN_ON(offset >= (reg_range * 2 / 4));
/* set the termination flag*/
kctx->reg_dump[offset] = REGISTER_DUMP_TERMINATION_FLAG;
kctx->reg_dump[offset + 1] = REGISTER_DUMP_TERMINATION_FLAG;
dev_dbg(kctx->kbdev->dev, "kbase_job_fault_reg_snapshot_init:%d\n", offset);
return true;
}
bool kbase_job_fault_get_reg_snapshot(struct kbase_context *kctx)
{
int offset = 0;
u32 reg_enum;
u64 val64;
if (kctx->reg_dump == NULL)
return false;
while (kctx->reg_dump[offset] != REGISTER_DUMP_TERMINATION_FLAG) {
reg_enum = kctx->reg_dump[offset];
/* Get register offset from enum */
kbase_reg_get_offset(kctx->kbdev, reg_enum, &kctx->reg_dump[offset]);
if (kbase_reg_is_size64(kctx->kbdev, reg_enum)) {
val64 = kbase_reg_read64(kctx->kbdev, reg_enum);
/* Add 4 to the computed offset to get the offset of the _HI half */
kctx->reg_dump[offset + 2] = kctx->reg_dump[offset] + 4;
kctx->reg_dump[offset + 1] = (u32)(val64 & 0xFFFFFFFF);
kctx->reg_dump[offset + 3] = (u32)(val64 >> 32);
offset += 4;
} else {
kctx->reg_dump[offset + 1] = kbase_reg_read32(kctx->kbdev, reg_enum);
offset += 2;
}
}
return true;
}
#endif
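
The register dump built by the two functions above is a flat array of 32-bit words: a 32-bit register takes two words (offset, value), a 64-bit register takes four (low offset, low value, high offset, high value), and two terminator words close the list. The following userspace sketch models that layout under stated assumptions: `reg_desc`, `fake_read` and the sentinel value are hypothetical, and the register id doubles as its offset, whereas the driver first translates an enum into an offset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DUMP_TERMINATION_FLAG 0xFFFFFFFFu /* assumed sentinel value */

/* Hypothetical register descriptor: an offset-like id and its width. */
struct reg_desc {
	uint32_t id;
	bool is_64bit;
};

/* Words occupied by one register: (offset, value) or two such pairs. */
size_t dump_words_for(const struct reg_desc *r)
{
	return r->is_64bit ? 4 : 2;
}

/*
 * Write each register id at the start of its entry, then two terminator
 * words. Returns the number of words used, or 0 if the buffer is too small.
 */
size_t dump_init(uint32_t *dump, size_t cap, const struct reg_desc *regs, size_t n)
{
	size_t off = 0, i;

	for (i = 0; i < n; i++) {
		size_t w = dump_words_for(&regs[i]);

		/* Keep room for this entry and the two terminator words. */
		if (off + w + 2 > cap)
			return 0;
		dump[off] = regs[i].id;
		off += w;
	}
	if (off + 2 > cap)
		return 0;
	dump[off] = DUMP_TERMINATION_FLAG;
	dump[off + 1] = DUMP_TERMINATION_FLAG;
	return off + 2;
}

/* Hypothetical register read: derives a recognisable value from the id. */
uint64_t fake_read(uint32_t id)
{
	return ((uint64_t)id << 32) | (id + 1u);
}

/* Fill in the value words, splitting 64-bit registers into LO and HI pairs. */
void dump_fill(uint32_t *dump, const struct reg_desc *regs, size_t n)
{
	size_t off = 0, i;

	for (i = 0; i < n && dump[off] != DUMP_TERMINATION_FLAG; i++) {
		uint64_t v = fake_read(regs[i].id);

		if (regs[i].is_64bit) {
			dump[off + 1] = (uint32_t)(v & 0xFFFFFFFFu); /* low half */
			dump[off + 2] = dump[off] + 4; /* offset of the high half */
			dump[off + 3] = (uint32_t)(v >> 32); /* high half */
			off += 4;
		} else {
			dump[off + 1] = (uint32_t)v;
			off += 2;
		}
	}
}
```

Reserving the slots during initialisation means the fill step never has to allocate or resize; it only walks the pre-laid-out entries until it reaches the terminator.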

@@ -0,0 +1,135 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific definitions
*/
#ifndef _KBASE_HWACCESS_GPU_DEFS_H_
#define _KBASE_HWACCESS_GPU_DEFS_H_
/* SLOT_RB_SIZE must be < 256 */
#define SLOT_RB_SIZE 2
#define SLOT_RB_MASK (SLOT_RB_SIZE - 1)
/**
* struct rb_entry - Ringbuffer entry
* @katom: Atom associated with this entry
*/
struct rb_entry {
struct kbase_jd_atom *katom;
};
/* SLOT_RB_TAG_PURGED assumes a value that is different from
* NULL (SLOT_RB_NULL_TAG_VAL) and will not be the result of
* any valid pointer via macro translation: SLOT_RB_TAG_KCTX(x).
*/
#define SLOT_RB_TAG_PURGED ((u64)(1 << 1))
#define SLOT_RB_NULL_TAG_VAL ((u64)0)
/**
* SLOT_RB_TAG_KCTX() - a function-like macro for converting a pointer to a
* u64 for serving as tagged value.
* @kctx: Pointer to kbase context.
*/
#define SLOT_RB_TAG_KCTX(kctx) ((u64)(uintptr_t)(kctx))
/**
* struct slot_rb - Slot ringbuffer
* @entries: Ringbuffer entries
* @last_kctx_tagged: The last context that submitted a job to the slot's
* HEAD_NEXT register. The value is a tagged variant so
* must not be dereferenced. It is used in operation to
* track when shader core L1 caches might contain a
* previous context's data, and so must only be set to
* SLOT_RB_NULL_TAG_VAL after reset/powerdown of the
* cores. In slot job submission, if there is a kctx
* change, and the relevant katom is configured with
* BASE_JD_REQ_SKIP_CACHE_START, an L1 read-only cache
* maintenance operation is enforced.
* @read_idx: Current read index of buffer
* @write_idx: Current write index of buffer
* @job_chain_flag: Flag used to implement jobchain disambiguation
*/
struct slot_rb {
struct rb_entry entries[SLOT_RB_SIZE];
u64 last_kctx_tagged;
u8 read_idx;
u8 write_idx;
u8 job_chain_flag;
};
/**
* struct kbase_backend_data - GPU backend specific data for HW access layer
* @slot_rb: Slot ringbuffers
* @scheduling_timer: The timer tick used for rescheduling jobs
* @timer_running: Is the timer running? The runpool_mutex must be
* held whilst modifying this.
* @suspend_timer: Is the timer suspended? Set when a suspend
* occurs and cleared on resume. The runpool_mutex
* must be held whilst modifying this.
* @reset_gpu: Set to a KBASE_RESET_xxx value (see comments)
* @reset_workq: Work queue for performing the reset
* @reset_work: Work item for performing the reset
* @reset_wait: Wait event signalled when the reset is complete
* @reset_timer: Timeout for soft-stops before the reset
* @timeouts_updated: Have timeout values just been updated?
*
* The hwaccess_lock (a spinlock) must be held when accessing this structure
*/
struct kbase_backend_data {
#if !MALI_USE_CSF
struct slot_rb slot_rb[BASE_JM_MAX_NR_SLOTS];
struct hrtimer scheduling_timer;
bool timer_running;
#endif
bool suspend_timer;
atomic_t reset_gpu;
/* The GPU reset isn't pending */
#define KBASE_RESET_GPU_NOT_PENDING 0
/* kbase_prepare_to_reset_gpu has been called */
#define KBASE_RESET_GPU_PREPARED 1
/* kbase_reset_gpu has been called - the reset will now definitely happen
* within the timeout period
*/
#define KBASE_RESET_GPU_COMMITTED 2
/* The GPU reset process is currently occurring (timeout has expired or
* kbasep_try_reset_gpu_early was called)
*/
#define KBASE_RESET_GPU_HAPPENING 3
/* Reset the GPU silently, used when resetting the GPU as part of normal
* behavior (e.g. when exiting protected mode).
*/
#define KBASE_RESET_GPU_SILENT 4
struct workqueue_struct *reset_workq;
struct work_struct reset_work;
wait_queue_head_t reset_wait;
struct hrtimer reset_timer;
bool timeouts_updated;
};
#endif /* _KBASE_HWACCESS_GPU_DEFS_H_ */
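
The `u8` indices and `SLOT_RB_MASK` in `struct slot_rb` suggest the usual free-running-index ring buffer. The sketch below shows one common way to use such fields and is an assumption, not the driver's implementation: `ring`, `ring_push` and `ring_pop` are hypothetical names. It also shows why the size must be a power of two below 256: masking then stays consistent when the 8-bit indices wrap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 2u /* must be a power of two and < 256 */
#define RB_MASK (RB_SIZE - 1u)

/* Hypothetical ring buffer with free-running 8-bit indices. */
struct ring {
	void *entries[RB_SIZE];
	uint8_t read_idx;
	uint8_t write_idx;
};

/* Number of queued entries; the 8-bit subtraction wraps modulo 256. */
unsigned int ring_count(const struct ring *rb)
{
	return (uint8_t)(rb->write_idx - rb->read_idx);
}

/* Append an item; returns false if the buffer is full. */
bool ring_push(struct ring *rb, void *item)
{
	if (ring_count(rb) >= RB_SIZE)
		return false;
	rb->entries[rb->write_idx & RB_MASK] = item;
	rb->write_idx++;
	return true;
}

/* Remove the oldest item; returns NULL if the buffer is empty. */
void *ring_pop(struct ring *rb)
{
	void *item;

	if (ring_count(rb) == 0)
		return NULL;
	item = rb->entries[rb->read_idx & RB_MASK];
	rb->read_idx++;
	return item;
}
```

Because the indices are never reset, an empty buffer (`read_idx == write_idx`) and a full one (difference equal to `RB_SIZE`) are distinguishable without a separate counter.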

@@ -0,0 +1,747 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU licence.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
* SPDX-License-Identifier: GPL-2.0
*
*/
#include <mali_kbase.h>
#include <tl/mali_kbase_tracepoints.h>
#include <backend/gpu/mali_kbase_devfreq.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <linux/of.h>
#include <linux/clk.h>
#include <linux/clk-provider.h>
#include <linux/devfreq.h>
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
#include <linux/devfreq_cooling.h>
#endif
#include <linux/pm_domain.h>
#include <linux/version.h>
#include <linux/pm_opp.h>
#include <linux/pm_runtime.h>
#include "mali_kbase_devfreq.h"
#include <soc/rockchip/rockchip_ipa.h>
#include <soc/rockchip/rockchip_opp_select.h>
#include <soc/rockchip/rockchip_system_monitor.h>
static struct devfreq_simple_ondemand_data ondemand_data;
static struct monitor_dev_profile mali_mdevp = {
.type = MONITOR_TYPE_DEV,
.low_temp_adjust = rockchip_monitor_dev_low_temp_adjust,
.high_temp_adjust = rockchip_monitor_dev_high_temp_adjust,
.check_rate_volt = rockchip_monitor_check_rate_volt,
};
/**
* get_voltage() - Get the voltage value corresponding to the nominal frequency
* used by devfreq.
* @kbdev: Device pointer
* @freq: Nominal frequency in Hz passed by devfreq.
*
* This function will be called only when the opp table which is compatible with
* "operating-points-v2-mali", is not present in the devicetree for GPU device.
*
* Return: Voltage value in micro volts, 0 in case of error.
*/
static unsigned long get_voltage(struct kbase_device *kbdev, unsigned long freq)
{
struct dev_pm_opp *opp;
unsigned long voltage = 0;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_lock();
#endif
opp = dev_pm_opp_find_freq_exact(kbdev->dev, freq, true);
if (IS_ERR_OR_NULL(opp))
dev_err(kbdev->dev, "Failed to get opp (%d)\n", PTR_ERR_OR_ZERO(opp));
else {
voltage = dev_pm_opp_get_voltage(opp);
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
dev_pm_opp_put(opp);
#endif
}
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_unlock();
#endif
/* Return the voltage in micro volts */
return voltage;
}
void kbase_devfreq_opp_translate(struct kbase_device *kbdev, unsigned long freq, u64 *core_mask,
unsigned long *freqs, unsigned long *volts)
{
unsigned int i;
for (i = 0; i < kbdev->num_opps; i++) {
if (kbdev->devfreq_table[i].opp_freq == freq) {
unsigned int j;
*core_mask = kbdev->devfreq_table[i].core_mask;
for (j = 0; j < kbdev->nr_clocks; j++) {
freqs[j] = kbdev->devfreq_table[i].real_freqs[j];
volts[j] = kbdev->devfreq_table[i].opp_volts[j];
}
break;
}
}
/* If failed to find OPP, return all cores enabled
* and nominal frequency and the corresponding voltage.
*/
if (i == kbdev->num_opps) {
unsigned long voltage = get_voltage(kbdev, freq);
*core_mask = kbdev->gpu_props.shader_present;
for (i = 0; i < kbdev->nr_clocks; i++) {
freqs[i] = freq;
volts[i] = voltage;
}
}
}
static int kbase_devfreq_target(struct device *dev, unsigned long *freq, u32 flags)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
struct rockchip_opp_info *opp_info = &kbdev->opp_info;
struct dev_pm_opp *opp;
int err;
int ret = 0;
if (!opp_info->is_rate_volt_checked)
return -EINVAL;
opp = devfreq_recommended_opp(dev, freq, flags);
if (IS_ERR(opp))
return PTR_ERR(opp);
dev_pm_opp_put(opp);
err = dev_pm_genpd_set_performance_state(kbdev->dev, *freq);
/* For ENODEV or EOPNOTSUPP do not return error code */
if (err && !((err == -ENODEV) || (err == -EOPNOTSUPP))) {
dev_err(dev, "Failed to set opp (%d) (target %lu)\n", err, *freq);
return err;
}
if (*freq == kbdev->current_nominal_freq)
return 0;
rockchip_opp_dvfs_lock(opp_info);
if (pm_runtime_active(dev))
opp_info->is_runtime_active = true;
else
opp_info->is_runtime_active = false;
ret = dev_pm_opp_set_rate(dev, *freq);
if (!ret) {
kbdev->current_nominal_freq = *freq;
KBASE_TLSTREAM_AUX_DEVFREQ_TARGET(kbdev, (u64)*freq);
}
rockchip_opp_dvfs_unlock(opp_info);
return ret;
}
void kbase_devfreq_force_freq(struct kbase_device *kbdev, unsigned long freq)
{
unsigned long target_freq = freq;
kbase_devfreq_target(kbdev->dev, &target_freq, 0);
}
static int kbase_devfreq_cur_freq(struct device *dev, unsigned long *freq)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
*freq = kbdev->current_nominal_freq;
return 0;
}
static int kbase_devfreq_status(struct device *dev, struct devfreq_dev_status *stat)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
struct kbasep_pm_metrics diff;
kbase_pm_get_dvfs_metrics(kbdev, &kbdev->last_devfreq_metrics, &diff);
stat->busy_time = diff.time_busy;
stat->total_time = diff.time_busy + diff.time_idle;
stat->current_frequency = kbdev->current_nominal_freq;
stat->private_data = NULL;
#if MALI_USE_CSF && defined CONFIG_DEVFREQ_THERMAL
if (!kbdev->devfreq_profile.is_cooling_device)
kbase_ipa_reset_data(kbdev);
#endif
return 0;
}
static int kbase_devfreq_init_freq_table(struct kbase_device *kbdev, struct devfreq_dev_profile *dp)
{
int count;
unsigned int i = 0;
unsigned long freq;
struct dev_pm_opp *opp;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_lock();
#endif
count = dev_pm_opp_get_opp_count(kbdev->dev);
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_unlock();
#endif
if (count < 0)
return count;
dp->freq_table = kmalloc_array((size_t)count, sizeof(dp->freq_table[0]), GFP_KERNEL);
if (!dp->freq_table)
return -ENOMEM;
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_lock();
#endif
for (i = 0, freq = ULONG_MAX; i < (unsigned int)count; i++, freq--) {
opp = dev_pm_opp_find_freq_floor(kbdev->dev, &freq);
if (IS_ERR(opp))
break;
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
dev_pm_opp_put(opp);
#endif /* KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE */
dp->freq_table[i] = freq;
}
#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
rcu_read_unlock();
#endif
if ((unsigned int)count != i)
dev_warn(kbdev->dev, "Unable to enumerate all OPPs (%d!=%u)\n", count, i);
dp->max_state = i;
/* Have the lowest clock as suspend clock.
* It may be overridden by 'opp-mali-errata-1485982'.
*/
if (kbdev->pm.backend.gpu_clock_slow_down_wa) {
freq = 0;
opp = dev_pm_opp_find_freq_ceil(kbdev->dev, &freq);
if (IS_ERR(opp)) {
dev_err(kbdev->dev, "failed to find slowest clock");
return 0;
}
dev_pm_opp_put(opp);
dev_info(kbdev->dev, "suspend clock %lu from slowest", freq);
kbdev->pm.backend.gpu_clock_suspend_freq = freq;
}
return 0;
}
static void kbase_devfreq_term_freq_table(struct kbase_device *kbdev)
{
struct devfreq_dev_profile *dp = &kbdev->devfreq_profile;
kfree(dp->freq_table);
dp->freq_table = NULL;
}
static void kbase_devfreq_term_core_mask_table(struct kbase_device *kbdev)
{
kfree(kbdev->devfreq_table);
kbdev->devfreq_table = NULL;
}
static void kbase_devfreq_exit(struct device *dev)
{
struct kbase_device *kbdev = dev_get_drvdata(dev);
if (kbdev)
kbase_devfreq_term_freq_table(kbdev);
}
static void kbasep_devfreq_read_suspend_clock(struct kbase_device *kbdev, struct device_node *node)
{
u64 freq = 0;
int err = 0;
/* Check if this node is the opp entry having 'opp-mali-errata-1485982'
* to get the suspend clock, otherwise skip it.
*/
if (!of_property_read_bool(node, "opp-mali-errata-1485982"))
return;
/* In kbase DevFreq, the clock will be read from 'opp-hz'
* and translated into the actual clock by opp_translate.
*
* In customer DVFS, the clock will be read from 'opp-hz-real'
* for clk driver. If 'opp-hz-real' does not exist,
* read from 'opp-hz'.
*/
if (IS_ENABLED(CONFIG_MALI_VALHALL_DEVFREQ))
err = of_property_read_u64(node, "opp-hz", &freq);
else {
if (of_property_read_u64(node, "opp-hz-real", &freq))
err = of_property_read_u64(node, "opp-hz", &freq);
}
if (WARN_ON(err || !freq))
return;
kbdev->pm.backend.gpu_clock_suspend_freq = freq;
dev_info(kbdev->dev, "suspend clock %llu by opp-mali-errata-1485982", freq);
}
static int kbase_devfreq_init_core_mask_table(struct kbase_device *kbdev)
{
#ifndef CONFIG_OF
/* OPP table initialization requires at least the capability to get
* regulators and clocks from the device tree, as well as parsing
* arrays of unsigned integer values.
*
* The whole initialization process shall simply be skipped if the
* minimum capability is not available.
*/
return 0;
#else
struct device_node *opp_node =
of_parse_phandle(kbdev->dev->of_node, "operating-points-v2", 0);
struct device_node *node;
unsigned int i = 0;
int count;
u64 shader_present = kbdev->gpu_props.shader_present;
if (!opp_node)
return 0;
if (!of_device_is_compatible(opp_node, "operating-points-v2-mali"))
return 0;
count = dev_pm_opp_get_opp_count(kbdev->dev);
kbdev->devfreq_table =
kmalloc_array((size_t)count, sizeof(struct kbase_devfreq_opp), GFP_KERNEL);
if (!kbdev->devfreq_table)
return -ENOMEM;
for_each_available_child_of_node(opp_node, node) {
const void *core_count_p;
u64 core_mask, opp_freq, real_freqs[BASE_MAX_NR_CLOCKS_REGULATORS];
int err;
#if IS_ENABLED(CONFIG_REGULATOR)
u32 opp_volts[BASE_MAX_NR_CLOCKS_REGULATORS];
#endif
/* Read suspend clock from opp table */
if (kbdev->pm.backend.gpu_clock_slow_down_wa)
kbasep_devfreq_read_suspend_clock(kbdev, node);
err = of_property_read_u64(node, "opp-hz", &opp_freq);
if (err) {
dev_warn(kbdev->dev, "Failed to read opp-hz property with error %d\n", err);
continue;
}
#if BASE_MAX_NR_CLOCKS_REGULATORS > 1
err = of_property_read_u64_array(node, "opp-hz-real", real_freqs, kbdev->nr_clocks);
#else
WARN_ON(kbdev->nr_clocks != 1);
err = of_property_read_u64(node, "opp-hz-real", real_freqs);
#endif
if (err < 0) {
dev_warn(kbdev->dev, "Failed to read opp-hz-real property with error %d",
err);
continue;
}
#if IS_ENABLED(CONFIG_REGULATOR)
err = of_property_read_u32_array(node, "opp-microvolt", opp_volts,
kbdev->nr_regulators);
if (err < 0) {
dev_warn(kbdev->dev, "Failed to read opp-microvolt property with error %d",
err);
continue;
}
#endif
if (of_property_read_u64(node, "opp-core-mask", &core_mask))
core_mask = shader_present;
if (core_mask != shader_present && corestack_driver_control) {
dev_warn(
kbdev->dev,
"Ignoring OPP %llu - Dynamic Core Scaling not supported on this GPU",
opp_freq);
continue;
}
#if MALI_USE_CSF
if (kbase_csf_dev_has_ne(kbdev)) {
u64 neural_present = kbdev->gpu_props.neural_present;
u64 sc_with_ne = shader_present & neural_present;
if (!sc_with_ne) {
dev_err(kbdev->dev,
"No shader cores with NE cores present in configuration with NE!");
continue;
}
if ((neural_present & shader_present) != neural_present) {
dev_err(kbdev->dev,
"Detected NE core without a corresponding shader core: NEURAL_PRESENT %llx SHADER_PRESENT %llx",
neural_present, shader_present);
}
if (!(core_mask & sc_with_ne)) {
dev_err(kbdev->dev,
"Ignoring OPP %u - No shader cores with NE cores present in the given core mask %llx",
i, core_mask);
continue;
}
}
#endif /* MALI_USE_CSF */
core_count_p = of_get_property(node, "opp-core-count", NULL);
if (core_count_p) {
u64 remaining_core_mask = kbdev->gpu_props.shader_present;
int core_count = be32_to_cpup(core_count_p);
core_mask = 0;
for (; core_count > 0; core_count--) {
int core = ffs(remaining_core_mask);
if (!core) {
dev_err(kbdev->dev, "OPP has more cores than GPU\n");
return -ENODEV;
}
core_mask |= (1ull << (core - 1));
remaining_core_mask &= ~(1ull << (core - 1));
}
}
if (!core_mask) {
dev_err(kbdev->dev, "OPP has invalid core mask of 0\n");
return -ENODEV;
}
kbdev->devfreq_table[i].opp_freq = opp_freq;
kbdev->devfreq_table[i].core_mask = core_mask;
if (kbdev->nr_clocks > 0) {
unsigned int j;
for (j = 0; j < kbdev->nr_clocks; j++)
kbdev->devfreq_table[i].real_freqs[j] = real_freqs[j];
}
#if IS_ENABLED(CONFIG_REGULATOR)
if (kbdev->nr_regulators > 0) {
unsigned int j;
for (j = 0; j < kbdev->nr_regulators; j++)
kbdev->devfreq_table[i].opp_volts[j] = opp_volts[j];
}
#endif
dev_info(kbdev->dev, "OPP %u : opp_freq=%llu core_mask=%llx\n", i, opp_freq,
core_mask);
i++;
}
kbdev->num_opps = i;
return 0;
#endif /* CONFIG_OF */
}
static const char *kbase_devfreq_req_type_name(enum kbase_devfreq_work_type type)
{
const char *p;
switch (type) {
case DEVFREQ_WORK_NONE:
p = "devfreq_none";
break;
case DEVFREQ_WORK_SUSPEND:
p = "devfreq_suspend";
break;
case DEVFREQ_WORK_RESUME:
p = "devfreq_resume";
break;
default:
p = "Unknown devfreq_type";
}
return p;
}
static void kbase_devfreq_suspend_resume_worker(struct work_struct *work)
{
struct kbase_devfreq_queue_info *info =
container_of(work, struct kbase_devfreq_queue_info, work);
struct kbase_device *kbdev = container_of(info, struct kbase_device, devfreq_queue);
unsigned long flags;
enum kbase_devfreq_work_type type, acted_type;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
type = kbdev->devfreq_queue.req_type;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
acted_type = kbdev->devfreq_queue.acted_type;
dev_dbg(kbdev->dev, "Worker handles queued req: %s (acted: %s)\n",
kbase_devfreq_req_type_name(type), kbase_devfreq_req_type_name(acted_type));
switch (type) {
case DEVFREQ_WORK_SUSPEND:
case DEVFREQ_WORK_RESUME:
if (type != acted_type) {
if (type == DEVFREQ_WORK_RESUME)
devfreq_resume_device(kbdev->devfreq);
else
devfreq_suspend_device(kbdev->devfreq);
dev_dbg(kbdev->dev, "Devfreq transition occurred: %s => %s\n",
kbase_devfreq_req_type_name(acted_type),
kbase_devfreq_req_type_name(type));
kbdev->devfreq_queue.acted_type = type;
}
break;
default:
WARN_ON(1);
}
}
void kbase_devfreq_enqueue_work(struct kbase_device *kbdev, enum kbase_devfreq_work_type work_type)
{
unsigned long flags;
WARN_ON(work_type == DEVFREQ_WORK_NONE);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
/* Skip enqueuing a work if workqueue has already been terminated. */
if (likely(kbdev->devfreq_queue.workq)) {
kbdev->devfreq_queue.req_type = work_type;
queue_work(kbdev->devfreq_queue.workq, &kbdev->devfreq_queue.work);
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
dev_dbg(kbdev->dev, "Enqueuing devfreq req: %s\n", kbase_devfreq_req_type_name(work_type));
}
static int kbase_devfreq_work_init(struct kbase_device *kbdev)
{
kbdev->devfreq_queue.req_type = DEVFREQ_WORK_NONE;
kbdev->devfreq_queue.acted_type = DEVFREQ_WORK_RESUME;
kbdev->devfreq_queue.workq = alloc_ordered_workqueue("devfreq_workq", 0);
if (!kbdev->devfreq_queue.workq)
return -ENOMEM;
INIT_WORK(&kbdev->devfreq_queue.work, kbase_devfreq_suspend_resume_worker);
return 0;
}
static void kbase_devfreq_work_term(struct kbase_device *kbdev)
{
unsigned long flags;
struct workqueue_struct *workq;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
workq = kbdev->devfreq_queue.workq;
kbdev->devfreq_queue.workq = NULL;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
destroy_workqueue(workq);
}
int kbase_devfreq_init(struct kbase_device *kbdev)
{
struct device_node *np = kbdev->dev->of_node;
struct devfreq_dev_profile *dp;
int err;
struct dev_pm_opp *opp;
unsigned int dyn_power_coeff = 0;
unsigned int i;
bool free_devfreq_freq_table = true;
if (kbdev->nr_clocks == 0) {
dev_err(kbdev->dev, "Clock not available for devfreq\n");
return -ENODEV;
}
for (i = 0; i < kbdev->nr_clocks; i++) {
if (kbdev->clocks[i])
kbdev->current_freqs[i] = clk_get_rate(kbdev->clocks[i]);
}
kbdev->current_nominal_freq = kbdev->current_freqs[0];
opp = devfreq_recommended_opp(kbdev->dev, &kbdev->current_nominal_freq, 0);
if (IS_ERR(opp))
return PTR_ERR(opp);
dev_pm_opp_put(opp);
dp = &kbdev->devfreq_profile;
dp->initial_freq = kbdev->current_nominal_freq;
dp->polling_ms = 100;
dp->target = kbase_devfreq_target;
dp->get_dev_status = kbase_devfreq_status;
dp->get_cur_freq = kbase_devfreq_cur_freq;
dp->exit = kbase_devfreq_exit;
if (kbase_devfreq_init_freq_table(kbdev, dp))
return -EFAULT;
if (dp->max_state > 0) {
/* Record the maximum frequency possible */
kbdev->gpu_props.gpu_freq_khz_max = dp->freq_table[0] / 1000;
}
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
of_property_read_u32(kbdev->dev->of_node, "dynamic-power-coefficient",
&dyn_power_coeff);
if (dyn_power_coeff)
dp->is_cooling_device = true;
#endif
err = kbase_devfreq_init_core_mask_table(kbdev);
if (err)
goto init_core_mask_table_failed;
of_property_read_u32(np, "upthreshold",
&ondemand_data.upthreshold);
of_property_read_u32(np, "downdifferential",
&ondemand_data.downdifferential);
kbdev->devfreq = devfreq_add_device(kbdev->dev, dp, "simple_ondemand", &ondemand_data);
if (IS_ERR(kbdev->devfreq)) {
err = PTR_ERR(kbdev->devfreq);
kbdev->devfreq = NULL;
dev_err(kbdev->dev, "Fail to add devfreq device(%d)", err);
goto devfreq_add_dev_failed;
}
/* Explicit free of freq table isn't needed after devfreq_add_device() */
free_devfreq_freq_table = false;
/* Initialize devfreq suspend/resume workqueue */
err = kbase_devfreq_work_init(kbdev);
if (err) {
dev_err(kbdev->dev, "Fail to init devfreq workqueue");
goto devfreq_work_init_failed;
}
/* devfreq_add_device only copies a few of kbdev->dev's fields, so
* set drvdata explicitly so IPA models can access kbdev.
*/
dev_set_drvdata(&kbdev->devfreq->dev, kbdev);
err = devfreq_register_opp_notifier(kbdev->dev, kbdev->devfreq);
if (err) {
dev_err(kbdev->dev, "Failed to register OPP notifier (%d)", err);
goto opp_notifier_failed;
}
mali_mdevp.data = kbdev->devfreq;
mali_mdevp.opp_info = &kbdev->opp_info;
kbdev->mdev_info = rockchip_system_monitor_register(kbdev->dev, &mali_mdevp);
if (IS_ERR(kbdev->mdev_info)) {
dev_dbg(kbdev->dev, "without system monitor\n");
kbdev->mdev_info = NULL;
}
kbdev->opp_info.is_rate_volt_checked = true;
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
if (!dp->is_cooling_device) {
err = kbase_ipa_init(kbdev);
if (err) {
dev_err(kbdev->dev, "IPA initialization failed\n");
goto ipa_init_failed;
}
kbdev->devfreq_cooling = devfreq_cooling_em_register(
kbdev->devfreq,
&kbase_ipa_power_model_ops);
if (IS_ERR(kbdev->devfreq_cooling)) {
err = PTR_ERR(kbdev->devfreq_cooling);
dev_err(kbdev->dev, "Failed to register cooling device (%d)", err);
goto cooling_reg_failed;
}
}
#endif
return 0;
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
cooling_reg_failed:
kbase_ipa_term(kbdev);
ipa_init_failed:
devfreq_unregister_opp_notifier(kbdev->dev, kbdev->devfreq);
#endif /* CONFIG_DEVFREQ_THERMAL */
opp_notifier_failed:
kbase_devfreq_work_term(kbdev);
devfreq_work_init_failed:
if (devfreq_remove_device(kbdev->devfreq))
dev_err(kbdev->dev, "Failed to terminate devfreq (%d)", err);
kbdev->devfreq = NULL;
devfreq_add_dev_failed:
kbase_devfreq_term_core_mask_table(kbdev);
init_core_mask_table_failed:
if (free_devfreq_freq_table)
kbase_devfreq_term_freq_table(kbdev);
return err;
}
void kbase_devfreq_term(struct kbase_device *kbdev)
{
int err;
dev_dbg(kbdev->dev, "Term Mali devfreq\n");
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
if (kbdev->devfreq_cooling)
devfreq_cooling_unregister(kbdev->devfreq_cooling);
#endif
devfreq_unregister_opp_notifier(kbdev->dev, kbdev->devfreq);
kbase_devfreq_work_term(kbdev);
err = devfreq_remove_device(kbdev->devfreq);
if (err)
dev_err(kbdev->dev, "Failed to terminate devfreq (%d)\n", err);
else
kbdev->devfreq = NULL;
kbase_devfreq_term_core_mask_table(kbdev);
#if IS_ENABLED(CONFIG_DEVFREQ_THERMAL)
if (!kbdev->model_data)
kbase_ipa_term(kbdev);
kfree(kbdev->model_data);
#endif
}
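
The `opp-core-count` handling in `kbase_devfreq_init_core_mask_table()` turns a core count into a core mask by repeatedly taking the lowest core still present. A minimal userspace sketch of that idea follows. The function name is hypothetical, and the lowest set bit is isolated with `x & (~x + 1)` on a full 64-bit value instead of the `ffs()` call used in the driver.

```c
#include <stdint.h>

/*
 * Build a mask containing the core_count lowest cores set in shader_present.
 * Returns 0 if more cores are requested than are present. A request for
 * zero cores also yields 0, which the driver treats as an invalid mask.
 */
uint64_t core_mask_from_count(uint64_t shader_present, unsigned int core_count)
{
	uint64_t remaining = shader_present;
	uint64_t mask = 0;

	while (core_count > 0) {
		uint64_t lowest;

		if (!remaining)
			return 0; /* OPP asks for more cores than the GPU has */

		lowest = remaining & (~remaining + 1); /* isolate lowest set bit */
		mask |= lowest;
		remaining &= ~lowest;
		core_count--;
	}
	return mask;
}
```

For example, with `shader_present = 0x55` and a count of 2, the two lowest present cores are bits 0 and 2, giving a mask of `0x5`.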

@@ -0,0 +1,69 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _BASE_DEVFREQ_H_
#define _BASE_DEVFREQ_H_
/**
* kbase_devfreq_init - Initialize kbase device for DevFreq.
* @kbdev: Device pointer
*
* This function must be called only when a kbase device is initialized.
*
* Return: 0 on success.
*/
int kbase_devfreq_init(struct kbase_device *kbdev);
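/**
* kbase_devfreq_term - Terminate DevFreq for a kbase device.
* @kbdev: Device pointer
*
* Releases the resources set up by kbase_devfreq_init().
*/
void kbase_devfreq_term(struct kbase_device *kbdev);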
/**
* kbase_devfreq_force_freq - Set GPU frequency on L2 power on/off.
* @kbdev: Device pointer
* @freq: GPU frequency in Hz to be set when
* MALI_VALHALL_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE is enabled
*/
void kbase_devfreq_force_freq(struct kbase_device *kbdev, unsigned long freq);
/**
* kbase_devfreq_enqueue_work - Enqueue a work item for suspend/resume devfreq.
* @kbdev: Device pointer
* @work_type: The type of the devfreq work item, i.e. suspend or resume
*/
void kbase_devfreq_enqueue_work(struct kbase_device *kbdev, enum kbase_devfreq_work_type work_type);
/**
* kbase_devfreq_opp_translate - Translate nominal OPP frequency from devicetree
* into real frequency & voltage pair, along with
* core mask
* @kbdev: Device pointer
* @freq: Nominal frequency
* @core_mask: Pointer to u64 to store core mask to
* @freqs: Pointer to array of frequencies
* @volts: Pointer to array of voltages
*
* This function will only perform translation if an operating-points-v2-mali
* table is present in devicetree. If one is not present then it will return an
* untranslated frequency (and corresponding voltage) and all cores enabled.
* The voltages returned are in micro Volts (uV).
*/
void kbase_devfreq_opp_translate(struct kbase_device *kbdev, unsigned long freq, u64 *core_mask,
unsigned long *freqs, unsigned long *volts);
#endif /* _BASE_DEVFREQ_H_ */
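
The translation described for `kbase_devfreq_opp_translate()` is a table lookup with a fallback. The sketch below is a simplified userspace model: `opp_entry`, `MAX_CLOCKS` and the extra `all_cores` and `fallback_volt` parameters are hypothetical, introduced so the example does not need a device. In the driver these values come from the device structure and from an OPP voltage lookup.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CLOCKS 2 /* assumed upper bound on clocks per OPP */

/* Hypothetical table row: nominal frequency plus per-clock real values. */
struct opp_entry {
	unsigned long opp_freq;
	unsigned long real_freqs[MAX_CLOCKS];
	unsigned long volts[MAX_CLOCKS];
	uint64_t core_mask;
};

/*
 * Translate a nominal frequency into per-clock frequencies, voltages and a
 * core mask. If no row matches, fall back to the untranslated frequency,
 * a caller-supplied voltage and all cores enabled.
 */
void opp_translate(const struct opp_entry *table, size_t num_opps, size_t nr_clocks,
		   uint64_t all_cores, unsigned long fallback_volt, unsigned long freq,
		   uint64_t *core_mask, unsigned long *freqs, unsigned long *volts)
{
	size_t i, j;

	for (i = 0; i < num_opps; i++) {
		if (table[i].opp_freq != freq)
			continue;
		*core_mask = table[i].core_mask;
		for (j = 0; j < nr_clocks; j++) {
			freqs[j] = table[i].real_freqs[j];
			volts[j] = table[i].volts[j];
		}
		return;
	}

	/* No matching row: report the nominal frequency for every clock. */
	*core_mask = all_cores;
	for (j = 0; j < nr_clocks; j++) {
		freqs[j] = freq;
		volts[j] = fallback_volt;
	}
}
```

Keeping the fallback inside the translation function means callers always receive a usable frequency, voltage and core mask, whether or not a table is present.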

@@ -0,0 +1,148 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel property query backend APIs
*/
#include <mali_kbase.h>
#include <device/mali_kbase_device.h>
#include <mali_kbase_hwaccess_gpuprops.h>
#include <mali_kbase_gpuprops_private_types.h>
#include <mali_kbase_io.h>
int kbase_backend_gpuprops_get(struct kbase_device *kbdev, struct kbasep_gpuprops_regdump *regdump)
{
uint i;
/* regdump is zero-initialized, individual entries do not need to be explicitly set */
regdump->gpu_id = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(GPU_ID));
regdump->shader_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(SHADER_PRESENT));
regdump->tiler_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(TILER_PRESENT));
regdump->l2_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(L2_PRESENT));
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(AS_PRESENT)))
regdump->as_present = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(AS_PRESENT));
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(STACK_PRESENT)))
regdump->stack_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(STACK_PRESENT));
#if !MALI_USE_CSF
regdump->js_present = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(JS_PRESENT));
/* Not a valid register on TMIX */
/* TGOx specific register */
if (kbase_hw_has_feature(kbdev, KBASE_HW_FEATURE_THREAD_TLS_ALLOC))
regdump->thread_tls_alloc =
kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_TLS_ALLOC));
#endif /* !MALI_USE_CSF */
regdump->thread_max_threads = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_THREADS));
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_WORKGROUP_SIZE)))
regdump->thread_max_workgroup_size =
kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_WORKGROUP_SIZE));
#if MALI_USE_CSF
#endif /* MALI_USE_CSF */
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_BARRIER_SIZE)))
regdump->thread_max_barrier_size =
kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_MAX_BARRIER_SIZE));
regdump->thread_features = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(THREAD_FEATURES));
/* Feature Registers */
/* AMBA_FEATURES enum is mapped to COHERENCY_FEATURES enum */
regdump->coherency_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(COHERENCY_FEATURES));
if (kbase_hw_has_feature(kbdev, KBASE_HW_FEATURE_CORE_FEATURES))
regdump->core_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(CORE_FEATURES));
#if MALI_USE_CSF
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(GPU_FEATURES)))
regdump->gpu_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(GPU_FEATURES));
/* Only applicable to GPUs with power control domain registers */
if (kbase_hw_has_feature(kbdev, KBASE_HW_FEATURE_POWER_CONTROL)) {
regdump->base_present = kbase_reg_read64(kbdev, HOST_POWER_ENUM(BASE_PRESENT));
regdump->neural_present = kbase_reg_read64(kbdev, HOST_POWER_ENUM(NEURAL_PRESENT));
}
#endif /* MALI_USE_CSF */
regdump->tiler_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(TILER_FEATURES));
regdump->l2_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(L2_FEATURES));
regdump->mem_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(MEM_FEATURES));
regdump->mmu_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(MMU_FEATURES));
#if !MALI_USE_CSF
for (i = 0; i < GPU_MAX_JOB_SLOTS; i++)
regdump->js_features[i] = kbase_reg_read32(kbdev, GPU_JS_FEATURES_OFFSET(i));
#endif /* !MALI_USE_CSF */
#if MALI_USE_CSF
#endif /* MALI_USE_CSF */
{
for (i = 0; i < BASE_GPU_NUM_TEXTURE_FEATURES_REGISTERS; i++)
regdump->texture_features[i] =
kbase_reg_read32(kbdev, GPU_TEXTURE_FEATURES_OFFSET(i));
}
if (!kbase_io_has_gpu(kbdev))
return -EIO;
return 0;
}
int kbase_backend_gpuprops_get_curr_config(struct kbase_device *kbdev,
struct kbase_current_config_regdump *curr_config_regdump)
{
if (WARN_ON(!kbdev) || WARN_ON(!curr_config_regdump))
return -EINVAL;
curr_config_regdump->mem_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(MEM_FEATURES));
curr_config_regdump->l2_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(L2_FEATURES));
curr_config_regdump->shader_present =
kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(SHADER_PRESENT));
curr_config_regdump->l2_present = kbase_reg_read64(kbdev, GPU_CONTROL_ENUM(L2_PRESENT));
if (!kbase_io_has_gpu(kbdev))
return -EIO;
return 0;
}
int kbase_backend_gpuprops_get_l2_features(struct kbase_device *kbdev,
struct kbasep_gpuprops_regdump *regdump)
{
if (kbase_hw_has_feature(kbdev, KBASE_HW_FEATURE_L2_CONFIG)) {
regdump->l2_features = KBASE_REG_READ(kbdev, GPU_CONTROL_ENUM(L2_FEATURES));
regdump->l2_config = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(L2_CONFIG));
#if MALI_USE_CSF
if (kbase_hw_has_l2_slice_hash_feature(kbdev)) {
uint i;
for (i = 0; i < GPU_L2_SLICE_HASH_COUNT; i++)
regdump->l2_slice_hash[i] =
kbase_reg_read32(kbdev, GPU_L2_SLICE_HASH_OFFSET(i));
}
#endif /* MALI_USE_CSF */
if (!kbase_io_has_gpu(kbdev))
return -EIO;
}
return 0;
}


@@ -0,0 +1,479 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* GPU backend instrumentation APIs.
*/
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase_hwaccess_instr.h>
#include <device/mali_kbase_device.h>
#include <backend/gpu/mali_kbase_instr_internal.h>
#include <mali_kbase_io.h>
#define WAIT_FOR_DUMP_TIMEOUT_MS 5000
static int wait_prfcnt_ready(struct kbase_device *kbdev)
{
u32 val;
const u32 timeout_us =
kbase_get_timeout_ms(kbdev, KBASE_PRFCNT_ACTIVE_TIMEOUT) * USEC_PER_MSEC;
const int err = kbase_reg_poll32_timeout(kbdev, GPU_CONTROL_ENUM(GPU_STATUS), val,
!(val & GPU_STATUS_PRFCNT_ACTIVE), 0, timeout_us,
false);
if (err) {
dev_err(kbdev->dev, "PRFCNT_ACTIVE bit stuck\n");
return -EBUSY;
}
return 0;
}
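The bounded wait in wait_prfcnt_ready() can be sketched in plain userspace C as below. This is an illustration under stated assumptions, not the driver's implementation: `poll_until_clear`, `fake_status`, `fake_read` and `FAKE_PRFCNT_ACTIVE` are hypothetical names, the bit position is made up, and a retry budget stands in for the microsecond timeout passed to kbase_reg_poll32_timeout().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, chosen for the sketch only. */
#define FAKE_PRFCNT_ACTIVE (1u << 2)

/* Callback that returns the current value of a status register. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Fake status register: reports "busy" for a fixed number of reads. */
struct fake_status {
	unsigned int busy_reads_left;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_status *s = ctx;

	if (s->busy_reads_left > 0) {
		s->busy_reads_left--;
		return FAKE_PRFCNT_ACTIVE;
	}
	return 0;
}

/*
 * Poll until every bit in busy_mask reads as zero, giving up after
 * max_polls reads. Returns 0 on success and -EBUSY if the bits stay set,
 * mirroring the "PRFCNT_ACTIVE bit stuck" path above.
 */
static int poll_until_clear(read_reg_fn read_reg, void *ctx, uint32_t busy_mask,
			    unsigned int max_polls)
{
	assert(read_reg != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (!(read_reg(ctx) & busy_mask))
			return 0;
	}
	return -EBUSY;
}
```

A caller would treat a non-zero return the same way the functions in this file do: abort the register write that was about to follow and propagate the error.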
int kbase_instr_hwcnt_enable_internal(struct kbase_device *kbdev, struct kbase_context *kctx,
struct kbase_instr_hwcnt_enable *enable)
{
unsigned long flags;
int err = -EINVAL;
u32 irq_mask;
u32 prfcnt_config;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* alignment failure */
if ((enable->dump_buffer == 0ULL) || (enable->dump_buffer & (2048 - 1)))
return err;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_DISABLED) {
/* Instrumentation is already enabled */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
if (!kbase_io_has_gpu(kbdev)) {
/* GPU has been removed by Arbiter */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
/* Enable interrupt */
irq_mask = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK));
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK),
irq_mask | PRFCNT_SAMPLE_COMPLETED);
/* In use, this context is the owner */
kbdev->hwcnt.kctx = kctx;
/* Remember the dump address so we can reprogram it later */
kbdev->hwcnt.addr = enable->dump_buffer;
kbdev->hwcnt.addr_bytes = enable->dump_buffer_bytes;
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
/* Configure */
prfcnt_config = (u32)kctx->as_nr << PRFCNT_CONFIG_AS_SHIFT;
#ifdef CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS
prfcnt_config |= (u32)kbdev->hwcnt.backend.override_counter_set
<< PRFCNT_CONFIG_SETSELECT_SHIFT;
#else
prfcnt_config |= (u32)enable->counter_set << PRFCNT_CONFIG_SETSELECT_SHIFT;
#endif
/* Wait until prfcnt config register can be written */
err = wait_prfcnt_ready(kbdev);
if (err)
return err;
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_CONFIG),
prfcnt_config | PRFCNT_CONFIG_MODE_OFF);
/* Wait until prfcnt is disabled before writing configuration registers */
err = wait_prfcnt_ready(kbdev);
if (err)
return err;
kbase_reg_write64(kbdev, GPU_CONTROL_ENUM(PRFCNT_BASE), enable->dump_buffer);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_JM_EN), enable->fe_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_SHADER_EN), enable->shader_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_MMU_L2_EN), enable->mmu_l2_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_TILER_EN), enable->tiler_bm);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_CONFIG),
prfcnt_config | PRFCNT_CONFIG_MODE_MANUAL);
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
dev_dbg(kbdev->dev, "HW counters dumping set-up for context %pK", kctx);
return 0;
}
static void kbasep_instr_hwc_disable_hw_prfcnt(struct kbase_device *kbdev)
{
u32 irq_mask;
lockdep_assert_held(&kbdev->hwaccess_lock);
lockdep_assert_held(&kbdev->hwcnt.lock);
if (!kbase_io_has_gpu(kbdev))
/* GPU has been removed by Arbiter */
return;
/* Disable interrupt */
irq_mask = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK));
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_MASK),
irq_mask & ~PRFCNT_SAMPLE_COMPLETED);
/* Wait until prfcnt config register can be written, then disable the counters.
* Return value is ignored as we are disabling anyway.
*/
wait_prfcnt_ready(kbdev);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(PRFCNT_CONFIG), 0);
kbdev->hwcnt.kctx = NULL;
kbdev->hwcnt.addr = 0ULL;
kbdev->hwcnt.addr_bytes = 0ULL;
}
int kbase_instr_hwcnt_disable_internal(struct kbase_context *kctx)
{
unsigned long flags, pm_flags;
struct kbase_device *kbdev = kctx->kbdev;
const unsigned long timeout = msecs_to_jiffies(WAIT_FOR_DUMP_TIMEOUT_MS);
unsigned long remaining;
while (1) {
spin_lock_irqsave(&kbdev->hwaccess_lock, pm_flags);
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR) {
/* Instrumentation is in unrecoverable error state,
* there is nothing for us to do.
*/
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
/* Already disabled, return no error. */
return 0;
}
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_DISABLED) {
/* Instrumentation is not enabled */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
return -EINVAL;
}
if (kbdev->hwcnt.kctx != kctx) {
/* Instrumentation has been set up for another context */
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
return -EINVAL;
}
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_IDLE)
break;
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
/* Ongoing dump/setup - wait for its completion */
remaining = wait_event_timeout(kbdev->hwcnt.backend.wait,
kbdev->hwcnt.backend.triggered != 0, timeout);
if (remaining == 0)
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_UNRECOVERABLE_ERROR;
}
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;
kbdev->hwcnt.backend.triggered = 0;
kbasep_instr_hwc_disable_hw_prfcnt(kbdev);
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
dev_dbg(kbdev->dev, "HW counters dumping disabled for context %pK", kctx);
return 0;
}
int kbase_instr_hwcnt_request_dump(struct kbase_context *kctx)
{
unsigned long flags;
int err = -EINVAL;
struct kbase_device *kbdev = kctx->kbdev;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.kctx != kctx) {
/* The instrumentation has been set up for another context */
goto unlock;
}
if (kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_IDLE) {
/* HW counters are disabled or another dump is ongoing, or we're
* resetting, or we are in unrecoverable error state.
*/
goto unlock;
}
if (!kbase_io_has_gpu(kbdev)) {
/* GPU has been removed by Arbiter */
goto unlock;
}
kbdev->hwcnt.backend.triggered = 0;
/* Mark that we're dumping - the PF handler can signal that we faulted
*/
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DUMPING;
/* Wait until prfcnt is ready to request dump */
err = wait_prfcnt_ready(kbdev);
if (err)
goto unlock;
/* Reconfigure the dump address */
kbase_reg_write64(kbdev, GPU_CONTROL_ENUM(PRFCNT_BASE), kbdev->hwcnt.addr);
/* Start dumping */
KBASE_KTRACE_ADD(kbdev, CORE_GPU_PRFCNT_SAMPLE, NULL, kbdev->hwcnt.addr);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_COMMAND), GPU_COMMAND_PRFCNT_SAMPLE);
dev_dbg(kbdev->dev, "HW counters dumping done for context %pK", kctx);
unlock:
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_request_dump);
bool kbase_instr_hwcnt_dump_complete(struct kbase_context *kctx, bool *const success)
{
unsigned long flags;
bool complete = false;
struct kbase_device *kbdev = kctx->kbdev;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_IDLE) {
*success = true;
complete = true;
} else if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_FAULT) {
*success = false;
complete = true;
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
}
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return complete;
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_dump_complete);
void kbase_instr_hwcnt_sample_done(struct kbase_device *kbdev)
{
unsigned long flags;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* If the state is unrecoverable error, the waiter has already been woken
* up, so no action is needed when the sample is done.
*/
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_FAULT) {
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
} else if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_DUMPING) {
/* All finished and idle */
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
}
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
}
int kbase_instr_hwcnt_wait_for_dump(struct kbase_context *kctx)
{
struct kbase_device *kbdev = kctx->kbdev;
unsigned long flags;
int err;
unsigned long remaining;
const unsigned long timeout = msecs_to_jiffies(WAIT_FOR_DUMP_TIMEOUT_MS);
/* Wait for dump & cache clean to complete */
remaining = wait_event_timeout(kbdev->hwcnt.backend.wait,
kbdev->hwcnt.backend.triggered != 0, timeout);
if (remaining == 0) {
err = -ETIME;
/* Set the backend state so it's clear things have gone bad (could be a HW issue)
*/
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_UNRECOVERABLE_ERROR;
goto timed_out;
}
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_FAULT) {
err = -EINVAL;
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
} else if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR) {
err = -EIO;
} else {
/* Dump done */
KBASE_DEBUG_ASSERT(kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_IDLE);
err = 0;
}
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
timed_out:
return err;
}
int kbase_instr_hwcnt_clear(struct kbase_context *kctx)
{
unsigned long flags;
int err = -EINVAL;
struct kbase_device *kbdev = kctx->kbdev;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* Check it's the context previously set up and that we are in IDLE
* state; bail out otherwise.
*/
if (kbdev->hwcnt.kctx != kctx || kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_IDLE)
goto unlock;
if (!kbase_io_has_gpu(kbdev)) {
/* GPU has been removed by Arbiter */
goto unlock;
}
/* Wait until prfcnt is ready to clear */
err = wait_prfcnt_ready(kbdev);
if (err)
goto unlock;
/* Clear the counters */
KBASE_KTRACE_ADD(kbdev, CORE_GPU_PRFCNT_CLEAR, NULL, 0);
kbase_reg_write32(kbdev, GPU_CONTROL_ENUM(GPU_COMMAND), GPU_COMMAND_PRFCNT_CLEAR);
unlock:
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return err;
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_clear);
void kbase_instr_hwcnt_on_unrecoverable_error(struct kbase_device *kbdev)
{
unsigned long flags;
lockdep_assert_held(&kbdev->hwaccess_lock);
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* If we are already in unrecoverable error state, return early. */
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR) {
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
return;
}
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_UNRECOVERABLE_ERROR;
/* Need to disable HW if it's not disabled yet. */
if (kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_DISABLED)
kbasep_instr_hwc_disable_hw_prfcnt(kbdev);
/* Wake up any waiters. */
kbdev->hwcnt.backend.triggered = 1;
wake_up(&kbdev->hwcnt.backend.wait);
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_on_unrecoverable_error);
void kbase_instr_hwcnt_on_before_reset(struct kbase_device *kbdev)
{
unsigned long flags;
spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
/* A reset is the only way to exit the unrecoverable error state */
if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_UNRECOVERABLE_ERROR)
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;
spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
}
KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_on_before_reset);
int kbase_instr_backend_init(struct kbase_device *kbdev)
{
spin_lock_init(&kbdev->hwcnt.lock);
kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;
init_waitqueue_head(&kbdev->hwcnt.backend.wait);
#ifdef CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS
/* Use the build time option for the override default. */
#if defined(CONFIG_MALI_VALHALL_PRFCNT_SET_SECONDARY)
kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_PHYSICAL_SET_SECONDARY;
#elif defined(CONFIG_MALI_VALHALL_PRFCNT_SET_TERTIARY)
kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_PHYSICAL_SET_TERTIARY;
#else
/* Default to primary */
kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_PHYSICAL_SET_PRIMARY;
#endif
#endif
return 0;
}
void kbase_instr_backend_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
#ifdef CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS
void kbase_instr_backend_debugfs_init(struct kbase_device *kbdev)
{
/* No validation is done on the debugfs input. Invalid input could cause
* performance counter errors. This is acceptable since this is a debug
* only feature and users should know what they are doing.
*
* Valid inputs are the values accepted by the SET_SELECT bits of the
* PRFCNT_CONFIG register as defined in the architecture specification.
*/
debugfs_create_u8("hwcnt_set_select", 0644, kbdev->mali_debugfs_directory,
(u8 *)&kbdev->hwcnt.backend.override_counter_set);
}
#endif
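kbase_instr_hwcnt_enable_internal() above rejects a dump buffer address that is zero or not 2048-byte aligned. A minimal sketch of that mask test follows; `DUMP_BUFFER_ALIGN` and `dump_buffer_is_valid` are hypothetical names introduced for illustration, since the driver writes the literal 2048 inline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the alignment the driver spells as 2048. */
#define DUMP_BUFFER_ALIGN 2048ULL

/* True if gpu_va is non-zero and a multiple of DUMP_BUFFER_ALIGN. */
static bool dump_buffer_is_valid(uint64_t gpu_va)
{
	/* The mask test below is only correct for power-of-two alignments. */
	assert((DUMP_BUFFER_ALIGN & (DUMP_BUFFER_ALIGN - 1)) == 0);

	return gpu_va != 0 && (gpu_va & (DUMP_BUFFER_ALIGN - 1)) == 0;
}
```

Because the alignment is a power of two, `addr & (align - 1)` extracts the low bits that must all be zero, which avoids a division.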


@@ -0,0 +1,60 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014, 2016, 2018-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific instrumentation definitions
*/
#ifndef _KBASE_INSTR_DEFS_H_
#define _KBASE_INSTR_DEFS_H_
#include <hwcnt/mali_kbase_hwcnt_gpu.h>
/*
* Instrumentation State Machine States
*/
enum kbase_instr_state {
/* State where instrumentation is not active */
KBASE_INSTR_STATE_DISABLED = 0,
/* State machine is active and ready for a command. */
KBASE_INSTR_STATE_IDLE,
/* Hardware is currently dumping a frame. */
KBASE_INSTR_STATE_DUMPING,
/* An error has occurred during DUMPING (page fault). */
KBASE_INSTR_STATE_FAULT,
/* An unrecoverable error has occurred, a reset is the only way to exit
* from unrecoverable error state.
*/
KBASE_INSTR_STATE_UNRECOVERABLE_ERROR,
};
/* Structure used for instrumentation and HW counters dumping */
struct kbase_instr_backend {
wait_queue_head_t wait;
int triggered;
#ifdef CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS
enum kbase_hwcnt_physical_set override_counter_set;
#endif
enum kbase_instr_state state;
};
#endif /* _KBASE_INSTR_DEFS_H_ */
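The transitions between these states, as far as they can be read from the backend code in this commit, can be summarised with a small table-like sketch. Everything here is hypothetical naming (`sketch_state`, `sketch_event`, `sketch_step`), locking and wait queues are left out, and the DUMPING to FAULT edge is inferred from the enum comment because the page-fault handler is not part of this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_state { ST_DISABLED, ST_IDLE, ST_DUMPING, ST_FAULT, ST_UNRECOVERABLE };

enum sketch_event {
	EV_ENABLE,         /* counters enabled for a context */
	EV_REQUEST_DUMP,   /* a sample was requested */
	EV_SAMPLE_DONE,    /* sample-completed interrupt */
	EV_PAGE_FAULT,     /* fault while dumping (inferred edge) */
	EV_FAULT_REPORTED, /* fault returned to the caller */
	EV_DISABLE,        /* counters disabled by the owning context */
	EV_UNRECOVERABLE,  /* timeout or other unrecoverable error */
	EV_BEFORE_RESET,   /* GPU reset is about to happen */
};

/* Apply one event; returns true if the state changed. */
static bool sketch_step(enum sketch_state *state, enum sketch_event ev)
{
	enum sketch_state from, to;

	assert(state != NULL);

	switch (ev) {
	case EV_ENABLE:         from = ST_DISABLED;      to = ST_IDLE;     break;
	case EV_REQUEST_DUMP:   from = ST_IDLE;          to = ST_DUMPING;  break;
	case EV_SAMPLE_DONE:    from = ST_DUMPING;       to = ST_IDLE;     break;
	case EV_PAGE_FAULT:     from = ST_DUMPING;       to = ST_FAULT;    break;
	case EV_FAULT_REPORTED: from = ST_FAULT;         to = ST_IDLE;     break;
	case EV_DISABLE:        from = ST_IDLE;          to = ST_DISABLED; break;
	case EV_BEFORE_RESET:   from = ST_UNRECOVERABLE; to = ST_DISABLED; break;
	case EV_UNRECOVERABLE:
		/* Reachable from every state except itself. */
		if (*state == ST_UNRECOVERABLE)
			return false;
		*state = ST_UNRECOVERABLE;
		return true;
	default:
		return false;
	}

	if (*state != from)
		return false;
	*state = to;
	return true;
}
```

The sketch makes one property easy to see: once the unrecoverable state is entered, only the reset event leaves it.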


@@ -0,0 +1,41 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014, 2018, 2020-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific HW access instrumentation APIs
*/
#ifndef _KBASE_INSTR_INTERNAL_H_
#define _KBASE_INSTR_INTERNAL_H_
/**
* kbasep_cache_clean_worker() - Workqueue for handling cache cleaning
* @data: a &struct work_struct
*/
void kbasep_cache_clean_worker(struct work_struct *data);
/**
* kbase_instr_hwcnt_sample_done() - Dump complete interrupt received
* @kbdev: Kbase device
*/
void kbase_instr_hwcnt_sample_done(struct kbase_device *kbdev);
#endif /* _KBASE_INSTR_INTERNAL_H_ */


@@ -0,0 +1,110 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend specific IRQ APIs
*/
#ifndef _KBASE_IRQ_INTERNAL_H_
#define _KBASE_IRQ_INTERNAL_H_
/* GPU IRQ Tags */
#define JOB_IRQ_TAG 0
#define MMU_IRQ_TAG 1
#define GPU_IRQ_TAG 2
#define EDGE_IRQ_TAG 3
/**
* kbase_install_interrupts - Install IRQ handlers.
*
* @kbdev: The kbase device
*
* This function must be called once only when a kbase device is initialized.
*
* Return: 0 on success. Error code (negative) on failure.
*/
int kbase_install_interrupts(struct kbase_device *kbdev);
/**
* kbase_release_interrupts - Uninstall IRQ handlers.
*
* @kbdev: The kbase device
*
* This function needs to be called when a kbase device is terminated.
*/
void kbase_release_interrupts(struct kbase_device *kbdev);
/**
* kbase_synchronize_irqs - Ensure that all IRQ handlers have completed
* execution
* @kbdev: The kbase device
*/
void kbase_synchronize_irqs(struct kbase_device *kbdev);
#ifdef CONFIG_MALI_VALHALL_DEBUG
#if IS_ENABLED(CONFIG_MALI_VALHALL_REAL_HW)
/**
* kbase_validate_interrupts - Validate interrupts
*
* @kbdev: The kbase device
*
* This function will be called once when a kbase device is initialized
* to check whether interrupt handlers are configured appropriately.
* If interrupt numbers and/or flags defined in the device tree are
* incorrect, then the validation might fail.
* The whole device initialization will fail if it returns an error code.
*
* Return: 0 on success. Error code (negative) on failure.
*/
int kbase_validate_interrupts(struct kbase_device *const kbdev);
#endif /* IS_ENABLED(CONFIG_MALI_VALHALL_REAL_HW) */
#endif /* CONFIG_MALI_VALHALL_DEBUG */
/**
* kbase_get_interrupt_handler - Return default interrupt handler
* @kbdev: Kbase device
* @irq_tag: Tag to choose the handler
*
* If a single interrupt line is used, the combined interrupt handler
* is returned regardless of irq_tag. Otherwise the interrupt handler
* corresponding to irq_tag is returned.
*
* Return: Interrupt handler corresponding to the tag. NULL on failure.
*/
irq_handler_t kbase_get_interrupt_handler(struct kbase_device *kbdev, u32 irq_tag);
/**
* kbase_set_custom_irq_handler - Set a custom IRQ handler
*
* @kbdev: The kbase device for which the handler is to be registered
* @custom_handler: Handler to be registered
* @irq_tag: Interrupt tag
*
* Register the given interrupt handler for the requested interrupt tag.
* If no IRQ handler is specified, the default handler is registered.
*
* Return: 0 on success, error code otherwise
*/
int kbase_set_custom_irq_handler(struct kbase_device *kbdev, irq_handler_t custom_handler,
u32 irq_tag);
#endif /* _KBASE_IRQ_INTERNAL_H_ */
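The lookup rule documented for kbase_get_interrupt_handler() (combined handler when there is one interrupt line, per-tag handler otherwise, NULL for an unknown tag) can be sketched as follows. This is a standalone illustration, not the driver code: `sketch_handler_t`, `get_handler` and the toy handlers are hypothetical, and each toy handler only records that it ran.

```c
#include <assert.h>
#include <stddef.h>

enum { TAG_JOB = 0, TAG_MMU = 1, TAG_GPU = 2 };

/* Toy handler type: returns 1 if it handled something, 0 otherwise. */
typedef unsigned int (*sketch_handler_t)(void *data);

/* Each toy handler sets its own bit in the caller-provided mask. */
static unsigned int job_handler(void *data)
{
	*(unsigned int *)data |= 1u << TAG_JOB;
	return 1;
}

static unsigned int mmu_handler(void *data)
{
	*(unsigned int *)data |= 1u << TAG_MMU;
	return 1;
}

static unsigned int gpu_handler(void *data)
{
	*(unsigned int *)data |= 1u << TAG_GPU;
	return 1;
}

/* With a single line, every source has to be checked on each interrupt. */
static unsigned int combined_handler(void *data)
{
	unsigned int handled = 0;

	assert(data != NULL);
	handled |= job_handler(data);
	handled |= mmu_handler(data);
	handled |= gpu_handler(data);
	return handled;
}

static sketch_handler_t const handler_table[] = {
	[TAG_JOB] = job_handler,
	[TAG_MMU] = mmu_handler,
	[TAG_GPU] = gpu_handler,
};

static sketch_handler_t get_handler(unsigned int nr_irqs, unsigned int tag)
{
	if (nr_irqs == 1)
		return combined_handler;
	if (tag < sizeof(handler_table) / sizeof(handler_table[0]))
		return handler_table[tag];
	return NULL;
}
```

Bounds-checking the tag against the table size, rather than against a separate constant, keeps the lookup safe if a tag such as EDGE_IRQ_TAG has no table entry.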


@@ -0,0 +1,551 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <device/mali_kbase_device.h>
#include <backend/gpu/mali_kbase_irq_internal.h>
#include <mali_kbase_io.h>
#include <linux/interrupt.h>
#if IS_ENABLED(CONFIG_MALI_VALHALL_REAL_HW)
static void *kbase_tag(void *ptr, u32 tag)
{
return (void *)(((uintptr_t)ptr) | tag);
}
#endif
static void *kbase_untag(void *ptr)
{
return (void *)(((uintptr_t)ptr) & ~(uintptr_t)3);
}
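kbase_tag() and kbase_untag() rely on the low two bits of a sufficiently aligned pointer being zero, so a small tag can travel in the same void * that is handed to the IRQ core. A minimal userspace sketch of the same idea is shown below; `tag_ptr`, `ptr_tag`, `untag_ptr` and `TAG_MASK` are hypothetical names, and the assertions are additions for the sketch that the driver helpers do not perform.

```c
#include <assert.h>
#include <stdint.h>

/* Two low bits are available when the pointee is at least 4-byte aligned. */
#define TAG_MASK ((uintptr_t)3)

/* Store a 2-bit tag in the unused low bits of ptr. */
static void *tag_ptr(void *ptr, uint32_t tag)
{
	assert(((uintptr_t)ptr & TAG_MASK) == 0); /* pointer must be aligned */
	assert(tag <= TAG_MASK);                  /* tag must fit in two bits */
	return (void *)((uintptr_t)ptr | tag);
}

/* Recover the tag stored by tag_ptr(). */
static uint32_t ptr_tag(void *tagged)
{
	return (uint32_t)((uintptr_t)tagged & TAG_MASK);
}

/* Recover the original pointer by clearing the tag bits. */
static void *untag_ptr(void *tagged)
{
	return (void *)((uintptr_t)tagged & ~TAG_MASK);
}
```

The tagged value must never be dereferenced directly; it is only meaningful after untag_ptr() has cleared the low bits.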
static irqreturn_t kbase_job_irq_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbase_io_is_gpu_powered(kbdev)) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
val = kbase_reg_read32(kbdev, JOB_CONTROL_ENUM(JOB_IRQ_STATUS));
if (!val) {
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
#if MALI_USE_CSF
/* call the csf interrupt handler */
kbase_csf_interrupt(kbdev, val);
#else
kbase_job_done(kbdev, val);
#endif
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_HANDLED;
}
static irqreturn_t kbase_mmu_irq_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbase_io_is_gpu_powered(kbdev)) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
atomic_inc(&kbdev->faults_pending);
val = kbase_reg_read32(kbdev, MMU_CONTROL_ENUM(IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (!val) {
atomic_dec(&kbdev->faults_pending);
return IRQ_NONE;
}
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
kbase_mmu_interrupt(kbdev, val);
atomic_dec(&kbdev->faults_pending);
return IRQ_HANDLED;
}
#if MALI_USE_CSF
static irqreturn_t kbase_pwr_irq_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 pwr_irq_status = 0;
irqreturn_t irq_state = IRQ_NONE;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbase_io_is_gpu_powered(kbdev)) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
pwr_irq_status = kbase_reg_read32(kbdev, HOST_POWER_ENUM(PWR_IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (pwr_irq_status) {
dev_dbg(kbdev->dev, "%s: pwr irq %d irqstatus 0x%x\n", __func__, irq,
pwr_irq_status);
kbase_pwr_interrupt(kbdev, pwr_irq_status);
irq_state = IRQ_HANDLED;
}
return irq_state;
}
#endif /* MALI_USE_CSF */
static irqreturn_t kbase_gpuonly_irq_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 gpu_irq_status;
irqreturn_t irq_state = IRQ_NONE;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbase_io_is_gpu_powered(kbdev)) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
gpu_irq_status = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(GPU_IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (gpu_irq_status) {
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, gpu_irq_status);
kbase_gpu_interrupt(kbdev, gpu_irq_status);
irq_state = IRQ_HANDLED;
}
return irq_state;
}
/**
* kbase_gpu_irq_handler - GPU interrupt handler
* @irq: IRQ number
* @data: Data associated with this IRQ (i.e. kbdev)
*
* Return: IRQ_HANDLED if any interrupt request has been successfully handled.
* IRQ_NONE otherwise.
*/
static irqreturn_t kbase_gpu_irq_handler(int irq, void *data)
{
irqreturn_t irq_state = kbase_gpuonly_irq_handler(irq, data);
#if MALI_USE_CSF
struct kbase_device *kbdev = kbase_untag(data);
/* Skip if HOST_POWER page is not available */
if (kbdev->pm.backend.has_host_pwr_iface) {
if (kbase_pwr_irq_handler(irq, data) == IRQ_HANDLED)
irq_state = IRQ_HANDLED;
}
#endif /* MALI_USE_CSF */
return irq_state;
}
/**
* kbase_combined_irq_handler - Combined interrupt handler for all interrupts
* @irq: IRQ number
* @data: Data associated with this IRQ (i.e. kbdev)
*
* This handler is used for GPUs with a single interrupt line.
*
* Return: IRQ_HANDLED if any interrupt request has been successfully handled.
* IRQ_NONE otherwise.
*/
static irqreturn_t kbase_combined_irq_handler(int irq, void *data)
{
irqreturn_t irq_state = IRQ_NONE;
irq_state |= kbase_job_irq_handler(irq, data);
irq_state |= kbase_mmu_irq_handler(irq, data);
irq_state |= kbase_gpu_irq_handler(irq, data);
return irq_state;
}
static irq_handler_t kbase_handler_table[] = {
[JOB_IRQ_TAG] = kbase_job_irq_handler,
[MMU_IRQ_TAG] = kbase_mmu_irq_handler,
[GPU_IRQ_TAG] = kbase_gpu_irq_handler,
};
irq_handler_t kbase_get_interrupt_handler(struct kbase_device *kbdev, u32 irq_tag)
{
if (kbdev->nr_irqs == 1)
return kbase_combined_irq_handler;
else if (irq_tag < ARRAY_SIZE(kbase_handler_table))
return kbase_handler_table[irq_tag];
else
return NULL;
}
#if IS_ENABLED(CONFIG_MALI_VALHALL_REAL_HW)
#ifdef CONFIG_MALI_VALHALL_DEBUG
int kbase_set_custom_irq_handler(struct kbase_device *kbdev, irq_handler_t custom_handler,
u32 irq_tag)
{
int result = 0;
irq_handler_t handler = custom_handler;
const int irq = (kbdev->nr_irqs == 1) ? 0 : irq_tag;
if (unlikely(!((irq_tag >= JOB_IRQ_TAG) && (irq_tag <= GPU_IRQ_TAG)))) {
dev_err(kbdev->dev, "Invalid irq_tag (%d)\n", irq_tag);
return -EINVAL;
}
/* Release previous handler */
if (kbdev->irqs[irq].irq)
free_irq(kbdev->irqs[irq].irq, kbase_tag(kbdev, irq));
/* If a custom handler isn't provided use the default handler */
if (!handler)
handler = kbase_get_interrupt_handler(kbdev, irq_tag);
if (request_irq(kbdev->irqs[irq].irq, handler, kbdev->irqs[irq].flags | IRQF_SHARED,
dev_name(kbdev->dev), kbase_tag(kbdev, irq)) != 0) {
result = -EINVAL;
dev_err(kbdev->dev, "Can't request interrupt %u (index %u)\n", kbdev->irqs[irq].irq,
irq_tag);
if (IS_ENABLED(CONFIG_SPARSE_IRQ))
dev_err(kbdev->dev,
"CONFIG_SPARSE_IRQ enabled - is the interrupt number correct for this config?\n");
}
return result;
}
KBASE_EXPORT_TEST_API(kbase_set_custom_irq_handler);
/* Test correct interrupt assignment and reception by the CPU */
struct kbasep_irq_test {
struct hrtimer timer;
wait_queue_head_t wait;
int triggered;
u32 timeout;
};
static struct kbasep_irq_test kbasep_irq_test_data;
#define IRQ_TEST_TIMEOUT 500
static irqreturn_t kbase_job_irq_test_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbase_io_is_gpu_powered(kbdev)) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
val = kbase_reg_read32(kbdev, JOB_CONTROL_ENUM(JOB_IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (!val)
return IRQ_NONE;
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
kbasep_irq_test_data.triggered = 1;
wake_up(&kbasep_irq_test_data.wait);
kbase_reg_write32(kbdev, JOB_CONTROL_ENUM(JOB_IRQ_CLEAR), val);
return IRQ_HANDLED;
}
static irqreturn_t kbase_mmu_irq_test_handler(int irq, void *data)
{
unsigned long flags;
struct kbase_device *kbdev = kbase_untag(data);
u32 val;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbase_io_is_gpu_powered(kbdev)) {
/* GPU is turned off - IRQ is not for us */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return IRQ_NONE;
}
val = kbase_reg_read32(kbdev, MMU_CONTROL_ENUM(IRQ_STATUS));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (!val)
return IRQ_NONE;
dev_dbg(kbdev->dev, "%s: irq %d irqstatus 0x%x\n", __func__, irq, val);
kbasep_irq_test_data.triggered = 1;
wake_up(&kbasep_irq_test_data.wait);
kbase_reg_write32(kbdev, MMU_CONTROL_ENUM(IRQ_CLEAR), val);
return IRQ_HANDLED;
}
static enum hrtimer_restart kbasep_test_interrupt_timeout(struct hrtimer *timer)
{
struct kbasep_irq_test *test_data = container_of(timer, struct kbasep_irq_test, timer);
test_data->timeout = 1;
test_data->triggered = 1;
wake_up(&test_data->wait);
return HRTIMER_NORESTART;
}
/**
* validate_interrupt - Validate an interrupt
* @kbdev: Kbase device
* @tag: Tag to choose the interrupt
*
* To validate the settings for the interrupt, write a value to the RAWSTAT
* register to trigger the interrupt, then use a custom interrupt handler to
* check whether the interrupt arrives within a reasonable time.
*
* Return: 0 if validating the interrupt succeeds, -EINVAL otherwise.
*/
static int validate_interrupt(struct kbase_device *const kbdev, u32 tag)
{
int err = 0;
irq_handler_t handler;
const int irq = (kbdev->nr_irqs == 1) ? 0 : tag;
u32 old_mask_val;
u16 mask_offset;
u16 rawstat_offset;
switch (tag) {
case JOB_IRQ_TAG:
handler = kbase_job_irq_test_handler;
rawstat_offset = JOB_CONTROL_ENUM(JOB_IRQ_RAWSTAT);
mask_offset = JOB_CONTROL_ENUM(JOB_IRQ_MASK);
break;
case MMU_IRQ_TAG:
handler = kbase_mmu_irq_test_handler;
rawstat_offset = MMU_CONTROL_ENUM(IRQ_RAWSTAT);
mask_offset = MMU_CONTROL_ENUM(IRQ_MASK);
break;
case GPU_IRQ_TAG:
/* already tested by pm_driver - bail out */
return 0;
default:
dev_err(kbdev->dev, "Invalid tag (%u)\n", tag);
return -EINVAL;
}
/* store old mask */
old_mask_val = kbase_reg_read32(kbdev, mask_offset);
/* mask interrupts */
kbase_reg_write32(kbdev, mask_offset, 0x0);
if (kbdev->irqs[irq].irq) {
/* release original handler and install test handler */
if (kbase_set_custom_irq_handler(kbdev, handler, tag) != 0) {
err = -EINVAL;
} else {
kbasep_irq_test_data.timeout = 0;
hrtimer_init(&kbasep_irq_test_data.timer, CLOCK_MONOTONIC,
HRTIMER_MODE_REL);
kbasep_irq_test_data.timer.function = kbasep_test_interrupt_timeout;
/* trigger interrupt */
kbase_reg_write32(kbdev, mask_offset, 0x1);
kbase_reg_write32(kbdev, rawstat_offset, 0x1);
hrtimer_start(&kbasep_irq_test_data.timer,
HR_TIMER_DELAY_MSEC(IRQ_TEST_TIMEOUT), HRTIMER_MODE_REL);
wait_event(kbasep_irq_test_data.wait, kbasep_irq_test_data.triggered != 0);
if (kbasep_irq_test_data.timeout != 0) {
dev_err(kbdev->dev, "Interrupt %u (index %u) didn't reach CPU.\n",
kbdev->irqs[irq].irq, irq);
err = -EINVAL;
} else {
dev_dbg(kbdev->dev, "Interrupt %u (index %u) reached CPU.\n",
kbdev->irqs[irq].irq, irq);
}
hrtimer_cancel(&kbasep_irq_test_data.timer);
kbasep_irq_test_data.triggered = 0;
/* mask interrupts */
kbase_reg_write32(kbdev, mask_offset, 0x0);
/* release test handler */
free_irq(kbdev->irqs[irq].irq, kbase_tag(kbdev, irq));
}
/* restore original interrupt */
if (request_irq(kbdev->irqs[irq].irq, kbase_get_interrupt_handler(kbdev, tag),
kbdev->irqs[irq].flags | IRQF_SHARED, dev_name(kbdev->dev),
kbase_tag(kbdev, irq))) {
dev_err(kbdev->dev, "Can't restore original interrupt %u (index %u)\n",
kbdev->irqs[irq].irq, tag);
err = -EINVAL;
}
}
/* restore old mask */
kbase_reg_write32(kbdev, mask_offset, old_mask_val);
return err;
}
#if IS_ENABLED(CONFIG_MALI_VALHALL_REAL_HW)
int kbase_validate_interrupts(struct kbase_device *const kbdev)
{
int err;
init_waitqueue_head(&kbasep_irq_test_data.wait);
kbasep_irq_test_data.triggered = 0;
/* A suspend won't happen during startup/insmod */
kbase_pm_context_active(kbdev);
err = validate_interrupt(kbdev, JOB_IRQ_TAG);
if (err) {
dev_err(kbdev->dev,
"Interrupt JOB_IRQ didn't reach CPU. Check interrupt assignments.\n");
goto out;
}
err = validate_interrupt(kbdev, MMU_IRQ_TAG);
if (err) {
dev_err(kbdev->dev,
"Interrupt MMU_IRQ didn't reach CPU. Check interrupt assignments.\n");
goto out;
}
dev_dbg(kbdev->dev, "Interrupts are correctly assigned.\n");
out:
kbase_pm_context_idle(kbdev);
return err;
}
#endif /* CONFIG_MALI_VALHALL_REAL_HW */
#endif /* CONFIG_MALI_VALHALL_DEBUG */
int kbase_install_interrupts(struct kbase_device *kbdev)
{
u32 irq_index;
#if MALI_USE_CSF
if (kbdev->gpu_props.gpu_id.arch_id >= GPU_ID_ARCH_MAKE(14, 8, 0)) {
if (kbdev->nr_irqs != 1) {
dev_err(kbdev->dev, "Incorrect number of irq entries (%u)", kbdev->nr_irqs);
return -EINVAL;
}
} else {
if (kbdev->nr_irqs != 3) {
dev_err(kbdev->dev, "Incorrect number of irq entries (%u)", kbdev->nr_irqs);
return -EINVAL;
}
}
#endif /* MALI_USE_CSF */
for (irq_index = 0; irq_index < kbdev->nr_irqs; irq_index++) {
const int result = request_irq(kbdev->irqs[irq_index].irq,
kbase_get_interrupt_handler(kbdev, irq_index),
kbdev->irqs[irq_index].flags | IRQF_SHARED,
dev_name(kbdev->dev), kbase_tag(kbdev, irq_index));
if (result) {
dev_err(kbdev->dev, "Can't request interrupt %u (index %u)\n",
kbdev->irqs[irq_index].irq, irq_index);
goto irq_release;
}
}
return 0;
irq_release:
if (IS_ENABLED(CONFIG_SPARSE_IRQ))
dev_err(kbdev->dev,
"CONFIG_SPARSE_IRQ enabled - is the interrupt number correct for this config?\n");
while (irq_index-- > 0)
free_irq(kbdev->irqs[irq_index].irq, kbase_tag(kbdev, irq_index));
return -EINVAL;
}
void kbase_release_interrupts(struct kbase_device *kbdev)
{
u32 i;
for (i = 0; i < kbdev->nr_irqs; i++) {
if (kbdev->irqs[i].irq)
free_irq(kbdev->irqs[i].irq, kbase_tag(kbdev, i));
}
}
void kbase_synchronize_irqs(struct kbase_device *kbdev)
{
u32 i;
for (i = 0; i < kbdev->nr_irqs; i++) {
if (kbdev->irqs[i].irq)
synchronize_irq(kbdev->irqs[i].irq);
}
}
KBASE_EXPORT_TEST_API(kbase_synchronize_irqs);
#endif /* IS_ENABLED(CONFIG_MALI_VALHALL_REAL_HW) */
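
The handler selection in kbase_get_interrupt_handler() and the result merging in kbase_combined_irq_handler() can be tried outside the kernel. The following is a minimal userspace sketch, not the driver's code: the types `irq_ret`, `irq_tag` and `handler_fn`, the `pending` bitmask and the per-source bit assignments are all hypothetical stand-ins for the kernel's irqreturn_t, the IRQ tags and the register reads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum irq_ret { IRQ_RET_NONE = 0, IRQ_RET_HANDLED = 1 };

/* Hypothetical tags; the order mirrors JOB/MMU/GPU_IRQ_TAG in the listing. */
enum irq_tag { TAG_JOB = 0, TAG_MMU = 1, TAG_GPU = 2, TAG_COUNT };

typedef enum irq_ret (*handler_fn)(unsigned int pending);

/* Each handler claims the interrupt only if its own (assumed) bit is set. */
static enum irq_ret job_handler(unsigned int pending)
{
    return (pending & 0x1u) ? IRQ_RET_HANDLED : IRQ_RET_NONE;
}

static enum irq_ret mmu_handler(unsigned int pending)
{
    return (pending & 0x2u) ? IRQ_RET_HANDLED : IRQ_RET_NONE;
}

static enum irq_ret gpu_handler(unsigned int pending)
{
    return (pending & 0x4u) ? IRQ_RET_HANDLED : IRQ_RET_NONE;
}

/* Single-line case: run every source handler and merge the results. */
enum irq_ret combined_handler(unsigned int pending)
{
    unsigned int state = IRQ_RET_NONE;

    state |= (unsigned int)job_handler(pending);
    state |= (unsigned int)mmu_handler(pending);
    state |= (unsigned int)gpu_handler(pending);
    return state ? IRQ_RET_HANDLED : IRQ_RET_NONE;
}

/* Designated initializers keep the table order independent of tag values. */
static const handler_fn handler_table[TAG_COUNT] = {
    [TAG_JOB] = job_handler,
    [TAG_MMU] = mmu_handler,
    [TAG_GPU] = gpu_handler,
};

/* One line: combined handler. Otherwise: bounds-checked table lookup. */
handler_fn select_handler(unsigned int nr_irqs, unsigned int tag)
{
    if (nr_irqs == 1)
        return combined_handler;
    if (tag < TAG_COUNT)
        return handler_table[tag];
    return NULL;
}

/* Convenience wrapper for callers that already validated the tag. */
enum irq_ret dispatch(unsigned int nr_irqs, unsigned int tag, unsigned int pending)
{
    handler_fn fn = select_handler(nr_irqs, tag);

    assert(fn != NULL);
    return fn(pending);
}
```

With one interrupt line every tag resolves to the combined handler, so a pending bit from any source is reported as handled. With three lines an out-of-range tag yields NULL instead of an out-of-bounds table read.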

@@ -0,0 +1,237 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register backend context / address space management
*/
#include <mali_kbase.h>
#include <mali_kbase_hwaccess_jm.h>
#include <mali_kbase_ctx_sched.h>
/**
* assign_and_activate_kctx_addr_space - Assign an AS to a context
* @kbdev: Kbase device
* @kctx: Kbase context
* @current_as: Address Space to assign
*
* Assign an Address Space (AS) to a context, and add the context to the Policy.
*
* This includes:
* - setting up the global runpool_irq structure and the context on the AS,
* - activating the MMU on the AS,
* - allowing jobs to be submitted on the AS.
*
* Context:
* kbasep_js_kctx_info.jsctx_mutex held,
* kbasep_js_device_data.runpool_mutex held,
* AS transaction mutex held,
* Runpool IRQ lock held
*/
static void assign_and_activate_kctx_addr_space(struct kbase_device *kbdev,
struct kbase_context *kctx,
struct kbase_as *current_as)
{
struct kbasep_js_device_data *js_devdata = &kbdev->js_data;
CSTD_UNUSED(current_as);
lockdep_assert_held(&kctx->jctx.sched_info.ctx.jsctx_mutex);
lockdep_assert_held(&js_devdata->runpool_mutex);
lockdep_assert_held(&kbdev->hwaccess_lock);
#if !MALI_USE_CSF
/* Attribute handling */
kbasep_js_ctx_attr_runpool_retain_ctx(kbdev, kctx);
#endif
/* Allow it to run jobs */
kbasep_js_set_submit_allowed(js_devdata, kctx);
kbase_js_runpool_inc_context_count(kbdev, kctx);
}
bool kbase_backend_use_ctx_sched(struct kbase_device *kbdev, struct kbase_context *kctx,
unsigned int js)
{
int i;
if (kbdev->hwaccess.active_kctx[js] == kctx) {
/* Context is already active */
return true;
}
for (i = 0; i < kbdev->nr_hw_address_spaces; i++) {
if (kbdev->as_to_kctx[i] == kctx) {
/* Context already has ASID - mark as active */
return true;
}
}
/* Context does not have address space assigned */
return false;
}
void kbase_backend_release_ctx_irq(struct kbase_device *kbdev, struct kbase_context *kctx)
{
int as_nr = kctx->as_nr;
if (as_nr == KBASEP_AS_NR_INVALID) {
WARN(1, "Attempting to release context without ASID\n");
return;
}
lockdep_assert_held(&kbdev->hwaccess_lock);
if (atomic_read(&kctx->refcount) != 1) {
WARN(1, "Attempting to release active ASID\n");
return;
}
kbasep_js_clear_submit_allowed(&kbdev->js_data, kctx);
kbase_ctx_sched_release_ctx(kctx);
kbase_js_runpool_dec_context_count(kbdev, kctx);
}
void kbase_backend_release_ctx_noirq(struct kbase_device *kbdev, struct kbase_context *kctx)
{
CSTD_UNUSED(kbdev);
CSTD_UNUSED(kctx);
}
int kbase_backend_find_and_release_free_address_space(struct kbase_device *kbdev,
struct kbase_context *kctx)
{
struct kbasep_js_device_data *js_devdata;
struct kbasep_js_kctx_info *js_kctx_info;
unsigned long flags;
int i;
js_devdata = &kbdev->js_data;
js_kctx_info = &kctx->jctx.sched_info;
mutex_lock(&js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_devdata->runpool_mutex);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
for (i = 0; i < kbdev->nr_hw_address_spaces; i++) {
struct kbasep_js_kctx_info *as_js_kctx_info;
struct kbase_context *as_kctx;
as_kctx = kbdev->as_to_kctx[i];
as_js_kctx_info = &as_kctx->jctx.sched_info;
/* Don't release privileged or active contexts, or contexts with
* jobs running.
* Note that a context will have at least 1 reference (which
* was previously taken by kbasep_js_schedule_ctx()) until
* descheduled.
*/
if (as_kctx && !kbase_ctx_flag(as_kctx, KCTX_PRIVILEGED) &&
atomic_read(&as_kctx->refcount) == 1) {
if (!kbase_ctx_sched_inc_refcount_nolock(as_kctx)) {
WARN(1, "Failed to retain active context\n");
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
return KBASEP_AS_NR_INVALID;
}
kbasep_js_clear_submit_allowed(js_devdata, as_kctx);
/* Drop and retake locks to take the jsctx_mutex on the
* context we're about to release without violating lock
* ordering
*/
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
/* Release context from address space */
mutex_lock(&as_js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_devdata->runpool_mutex);
kbasep_js_runpool_release_ctx_nolock(kbdev, as_kctx);
if (!kbase_ctx_flag(as_kctx, KCTX_SCHEDULED)) {
kbasep_js_runpool_requeue_or_kill_ctx(kbdev, as_kctx, true);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&as_js_kctx_info->ctx.jsctx_mutex);
return i;
}
/* Context was retained while locks were dropped,
* continue looking for free AS
*/
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&as_js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_kctx_info->ctx.jsctx_mutex);
mutex_lock(&js_devdata->runpool_mutex);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
}
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
mutex_unlock(&js_devdata->runpool_mutex);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
return KBASEP_AS_NR_INVALID;
}
bool kbase_backend_use_ctx(struct kbase_device *kbdev, struct kbase_context *kctx, int as_nr)
{
struct kbasep_js_device_data *js_devdata;
struct kbase_as *new_address_space = NULL;
int js;
js_devdata = &kbdev->js_data;
for (js = 0; js < BASE_JM_MAX_NR_SLOTS; js++) {
if (kbdev->hwaccess.active_kctx[js] == kctx) {
WARN(1, "Context is already scheduled in\n");
return false;
}
}
new_address_space = &kbdev->as[as_nr];
lockdep_assert_held(&js_devdata->runpool_mutex);
lockdep_assert_held(&kbdev->mmu_hw_mutex);
lockdep_assert_held(&kbdev->hwaccess_lock);
assign_and_activate_kctx_addr_space(kbdev, kctx, new_address_space);
if (kbase_ctx_flag(kctx, KCTX_PRIVILEGED)) {
/* We need to retain it to keep the corresponding address space
*/
kbase_ctx_sched_retain_ctx_refcount(kctx);
}
return true;
}
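
The candidate test used by kbase_backend_find_and_release_free_address_space() (occupied slot, not privileged, reference count of exactly 1) can be isolated as below. This is a simplified sketch under stated assumptions: `struct ctx`, `AS_NR_INVALID` and `find_evictable_as()` are hypothetical names, and the locking, the extra reference taken on the candidate and the re-check after the locks are dropped and retaken are all omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for KBASEP_AS_NR_INVALID. */
#define AS_NR_INVALID (-1)

/* Hypothetical, reduced view of a context: only the fields the scan reads. */
struct ctx {
    bool privileged; /* stands in for the KCTX_PRIVILEGED flag */
    int refcount;    /* 1 means only the scheduler's own reference is held */
};

/*
 * Return the index of the first address space whose context may be
 * evicted, or AS_NR_INVALID if none qualifies. Empty slots (NULL) are
 * skipped, as are privileged contexts and contexts with extra references.
 */
int find_evictable_as(struct ctx *const *as_to_ctx, int nr_as)
{
    int i;

    assert(as_to_ctx != NULL);

    for (i = 0; i < nr_as; i++) {
        const struct ctx *c = as_to_ctx[i];

        if (c && !c->privileged && c->refcount == 1)
            return i;
    }
    return AS_NR_INVALID;
}
```

In the driver the scan continues with the next slot if the chosen context was retained again while the locks were dropped; the sketch only covers the first pass.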

File diff suppressed because it is too large

@@ -0,0 +1,140 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Job Manager backend-specific low-level APIs.
*/
#ifndef _KBASE_JM_HWACCESS_H_
#define _KBASE_JM_HWACCESS_H_
#include <mali_kbase_hw.h>
#include <mali_kbase_debug.h>
#include <linux/atomic.h>
#include <backend/gpu/mali_kbase_jm_rb.h>
#include <device/mali_kbase_device.h>
/**
* kbase_job_done_slot() - Complete the head job on a particular job-slot
* @kbdev: Device pointer
* @s: Job slot
* @completion_code: Completion code of job reported by GPU
* @job_tail: Job tail address reported by GPU
* @end_timestamp: Timestamp of job completion
*/
void kbase_job_done_slot(struct kbase_device *kbdev, int s, u32 completion_code, u64 job_tail,
ktime_t *end_timestamp);
/**
* kbase_job_hw_submit() - Submit a job to the GPU
* @kbdev: Device pointer
* @katom: Atom to submit
* @js: Job slot to submit on
*
* The caller must check kbasep_jm_is_submit_slots_free() != false before
* calling this.
*
* The caller must satisfy the following locking conditions:
* - it must hold the hwaccess_lock
*
* Return: 0 if the job was successfully submitted to hardware, an error otherwise.
*/
int kbase_job_hw_submit(struct kbase_device *kbdev, struct kbase_jd_atom *katom, unsigned int js);
#if !MALI_USE_CSF
/**
* kbasep_job_slot_soft_or_hard_stop_do_action() - Perform a soft or hard stop
* on the specified atom
* @kbdev: Device pointer
* @js: Job slot to stop on
* @action: The action to perform, either JS_COMMAND_HARD_STOP or
* JS_COMMAND_SOFT_STOP
* @core_reqs: Core requirements of atom to stop
* @target_katom: Atom to stop
*
* The caller must satisfy the following locking conditions:
* - it must hold the hwaccess_lock
*/
void kbasep_job_slot_soft_or_hard_stop_do_action(struct kbase_device *kbdev, unsigned int js,
u32 action, base_jd_core_req core_reqs,
struct kbase_jd_atom *target_katom);
#endif /* !MALI_USE_CSF */
/**
* kbase_backend_soft_hard_stop_slot() - Soft or hard stop jobs on a given job
* slot belonging to a given context.
* @kbdev: Device pointer
* @kctx: Context pointer. May be NULL
* @katom: Specific atom to stop. May be NULL
* @js: Job slot to soft or hard stop
* @action: The action to perform, either JS_COMMAND_HARD_STOP or
* JS_COMMAND_SOFT_STOP
*
* If no context is provided then all jobs on the slot will be soft or hard
* stopped.
*
* If a katom is provided then only that specific atom will be stopped. In this
* case the kctx parameter is ignored.
*
* Jobs that are on the slot but are not yet on the GPU will be unpulled and
* returned to the job scheduler.
*
* Return: true if an atom was stopped, false otherwise
*/
bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev, struct kbase_context *kctx,
unsigned int js, struct kbase_jd_atom *katom, u32 action);
/**
* kbase_job_slot_init - Initialise job slot framework
* @kbdev: Device pointer
*
* Called on driver initialisation
*
* Return: 0 on success
*/
int kbase_job_slot_init(struct kbase_device *kbdev);
/**
* kbase_job_slot_halt - Halt the job slot framework
* @kbdev: Device pointer
*
* Should prevent any further job slot processing
*/
void kbase_job_slot_halt(struct kbase_device *kbdev);
/**
* kbase_job_slot_term - Terminate job slot framework
* @kbdev: Device pointer
*
* Called on driver termination
*/
void kbase_job_slot_term(struct kbase_device *kbdev);
/**
* kbase_gpu_cache_clean - Cause a GPU cache clean & flush
* @kbdev: Device pointer
*
* Caller must not be in IRQ context
*/
void kbase_gpu_cache_clean(struct kbase_device *kbdev);
#endif /* _KBASE_JM_HWACCESS_H_ */

File diff suppressed because it is too large

@@ -0,0 +1,77 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2018, 2020-2022 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific APIs
*/
#ifndef _KBASE_HWACCESS_GPU_H_
#define _KBASE_HWACCESS_GPU_H_
#include <backend/gpu/mali_kbase_pm_internal.h>
/**
* kbase_gpu_irq_evict - Evict an atom from a NEXT slot
*
* @kbdev: Device pointer
* @js: Job slot to evict from
* @completion_code: Event code from job that was run.
*
* Evict the atom in the NEXT slot for the specified job slot. This function is
* called from the job complete IRQ handler when the previous job has failed.
*
* Return: true if job evicted from NEXT registers, false otherwise
*/
bool kbase_gpu_irq_evict(struct kbase_device *kbdev, unsigned int js, u32 completion_code);
/**
* kbase_gpu_complete_hw - Complete an atom on job slot js
*
* @kbdev: Device pointer
* @js: Job slot that has completed
* @completion_code: Event code from job that has completed
* @job_tail: The tail address from the hardware if the job has partially
* completed
* @end_timestamp: Time of completion
*/
void kbase_gpu_complete_hw(struct kbase_device *kbdev, unsigned int js, u32 completion_code,
u64 job_tail, ktime_t *end_timestamp);
/**
* kbase_gpu_inspect - Inspect the contents of the HW access ringbuffer
*
* @kbdev: Device pointer
* @js: Job slot to inspect
* @idx: Index into ringbuffer. 0 is the job currently running on
* the slot, 1 is the job waiting, all other values are invalid.
*
* Return: The atom at that position in the ringbuffer
* or NULL if no atom present
*/
struct kbase_jd_atom *kbase_gpu_inspect(struct kbase_device *kbdev, unsigned int js, int idx);
/**
* kbase_gpu_dump_slots - Print the contents of the slot ringbuffers
*
* @kbdev: Device pointer
*/
void kbase_gpu_dump_slots(struct kbase_device *kbdev);
#endif /* _KBASE_HWACCESS_GPU_H_ */

@@ -0,0 +1,371 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific job scheduler APIs
*/
#include <mali_kbase.h>
#include <mali_kbase_hwaccess_jm.h>
#include <mali_kbase_reset_gpu.h>
#include <backend/gpu/mali_kbase_jm_internal.h>
#include <backend/gpu/mali_kbase_js_internal.h>
#if IS_ENABLED(CONFIG_MALI_VALHALL_TRACE_POWER_GPU_WORK_PERIOD)
#include <mali_kbase_gpu_metrics.h>
#endif
/*
* Hold the runpool_mutex for this
*/
static inline bool timer_callback_should_run(struct kbase_device *kbdev, int nr_running_ctxs)
{
lockdep_assert_held(&kbdev->js_data.runpool_mutex);
#ifdef CONFIG_MALI_VALHALL_DEBUG
if (kbdev->js_data.softstop_always) {
/* Debug support for allowing soft-stop on a single context */
return true;
}
#endif /* CONFIG_MALI_VALHALL_DEBUG */
if (kbase_hw_has_issue(kbdev, KBASE_HW_ISSUE_9435)) {
/* Timeouts would have to be 4x longer (due to micro-
* architectural design) to support OpenCL conformance tests, so
* only run the timer when there's:
* - 2 or more CL contexts
* - 1 or more GLES contexts
*
* NOTE: A context that has both Compute and Non-Compute jobs
* will be treated as an OpenCL context (hence, we don't check
* KBASEP_JS_CTX_ATTR_NON_COMPUTE).
*/
{
int nr_compute_ctxs = kbasep_js_ctx_attr_count_on_runpool(
kbdev, KBASEP_JS_CTX_ATTR_COMPUTE);
int nr_noncompute_ctxs = nr_running_ctxs - nr_compute_ctxs;
return (bool)(nr_compute_ctxs >= 2 || nr_noncompute_ctxs > 0);
}
} else {
/* Run the timer callback whenever you have at least 1 context
*/
return (bool)(nr_running_ctxs > 0);
}
}
static enum hrtimer_restart timer_callback(struct hrtimer *timer)
{
unsigned long flags;
struct kbase_device *kbdev;
struct kbasep_js_device_data *js_devdata;
struct kbase_backend_data *backend;
unsigned int s;
bool reset_needed = false;
KBASE_DEBUG_ASSERT(timer != NULL);
backend = container_of(timer, struct kbase_backend_data, scheduling_timer);
kbdev = container_of(backend, struct kbase_device, hwaccess.backend);
js_devdata = &kbdev->js_data;
/* Loop through the slots */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
for (s = 0; s < kbdev->gpu_props.num_job_slots; s++) {
struct kbase_jd_atom *atom = NULL;
if (kbase_backend_nr_atoms_on_slot(kbdev, s) > 0) {
atom = kbase_gpu_inspect(kbdev, s, 0);
KBASE_DEBUG_ASSERT(atom != NULL);
}
if (atom != NULL) {
/* The current version of the model doesn't support
* Soft-Stop
*/
if (!kbase_hw_has_issue(kbdev, KBASE_HW_ISSUE_5736)) {
u32 ticks = atom->ticks++;
#if !defined(CONFIG_MALI_VALHALL_JOB_DUMP) && !defined(CONFIG_MALI_VECTOR_DUMP)
u32 soft_stop_ticks, hard_stop_ticks, gpu_reset_ticks;
if (atom->core_req & BASE_JD_REQ_ONLY_COMPUTE) {
soft_stop_ticks = js_devdata->soft_stop_ticks_cl;
hard_stop_ticks = js_devdata->hard_stop_ticks_cl;
gpu_reset_ticks = js_devdata->gpu_reset_ticks_cl;
} else {
soft_stop_ticks = js_devdata->soft_stop_ticks;
if (kbase_is_quick_reset_enabled(kbdev)) {
hard_stop_ticks = 2;
gpu_reset_ticks = 3;
} else {
hard_stop_ticks = js_devdata->hard_stop_ticks_ss;
gpu_reset_ticks = js_devdata->gpu_reset_ticks_ss;
}
}
/* If timeouts have been changed then ensure
* that atom tick count is not greater than the
* new soft_stop timeout. This ensures that
* atoms do not miss any of the timeouts due to
* races between this worker and the thread
* changing the timeouts.
*/
if (backend->timeouts_updated && ticks > soft_stop_ticks)
ticks = atom->ticks = soft_stop_ticks;
/* Job is Soft-Stoppable */
if (ticks == soft_stop_ticks) {
/* Job has been scheduled for at least
* js_devdata->soft_stop_ticks ticks.
* Soft stop the slot so we can run
* other jobs.
*/
#if !KBASE_DISABLE_SCHEDULING_SOFT_STOPS
int disjoint_threshold =
KBASE_DISJOINT_STATE_INTERLEAVED_CONTEXT_COUNT_THRESHOLD;
u32 softstop_flags = 0u;
dev_dbg(kbdev->dev, "Soft-stop");
/* nr_user_contexts_running is updated
* with the runpool_mutex, but we can't
* take that here.
*
* However, if it's about to be
* increased then the new context can't
* run any jobs until they take the
* hwaccess_lock, so it's OK to observe
* the older value.
*
* Similarly, if it's about to be
* decreased, the last job from another
* context has already finished, so
* it's not too bad that we observe the
* older value and register a disjoint
* event when we try soft-stopping
*/
if (js_devdata->nr_user_contexts_running >=
disjoint_threshold)
softstop_flags |= JS_COMMAND_SW_CAUSES_DISJOINT;
kbase_job_slot_softstop_swflags(kbdev, s, atom,
softstop_flags);
#endif
} else if (ticks == hard_stop_ticks) {
/* Job has been scheduled for at least
* js_devdata->hard_stop_ticks_ss ticks.
* It should have been soft-stopped by
* now. Hard stop the slot.
*/
#if !KBASE_DISABLE_SCHEDULING_HARD_STOPS
u32 ms = js_devdata->scheduling_period_ns / 1000000u;
if (!kbase_is_quick_reset_enabled(kbdev))
dev_warn(
kbdev->dev,
"JS: Job Hard-Stopped (took more than %u ticks at %u ms/tick)",
ticks, ms);
kbase_job_slot_hardstop(atom->kctx, s, atom);
#endif
} else if (ticks == gpu_reset_ticks) {
/* Job has been scheduled for at least
* js_devdata->gpu_reset_ticks_ss ticks.
* It should have left the GPU by now.
* Signal that the GPU needs to be
* reset.
*/
reset_needed = true;
}
#else /* !CONFIG_MALI_VALHALL_JOB_DUMP */
/* NOTE: During CONFIG_MALI_VALHALL_JOB_DUMP, we use
* the alternate timeouts, which makes the hard-
* stop and GPU reset timeout much longer. We
* also ensure that we don't soft-stop at all.
*/
if (ticks == js_devdata->soft_stop_ticks) {
/* Job has been scheduled for at least
* js_devdata->soft_stop_ticks. We do
* not soft-stop during
* CONFIG_MALI_VALHALL_JOB_DUMP, however.
*/
dev_dbg(kbdev->dev, "Soft-stop");
} else if (ticks == js_devdata->hard_stop_ticks_dumping) {
/* Job has been scheduled for at least
* js_devdata->hard_stop_ticks_dumping
* ticks. Hard stop the slot.
*/
#if !KBASE_DISABLE_SCHEDULING_HARD_STOPS
u32 ms = js_devdata->scheduling_period_ns / 1000000u;
dev_warn(
kbdev->dev,
"JS: Job Hard-Stopped (took more than %u ticks at %u ms/tick)",
ticks, ms);
kbase_job_slot_hardstop(atom->kctx, s, atom);
#endif
} else if (ticks == js_devdata->gpu_reset_ticks_dumping) {
/* Job has been scheduled for at least
* js_devdata->gpu_reset_ticks_dumping
* ticks. It should have left the GPU by
* now. Signal that the GPU needs to be
* reset.
*/
reset_needed = true;
}
#endif /* !CONFIG_MALI_VALHALL_JOB_DUMP */
}
}
}
if (reset_needed) {
if (kbase_is_quick_reset_enabled(kbdev))
dev_err(kbdev->dev, "quick reset");
else
dev_err(kbdev->dev,
"JS: Job has been on the GPU for too long (JS_RESET_TICKS_SS/DUMPING timeout hit). Issuing GPU soft-reset to resolve.");
if (kbase_prepare_to_reset_gpu_locked(kbdev, RESET_FLAGS_NONE))
kbase_reset_gpu_locked(kbdev);
}
/* The timer is re-issued if there are contexts in the run-pool */
if (backend->timer_running)
hrtimer_start(&backend->scheduling_timer,
HR_TIMER_DELAY_NSEC(js_devdata->scheduling_period_ns),
HRTIMER_MODE_REL);
backend->timeouts_updated = false;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return HRTIMER_NORESTART;
}
void kbase_backend_ctx_count_changed(struct kbase_device *kbdev)
{
struct kbasep_js_device_data *js_devdata = &kbdev->js_data;
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
unsigned long flags;
/* Timer must stop if we are suspending */
const bool suspend_timer = backend->suspend_timer;
const int nr_running_ctxs = atomic_read(&kbdev->js_data.nr_contexts_runnable);
lockdep_assert_held(&js_devdata->runpool_mutex);
if (suspend_timer || !timer_callback_should_run(kbdev, nr_running_ctxs)) {
/* Take spinlock to force synchronisation with timer */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
backend->timer_running = false;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
/* From now on, return value of timer_callback_should_run()
* will also cause the timer to not requeue itself. Its return
* value cannot change, because it depends on variables updated
* with the runpool_mutex held, which the caller of this must
* also hold
*/
hrtimer_cancel(&backend->scheduling_timer);
}
if (!suspend_timer && timer_callback_should_run(kbdev, nr_running_ctxs) &&
!backend->timer_running) {
/* Take spinlock to force synchronisation with timer */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
backend->timer_running = true;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
hrtimer_start(&backend->scheduling_timer,
HR_TIMER_DELAY_NSEC(js_devdata->scheduling_period_ns),
HRTIMER_MODE_REL);
KBASE_KTRACE_ADD_JM(kbdev, JS_POLICY_TIMER_START, NULL, NULL, 0u, 0u);
}
#if IS_ENABLED(CONFIG_MALI_VALHALL_TRACE_POWER_GPU_WORK_PERIOD)
if (unlikely(suspend_timer)) {
js_devdata->gpu_metrics_timer_needed = false;
/* Cancel the timer as System suspend is happening */
hrtimer_cancel(&js_devdata->gpu_metrics_timer);
js_devdata->gpu_metrics_timer_running = false;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
/* Explicitly emit the tracepoint on System suspend */
kbase_gpu_metrics_emit_tracepoint(kbdev, ktime_get_raw_ns());
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return;
}
if (!nr_running_ctxs) {
/* Just set the flag to not restart the timer on expiry */
js_devdata->gpu_metrics_timer_needed = false;
return;
}
/* There are runnable contexts so the timer is needed */
if (!js_devdata->gpu_metrics_timer_needed) {
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
js_devdata->gpu_metrics_timer_needed = true;
/* No need to restart the timer if it is already running. */
if (!js_devdata->gpu_metrics_timer_running) {
hrtimer_start(&js_devdata->gpu_metrics_timer,
HR_TIMER_DELAY_NSEC(kbase_gpu_metrics_get_tp_emit_interval()),
HRTIMER_MODE_REL);
js_devdata->gpu_metrics_timer_running = true;
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
#endif
}
int kbase_backend_timer_init(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
hrtimer_init(&backend->scheduling_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
backend->scheduling_timer.function = timer_callback;
backend->timer_running = false;
return 0;
}
void kbase_backend_timer_term(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
hrtimer_cancel(&backend->scheduling_timer);
}
void kbase_backend_timer_suspend(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
backend->suspend_timer = true;
kbase_backend_ctx_count_changed(kbdev);
}
void kbase_backend_timer_resume(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
backend->suspend_timer = false;
kbase_backend_ctx_count_changed(kbdev);
}
void kbase_backend_timeouts_changed(struct kbase_device *kbdev)
{
struct kbase_backend_data *backend = &kbdev->hwaccess.backend;
backend->timeouts_updated = true;
}


@@ -0,0 +1,72 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2015, 2020-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Register-based HW access backend specific job scheduler APIs
*/
#ifndef _KBASE_JS_BACKEND_H_
#define _KBASE_JS_BACKEND_H_
/**
* kbase_backend_timer_init() - Initialise the JS scheduling timer
* @kbdev: Device pointer
*
* This function should be called at driver initialisation
*
* Return: 0 on success
*/
int kbase_backend_timer_init(struct kbase_device *kbdev);
/**
* kbase_backend_timer_term() - Terminate the JS scheduling timer
* @kbdev: Device pointer
*
* This function should be called at driver termination
*/
void kbase_backend_timer_term(struct kbase_device *kbdev);
/**
* kbase_backend_timer_suspend - Suspend is happening, stop the JS scheduling
* timer
* @kbdev: Device pointer
*
* This function should be called on suspend, after the active count has reached
* zero. This is required as the timer may have been started on job submission
* to the job scheduler, but before jobs are submitted to the GPU.
*
* Caller must hold runpool_mutex.
*/
void kbase_backend_timer_suspend(struct kbase_device *kbdev);
/**
* kbase_backend_timer_resume - Resume is happening, re-evaluate the JS
* scheduling timer
* @kbdev: Device pointer
*
* This function should be called on resume. Note that this is not guaranteed
* to restart the timer, only to evaluate whether it should be restarted.
*
* Caller must hold runpool_mutex.
*/
void kbase_backend_timer_resume(struct kbase_device *kbdev);
#endif /* _KBASE_JS_BACKEND_H_ */


@@ -0,0 +1,121 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <linux/version_compat_defs_for_valhall.h>
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include <device/mali_kbase_device.h>
#include "mali_kbase_l2_mmu_config.h"
#include <mali_kbase_io.h>
/**
* struct l2_mmu_config_limit_region - L2 MMU limit field
*
* @value: The default value to load into the L2_MMU_CONFIG register
* @mask: The shifted mask of the field in the L2_MMU_CONFIG register
* @shift: The shift of where the field starts in the L2_MMU_CONFIG register
* This should be the same value as the smaller of the two mask
* values
*/
struct l2_mmu_config_limit_region {
u32 value, mask, shift;
};
/**
* struct l2_mmu_config_limit - L2 MMU read and write limit
*
* @product_model: The GPU for which this entry applies
* @read: Values for the read limit field
* @write: Values for the write limit field
*/
struct l2_mmu_config_limit {
u32 product_model;
struct l2_mmu_config_limit_region read;
struct l2_mmu_config_limit_region write;
};
/*
* Zero represents no limit
*
* For LBEX TBEX TBAX TTRX and TNAX:
* The value represents the number of outstanding reads (6 bits) or writes (5 bits)
*
* For all other GPUs it is a fraction; see mali_kbase_config_defaults.h
*/
static const struct l2_mmu_config_limit limits[] = {
/* GPU, read, write */
{ GPU_ID_PRODUCT_LBEX, { 0, GENMASK(10, 5), 5 }, { 0, GENMASK(16, 12), 12 } },
{ GPU_ID_PRODUCT_TBEX, { 0, GENMASK(10, 5), 5 }, { 0, GENMASK(16, 12), 12 } },
{ GPU_ID_PRODUCT_TBAX, { 0, GENMASK(10, 5), 5 }, { 0, GENMASK(16, 12), 12 } },
{ GPU_ID_PRODUCT_TTRX, { 0, GENMASK(12, 7), 7 }, { 0, GENMASK(17, 13), 13 } },
{ GPU_ID_PRODUCT_TNAX, { 0, GENMASK(12, 7), 7 }, { 0, GENMASK(17, 13), 13 } },
{ GPU_ID_PRODUCT_TGOX,
{ KBASE_3BIT_AID_32, GENMASK(14, 12), 12 },
{ KBASE_3BIT_AID_32, GENMASK(17, 15), 15 } },
{ GPU_ID_PRODUCT_TNOX,
{ KBASE_3BIT_AID_32, GENMASK(14, 12), 12 },
{ KBASE_3BIT_AID_32, GENMASK(17, 15), 15 } },
};
int kbase_set_mmu_quirks(struct kbase_device *kbdev)
{
/* All older GPUs had 2 bits for both fields, this is a default */
struct l2_mmu_config_limit limit = { 0, /* Any GPU not in the limits array defined above */
{ KBASE_AID_32, GENMASK(25, 24), 24 },
{ KBASE_AID_32, GENMASK(27, 26), 26 } };
u32 product_model;
u32 mmu_config = 0;
unsigned int i;
product_model = kbdev->gpu_props.gpu_id.product_model;
/* Limit the GPU bus bandwidth if the platform needs this. */
for (i = 0; i < ARRAY_SIZE(limits); i++) {
if (product_model == limits[i].product_model) {
limit = limits[i];
break;
}
}
if (kbase_reg_is_valid(kbdev, GPU_CONTROL_ENUM(L2_MMU_CONFIG)))
mmu_config = kbase_reg_read32(kbdev, GPU_CONTROL_ENUM(L2_MMU_CONFIG));
if (!kbase_io_has_gpu(kbdev))
return -EIO;
mmu_config &= ~(limit.read.mask | limit.write.mask);
/* Can't use FIELD_PREP() macro here as the mask isn't constant */
mmu_config |= (limit.read.value << limit.read.shift) |
(limit.write.value << limit.write.shift);
kbdev->hw_quirks_mmu = mmu_config;
if (kbdev->system_coherency == COHERENCY_ACE) {
/* Allow memory configuration disparity to be ignored,
* we optimize the use of shared memory and thus we
* expect some disparity in the memory configuration.
*/
kbdev->hw_quirks_mmu |= L2_MMU_CONFIG_ALLOW_SNOOP_DISPARITY;
}
return 0;
}
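
The read-modify-write step in kbase_set_mmu_quirks() can be tried outside the kernel. The following is a minimal userspace sketch, not the driver's code: `SKETCH_GENMASK32`, `struct sketch_limit_region` and `sketch_apply_limits` are hypothetical names, and the macro is only an assumed stand-in for the kernel's GENMASK() restricted to 32-bit values.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's GENMASK(h, l), 32-bit only. */
#define SKETCH_GENMASK32(h, l) \
	((uint32_t)((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l))))

/* Mirrors the layout of struct l2_mmu_config_limit_region (illustrative). */
struct sketch_limit_region {
	uint32_t value, mask, shift;
};

/*
 * Clear the read and write limit fields in reg, then OR in the shifted
 * default values, as the driver does with a non-constant mask.
 */
static uint32_t sketch_apply_limits(uint32_t reg, struct sketch_limit_region rd,
				    struct sketch_limit_region wr)
{
	reg &= ~(rd.mask | wr.mask);
	reg |= (rd.value << rd.shift) | (wr.value << wr.shift);
	return reg;
}
```

Bits outside the two masks are left untouched, which is why the register is read first and only the limit fields are replaced.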


@@ -0,0 +1,36 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_L2_MMU_CONFIG_H_
#define _KBASE_L2_MMU_CONFIG_H_
/**
* kbase_set_mmu_quirks - Set the hw_quirks_mmu field of kbdev
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Use this function to initialise the hw_quirks_mmu field, for instance to set
* the MAX_READS and MAX_WRITES to sane defaults for each GPU.
*
* Return: Zero for success or a Linux error code
*/
int kbase_set_mmu_quirks(struct kbase_device *kbdev);
#endif /* _KBASE_L2_MMU_CONFIG_H_ */

File diff suppressed because it is too large


@@ -0,0 +1,224 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Dummy Model interface
*
* Support for NO_MALI dummy Model interface.
*
* +-----------------------------------+
* | Kbase read/write/IRQ |
* +-----------------------------------+
* | Model Linux Framework |
* +-----------------------------------+
* | Model Dummy interface definitions |
* +-----------------+-----------------+
* | Fake R/W | Fake IRQ |
* +-----------------+-----------------+
*/
#ifndef _KBASE_MODEL_DUMMY_H_
#define _KBASE_MODEL_DUMMY_H_
#include <uapi/gpu/arm/valhall/backend/gpu/mali_kbase_model_linux.h>
#include <uapi/gpu/arm/valhall/backend/gpu/mali_kbase_model_dummy.h>
#define model_error_log(module, ...) pr_err(__VA_ARGS__)
#define NUM_SLOTS 4 /* number of job slots */
/* Error mask codes */
/* each bit of errors_mask is associated to a specific error:
* NON FAULT STATUS CODES: only the following are implemented since the others
* represent normal working statuses
*/
#define KBASE_JOB_INTERRUPTED (1 << 0)
#define KBASE_JOB_STOPPED (1 << 1)
#define KBASE_JOB_TERMINATED (1 << 2)
/* JOB EXCEPTIONS: */
#define KBASE_JOB_CONFIG_FAULT (1 << 3)
#define KBASE_JOB_POWER_FAULT (1 << 4)
#define KBASE_JOB_READ_FAULT (1 << 5)
#define KBASE_JOB_WRITE_FAULT (1 << 6)
#define KBASE_JOB_AFFINITY_FAULT (1 << 7)
#define KBASE_JOB_BUS_FAULT (1 << 8)
#define KBASE_INSTR_INVALID_PC (1 << 9)
#define KBASE_INSTR_INVALID_ENC (1 << 10)
#define KBASE_INSTR_TYPE_MISMATCH (1 << 11)
#define KBASE_INSTR_OPERAND_FAULT (1 << 12)
#define KBASE_INSTR_TLS_FAULT (1 << 13)
#define KBASE_INSTR_BARRIER_FAULT (1 << 14)
#define KBASE_INSTR_ALIGN_FAULT (1 << 15)
#define KBASE_DATA_INVALID_FAULT (1 << 16)
#define KBASE_TILE_RANGE_FAULT (1 << 17)
#define KBASE_ADDR_RANGE_FAULT (1 << 18)
#define KBASE_OUT_OF_MEMORY (1 << 19)
#define KBASE_UNKNOWN (1 << 20)
/* GPU EXCEPTIONS:*/
#define KBASE_DELAYED_BUS_FAULT (1 << 21)
#define KBASE_SHAREABILITY_FAULT (1 << 22)
/* MMU EXCEPTIONS:*/
#define KBASE_TRANSLATION_FAULT (1 << 23)
#define KBASE_PERMISSION_FAULT (1 << 24)
#define KBASE_TRANSTAB_BUS_FAULT (1 << 25)
#define KBASE_ACCESS_FLAG (1 << 26)
/* generic useful bitmasks */
#define IS_A_JOB_ERROR ((KBASE_UNKNOWN << 1) - KBASE_JOB_INTERRUPTED)
#define IS_A_MMU_ERROR ((KBASE_ACCESS_FLAG << 1) - KBASE_TRANSLATION_FAULT)
#define IS_A_GPU_ERROR (KBASE_DELAYED_BUS_FAULT | KBASE_SHAREABILITY_FAULT)
/* number of possible MMU address spaces */
#define NUM_MMU_AS \
16 /* total number of MMU address spaces as in
* MMU_IRQ_RAWSTAT register
*/
/* Forward declaration */
struct kbase_device;
/*
* The structure below carries the parameters used to trigger the simulation
* of a faulty HW condition for a specific job chain atom
*/
struct kbase_error_params {
u64 jc;
u32 errors_mask;
u32 mmu_table_level;
u16 faulty_mmu_as;
u16 padding[3];
};
enum kbase_model_control_command {
/* Disable/Enable job completion in the dummy model */
KBASE_MC_DISABLE_JOBS
};
/* struct to control dummy model behavior */
struct kbase_model_control_params {
s32 command;
s32 value;
};
/* struct to track faulty atoms */
struct kbase_error_atom {
struct kbase_error_params params;
struct kbase_error_atom *next;
};
/* struct to track the system error state */
struct error_status_t {
spinlock_t access_lock;
u32 errors_mask;
u32 mmu_table_level;
u32 faulty_mmu_as;
u64 current_jc;
u32 current_job_slot;
u32 job_irq_rawstat;
u32 job_irq_status;
u32 js_status[NUM_SLOTS];
u32 mmu_irq_mask;
u32 mmu_irq_rawstat;
u32 gpu_error_irq;
u32 gpu_fault_status;
u32 as_faultstatus[NUM_MMU_AS];
u32 as_command[NUM_MMU_AS];
u64 as_transtab[NUM_MMU_AS];
};
/**
* struct gpu_model_prfcnt_en - Performance counter enable masks
* @fe: Enable mask for front-end block
* @tiler: Enable mask for tiler block
* @l2: Enable mask for L2/Memory system blocks
* @shader: Enable mask for shader core blocks
*/
struct gpu_model_prfcnt_en {
u32 fe;
u32 tiler;
u32 l2;
u32 shader;
};
void midgard_set_error(u32 job_slot);
int job_atom_inject_error(struct kbase_error_params *params);
int gpu_model_control(void *h, struct kbase_model_control_params *params);
/**
* gpu_model_set_dummy_prfcnt_user_sample() - Set performance counter values
* @data: Userspace pointer to array of counter values
* @size: Size of counter value array
*
* Counter values set by this function will be used for one sample dump only
* after which counters will be cleared back to zero.
*
* Return: 0 on success, else error code.
*/
int gpu_model_set_dummy_prfcnt_user_sample(u32 __user *data, u32 size);
/**
* gpu_model_set_dummy_prfcnt_kernel_sample() - Set performance counter values
* @data: Pointer to array of counter values
* @size: Size of counter value array
*
* Counter values set by this function will be used for one sample dump only
* after which counters will be cleared back to zero.
*/
void gpu_model_set_dummy_prfcnt_kernel_sample(u64 *data, u32 size);
void gpu_model_get_dummy_prfcnt_cores(struct kbase_device *kbdev, u64 *l2_present,
u64 *shader_present);
void gpu_model_set_dummy_prfcnt_cores(struct kbase_device *kbdev, u64 l2_present,
u64 shader_present);
/* Clear the counter values array maintained by the dummy model */
void gpu_model_clear_prfcnt_values(void);
#if MALI_USE_CSF
/**
* gpu_model_prfcnt_dump_request() - Request performance counter sample dump.
* @sample_buf: Pointer to KBASE_DUMMY_MODEL_MAX_VALUES_PER_SAMPLE sized array
* in which to store dumped performance counter values.
* @enable_maps: Physical enable maps for performance counter blocks.
*/
void gpu_model_prfcnt_dump_request(uint32_t *sample_buf, struct gpu_model_prfcnt_en enable_maps);
/**
* gpu_model_glb_request_job_irq() - Trigger job interrupt with global request
* flag set.
* @model: Model pointer returned by midgard_model_create().
*/
void gpu_model_glb_request_job_irq(void *model);
#endif /* MALI_USE_CSF */
extern struct error_status_t hw_error_status;
#endif
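
The IS_A_JOB_ERROR and IS_A_MMU_ERROR macros build a contiguous bit range with the expression (last << 1) - first. The sketch below reproduces that arithmetic in userspace under assumed names: the `SKETCH_*` macros and both helper functions are hypothetical and only copy the bit positions listed in the header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the error mask codes in the header. */
#define SKETCH_JOB_INTERRUPTED (UINT32_C(1) << 0)
#define SKETCH_UNKNOWN (UINT32_C(1) << 20)
#define SKETCH_TRANSLATION_FAULT (UINT32_C(1) << 23)
#define SKETCH_ACCESS_FLAG (UINT32_C(1) << 26)

/* (last << 1) - first sets every bit from first up to and including last. */
#define SKETCH_RANGE_MASK(first, last) (((last) << 1) - (first))

#define SKETCH_IS_A_JOB_ERROR SKETCH_RANGE_MASK(SKETCH_JOB_INTERRUPTED, SKETCH_UNKNOWN)
#define SKETCH_IS_A_MMU_ERROR SKETCH_RANGE_MASK(SKETCH_TRANSLATION_FAULT, SKETCH_ACCESS_FLAG)

/* True if errors_mask has at least one job error bit (bits 0..20) set. */
static bool sketch_has_job_error(uint32_t errors_mask)
{
	return (errors_mask & SKETCH_IS_A_JOB_ERROR) != 0;
}

/* True if errors_mask has at least one MMU error bit (bits 23..26) set. */
static bool sketch_has_mmu_error(uint32_t errors_mask)
{
	return (errors_mask & SKETCH_IS_A_MMU_ERROR) != 0;
}
```

The subtraction works because both operands are single-bit values, so the result is a run of ones between the two positions.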


@@ -0,0 +1,192 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2010-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Model Linux Framework interfaces.
*/
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include "backend/gpu/mali_kbase_model_linux.h"
#include "device/mali_kbase_device.h"
#include "mali_kbase_irq_internal.h"
#include <linux/kthread.h>
struct model_irq_data {
struct kbase_device *kbdev;
struct work_struct work;
};
#define DEFINE_SERVE_IRQ(irq_handler) \
static void serve_##irq_handler(struct work_struct *work) \
{ \
struct model_irq_data *data = container_of(work, struct model_irq_data, work); \
struct kbase_device *kbdev = data->kbdev; \
irq_handler(kbdev); \
kmem_cache_free(kbdev->irq_slab, data); \
}
static void job_irq(struct kbase_device *kbdev)
{
/* Make sure no worker is already serving this IRQ */
while (atomic_cmpxchg(&kbdev->serving_job_irq, 1, 0) == 1)
kbase_get_interrupt_handler(kbdev, JOB_IRQ_TAG)(0, kbdev);
}
DEFINE_SERVE_IRQ(job_irq)
static void gpu_irq(struct kbase_device *kbdev)
{
/* Make sure no worker is already serving this IRQ */
while (atomic_cmpxchg(&kbdev->serving_gpu_irq, 1, 0) == 1)
kbase_get_interrupt_handler(kbdev, GPU_IRQ_TAG)(0, kbdev);
}
DEFINE_SERVE_IRQ(gpu_irq)
static void mmu_irq(struct kbase_device *kbdev)
{
/* Make sure no worker is already serving this IRQ */
while (atomic_cmpxchg(&kbdev->serving_mmu_irq, 1, 0) == 1)
kbase_get_interrupt_handler(kbdev, MMU_IRQ_TAG)(0, kbdev);
}
DEFINE_SERVE_IRQ(mmu_irq)
static void irqaw_irq(struct kbase_device *kbdev)
{
/* Make sure no worker is already serving this IRQ */
while (atomic_cmpxchg(&kbdev->serving_irqaw_irq, 1, 0) == 1)
kbase_get_interrupt_handler(kbdev, 0)(0, kbdev);
}
DEFINE_SERVE_IRQ(irqaw_irq)
void gpu_device_raise_irq(void *model, u32 irq)
{
struct model_irq_data *data;
struct kbase_device *kbdev = gpu_device_get_data(model);
KBASE_DEBUG_ASSERT(kbdev);
data = kmem_cache_alloc(kbdev->irq_slab, GFP_ATOMIC);
if (data == NULL)
return;
data->kbdev = kbdev;
switch (irq) {
case MODEL_LINUX_JOB_IRQ:
INIT_WORK(&data->work, serve_job_irq);
atomic_set(&kbdev->serving_job_irq, 1);
break;
case MODEL_LINUX_GPU_IRQ:
INIT_WORK(&data->work, serve_gpu_irq);
atomic_set(&kbdev->serving_gpu_irq, 1);
break;
case MODEL_LINUX_MMU_IRQ:
INIT_WORK(&data->work, serve_mmu_irq);
atomic_set(&kbdev->serving_mmu_irq, 1);
break;
case MODEL_LINUX_IRQAW_IRQ:
INIT_WORK(&data->work, serve_irqaw_irq);
atomic_set(&kbdev->serving_irqaw_irq, 1);
break;
default:
dev_warn(kbdev->dev, "Unknown IRQ");
kmem_cache_free(kbdev->irq_slab, data);
data = NULL;
break;
}
if (data != NULL)
queue_work(kbdev->irq_workq, &data->work);
}
int kbase_install_interrupts(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev);
atomic_set(&kbdev->serving_job_irq, 0);
atomic_set(&kbdev->serving_gpu_irq, 0);
atomic_set(&kbdev->serving_mmu_irq, 0);
atomic_set(&kbdev->serving_irqaw_irq, 0);
kbdev->irq_workq = alloc_ordered_workqueue("dummy irq queue", 0);
if (kbdev->irq_workq == NULL)
return -ENOMEM;
kbdev->irq_slab =
kmem_cache_create("dummy_irq_slab", sizeof(struct model_irq_data), 0, 0, NULL);
if (kbdev->irq_slab == NULL) {
destroy_workqueue(kbdev->irq_workq);
return -ENOMEM;
}
kbdev->nr_irqs = 3;
if (kbdev->gpu_props.gpu_id.arch_id >= GPU_ID_ARCH_MAKE(14, 8, 0))
kbdev->nr_irqs = 1;
return 0;
}
void kbase_release_interrupts(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev);
destroy_workqueue(kbdev->irq_workq);
kmem_cache_destroy(kbdev->irq_slab);
}
void kbase_synchronize_irqs(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev);
flush_workqueue(kbdev->irq_workq);
}
KBASE_EXPORT_TEST_API(kbase_synchronize_irqs);
int kbase_set_custom_irq_handler(struct kbase_device *kbdev, irq_handler_t custom_handler,
u32 irq_tag)
{
return 0;
}
KBASE_EXPORT_TEST_API(kbase_set_custom_irq_handler);
int kbase_gpu_device_create(struct kbase_device *kbdev)
{
kbdev->model = midgard_model_create(kbdev);
if (kbdev->model == NULL)
return -ENOMEM;
spin_lock_init(&kbdev->reg_op_lock);
return 0;
}
/**
* kbase_gpu_device_destroy - Destroy GPU device
*
* @kbdev: kbase device
*/
void kbase_gpu_device_destroy(struct kbase_device *kbdev)
{
midgard_model_destroy(kbdev->model);
}
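
The serving flags above coalesce repeated raises: gpu_device_raise_irq() sets a flag to 1, and the worker keeps calling the handler while it can swap that flag from 1 back to 0. The following userspace sketch illustrates the same pattern with C11 atomics instead of the kernel's atomic_t API; every `sketch_*` name is hypothetical and there is no workqueue involved.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for kbdev->serving_job_irq and the IRQ handler. */
static atomic_int sketch_serving_irq;
static int sketch_handled;

static void sketch_handler(void)
{
	sketch_handled++;
}

/* Mark the IRQ as pending; repeated raises before service collapse into one. */
static void sketch_raise_irq(void)
{
	atomic_store(&sketch_serving_irq, 1);
}

/* Run the handler for as long as the pending flag can be swapped 1 -> 0. */
static void sketch_serve_irq(void)
{
	int expected = 1;

	while (atomic_compare_exchange_strong(&sketch_serving_irq, &expected, 0)) {
		sketch_handler();
		expected = 1; /* re-arm the comparison for the next iteration */
	}
}
```

If a raise arrives while the handler is running, the next loop iteration sees the flag set again and serves it, which appears to be the reason the driver uses a loop rather than a single check.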


@@ -0,0 +1,146 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Model Linux Framework interfaces.
*
* This framework is used to provide generic Kbase Models interfaces.
* Note: Backends cannot be used together; the selection is done at build time.
*
* - Without Model Linux Framework:
* +-----------------------------+
* | Kbase read/write/IRQ |
* +-----------------------------+
* | HW interface definitions |
* +-----------------------------+
*
* - With Model Linux Framework:
* +-----------------------------+
* | Kbase read/write/IRQ |
* +-----------------------------+
* | Model Linux Framework |
* +-----------------------------+
* | Model interface definitions |
* +-----------------------------+
*/
#ifndef _KBASE_MODEL_LINUX_H_
#define _KBASE_MODEL_LINUX_H_
/*
* Include Model definitions
*/
#include <backend/gpu/mali_kbase_model_dummy.h>
/**
* kbase_gpu_device_create() - Generic create function.
*
* @kbdev: Kbase device.
*
* Specific model hook is implemented by midgard_model_create()
*
* Return: 0 on success, error code otherwise.
*/
int kbase_gpu_device_create(struct kbase_device *kbdev);
/**
* kbase_gpu_device_destroy() - Generic destroy function.
*
* @kbdev: Kbase device.
*
* Specific model hook is implemented by midgard_model_destroy()
*/
void kbase_gpu_device_destroy(struct kbase_device *kbdev);
/**
* midgard_model_create() - Private create function.
*
* @kbdev: Kbase device.
*
* This hook is specific to the model built in Kbase.
*
* Return: Model handle.
*/
void *midgard_model_create(struct kbase_device *kbdev);
/**
* midgard_model_destroy() - Private destroy function.
*
* @h: Model handle.
*
* This hook is specific to the model built in Kbase.
*/
void midgard_model_destroy(void *h);
/**
* midgard_model_write_reg() - Private model write function.
*
* @h: Model handle.
* @addr: Address at which to write.
* @value: value to write.
*
* This hook is specific to the model built in Kbase.
*/
void midgard_model_write_reg(void *h, u32 addr, u32 value);
/**
* midgard_model_read_reg() - Private model read function.
*
* @h: Model handle.
* @addr: Address from which to read.
* @value: Pointer where to store the read value.
*
* This hook is specific to the model built in Kbase.
*/
void midgard_model_read_reg(void *h, u32 addr, u32 *const value);
/**
* gpu_device_raise_irq() - Private IRQ raise function.
*
* @model: Model handle.
* @irq: IRQ type to raise.
*
* This hook is global to the model Linux framework.
*/
void gpu_device_raise_irq(void *model, u32 irq);
/**
* gpu_device_set_data() - Private model set data function.
*
* @model: Model handle.
* @data: Data carried by model.
*
* This hook is global to the model Linux framework.
*/
void gpu_device_set_data(void *model, void *data);
/**
* gpu_device_get_data() - Private model get data function.
*
* @model: Model handle.
*
* This hook is global to the model Linux framework.
*
* Return: Pointer to the data carried by model.
*/
void *gpu_device_get_data(void *model);
#endif /* _KBASE_MODEL_LINUX_H_ */


@@ -0,0 +1,75 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2010-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Always on" power management policy
*/
#include <mali_kbase.h>
#include <mali_kbase_pm.h>
static bool always_on_shaders_needed(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
return true;
}
static bool always_on_get_core_active(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
return true;
}
static void always_on_init(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
/**
* always_on_term - Term callback function for always-on power policy
*
* @kbdev: kbase device
*/
static void always_on_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
/*
* The struct kbase_pm_policy structure for the always_on power policy.
*
* This is the static structure that defines the always_on power policy's
* callbacks and name.
*/
const struct kbase_pm_policy kbase_pm_always_on_policy_ops = {
"always_on", /* name */
always_on_init, /* init */
always_on_term, /* term */
always_on_shaders_needed, /* shaders_needed */
always_on_get_core_active, /* get_core_active */
NULL, /* handle_event */
KBASE_PM_POLICY_ID_ALWAYS_ON, /* id */
#if MALI_USE_CSF
ALWAYS_ON_PM_SCHED_FLAGS, /* pm_sched_flags */
#endif
};
KBASE_EXPORT_TEST_API(kbase_pm_always_on_policy_ops);


@@ -0,0 +1,77 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Always on" power management policy
*/
#ifndef MALI_KBASE_PM_ALWAYS_ON_H
#define MALI_KBASE_PM_ALWAYS_ON_H
/**
* DOC:
* The "Always on" power management policy has the following
* characteristics:
*
* - When KBase indicates that the GPU will be powered up, but we don't yet
* know which Job Chains are to be run:
* Shader Cores are powered up, regardless of whether or not they will be
* needed later.
*
* - When KBase indicates that Shader Cores are needed to submit the currently
* queued Job Chains:
* Shader Cores are kept powered, regardless of whether or not they will be
* needed
*
* - When KBase indicates that the GPU need not be powered:
* The Shader Cores are kept powered, regardless of whether or not they will
* be needed. The GPU itself is also kept powered, even though it is not
* needed.
*
* This policy is automatically overridden during system suspend: the desired
* core state is ignored, and the cores are forced off regardless of what the
* policy requests. After resuming from suspend, new changes to the desired
* core state made by the policy are honored.
*
* Note:
*
* - KBase indicates the GPU will be powered up when it has a User Process that
* has just started to submit Job Chains.
*
* - KBase indicates the GPU need not be powered when all the Job Chains from
* User Processes have finished, and it is waiting for a User Process to
* submit some more Job Chains.
*/
/**
* struct kbasep_pm_policy_always_on - Private struct for policy instance data
* @dummy: unused dummy variable
*
* This contains data that is private to the particular power policy that is
* active.
*/
struct kbasep_pm_policy_always_on {
int dummy;
};
extern const struct kbase_pm_policy kbase_pm_always_on_policy_ops;
#endif /* MALI_KBASE_PM_ALWAYS_ON_H */

File diff suppressed because it is too large


@@ -0,0 +1,177 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2013-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel core availability APIs
*/
#include <mali_kbase.h>
#include <mali_kbase_pm.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <backend/gpu/mali_kbase_model_linux.h>
#include <mali_kbase_dummy_job_wa.h>
int kbase_pm_ca_init(struct kbase_device *kbdev)
{
#ifdef CONFIG_MALI_VALHALL_DEVFREQ
struct kbase_pm_backend_data *pm_backend = &kbdev->pm.backend;
if (kbdev->current_core_mask)
pm_backend->ca_cores_enabled = kbdev->current_core_mask;
else
pm_backend->ca_cores_enabled = kbdev->gpu_props.shader_present;
#endif
return 0;
}
void kbase_pm_ca_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
#ifdef CONFIG_MALI_VALHALL_DEVFREQ
void kbase_devfreq_set_core_mask(struct kbase_device *kbdev, u64 core_mask)
{
struct kbase_pm_backend_data *pm_backend = &kbdev->pm.backend;
unsigned long flags;
#if MALI_USE_CSF
u64 old_core_mask = 0;
bool mmu_sync_needed = false;
if (!IS_ENABLED(CONFIG_MALI_VALHALL_NO_MALI) &&
kbase_hw_has_issue(kbdev, KBASE_HW_ISSUE_GPU2019_3901)) {
mmu_sync_needed = true;
down_write(&kbdev->csf.mmu_sync_sem);
}
#endif
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
#if MALI_USE_CSF
if (kbase_hw_has_feature(kbdev, KBASE_HW_FEATURE_GOV_CORE_MASK_SUPPORT)) {
if (kbase_io_is_gpu_powered(kbdev)) {
kbase_reg_write64(kbdev, GPU_GOVERNOR_ENUM(GOV_CORE_MASK),
core_mask & kbdev->pm.debug_core_mask);
}
goto unlock;
}
if (!(core_mask & kbdev->pm.debug_core_mask)) {
dev_err(kbdev->dev,
"OPP core mask 0x%llX does not intersect with debug mask 0x%llX\n",
core_mask, kbdev->pm.debug_core_mask);
goto unlock;
}
old_core_mask = pm_backend->ca_cores_enabled;
#else
if (!(core_mask & kbdev->pm.debug_core_mask_all)) {
dev_err(kbdev->dev,
"OPP core mask 0x%llX does not intersect with debug mask 0x%llX\n",
core_mask, kbdev->pm.debug_core_mask_all);
goto unlock;
}
if (kbase_dummy_job_wa_enabled(kbdev)) {
dev_err_once(kbdev->dev,
"Dynamic core scaling not supported as dummy job WA is enabled");
goto unlock;
}
#endif /* MALI_USE_CSF */
pm_backend->ca_cores_enabled = core_mask;
kbase_pm_update_state(kbdev);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
#if MALI_USE_CSF
/* Check if old_core_mask contained the undesired cores and wait
* for those cores to get powered down
*/
if ((core_mask & old_core_mask) != old_core_mask) {
if (kbase_pm_wait_for_cores_down_scale(kbdev)) {
dev_warn(kbdev->dev,
"Wait for update of core_mask from %llx to %llx failed",
old_core_mask, core_mask);
}
}
if (mmu_sync_needed)
up_write(&kbdev->csf.mmu_sync_sem);
#endif
dev_dbg(kbdev->dev, "Devfreq policy : new core mask=%llX\n", pm_backend->ca_cores_enabled);
return;
unlock:
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
#if MALI_USE_CSF
if (mmu_sync_needed)
up_write(&kbdev->csf.mmu_sync_sem);
#endif
}
KBASE_EXPORT_TEST_API(kbase_devfreq_set_core_mask);
#endif
u64 kbase_pm_ca_get_debug_core_mask(struct kbase_device *kbdev)
{
#if MALI_USE_CSF
return kbdev->pm.debug_core_mask;
#else
return kbdev->pm.debug_core_mask_all;
#endif
}
KBASE_EXPORT_TEST_API(kbase_pm_ca_get_debug_core_mask);
u64 kbase_pm_ca_get_core_mask(struct kbase_device *kbdev)
{
u64 debug_core_mask = kbase_pm_ca_get_debug_core_mask(kbdev);
lockdep_assert_held(&kbdev->hwaccess_lock);
#ifdef CONFIG_MALI_VALHALL_DEVFREQ
/*
* Although init sets pm_backend->ca_cores_enabled to the max config
* (it uses the base_gpu_props), this function must limit it to a
* subset of the current config; otherwise the shaders state machine
* in the PM does not make progress.
*/
return kbdev->gpu_props.curr_config.shader_present & kbdev->pm.backend.ca_cores_enabled &
debug_core_mask;
#else
return kbdev->gpu_props.curr_config.shader_present & debug_core_mask;
#endif
}
KBASE_EXPORT_TEST_API(kbase_pm_ca_get_core_mask);
u64 kbase_pm_ca_get_instr_core_mask(struct kbase_device *kbdev)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
#if IS_ENABLED(CONFIG_MALI_VALHALL_NO_MALI)
return (((1ull) << KBASE_DUMMY_MODEL_MAX_SHADER_CORES) - 1);
#elif MALI_USE_CSF
return kbase_pm_get_ready_cores(kbdev, KBASE_PM_CORE_SHADER);
#else
return kbdev->pm.backend.pm_shaders_core_mask;
#endif
}
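
The mask arithmetic in kbase_pm_ca_get_core_mask() can be looked at in isolation. The following is a minimal standalone sketch, not the driver's code: `struct demo_gpu`, its fields and `demo_get_core_mask()` are hypothetical stand-ins for `kbdev->gpu_props.curr_config.shader_present`, `kbdev->pm.backend.ca_cores_enabled` and the debug core mask, and it models only the devfreq-enabled branch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few kbase_device fields used here. */
struct demo_gpu {
	uint64_t shader_present;   /* cores present in the current config */
	uint64_t ca_cores_enabled; /* cores allowed by the devfreq OPP */
	uint64_t debug_core_mask;  /* cores selected by the user */
};

/* A core is available only if it is present, enabled and user-selected. */
static uint64_t demo_get_core_mask(const struct demo_gpu *gpu)
{
	return gpu->shader_present & gpu->ca_cores_enabled & gpu->debug_core_mask;
}
```

With four cores present (0x0F), all cores enabled by the OPP (0xFF) and a debug mask of 0x05, the sketch yields 0x05: the result can never contain a core missing from the current config, which is the property the comment in the function relies on.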

View File

@@ -0,0 +1,99 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel core availability APIs
*/
#ifndef _KBASE_PM_CA_H_
#define _KBASE_PM_CA_H_
/**
* kbase_pm_ca_init - Initialize core availability framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Must be called before calling any other core availability function
*
* Return: 0 if the core availability framework was successfully initialized,
* -errno otherwise
*/
int kbase_pm_ca_init(struct kbase_device *kbdev);
/**
* kbase_pm_ca_term - Terminate core availability framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*/
void kbase_pm_ca_term(struct kbase_device *kbdev);
/**
* kbase_pm_ca_get_core_mask - Get currently available shaders core mask
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Returns a mask of the currently available shader cores.
* Calls into the core availability policy
*
* Return: The bit mask of available cores
*/
u64 kbase_pm_ca_get_core_mask(struct kbase_device *kbdev);
/**
* kbase_pm_ca_get_debug_core_mask - Get debug core mask.
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Returns a mask of the currently selected shader cores.
*
* Return: The bit mask of user-selected cores
*/
u64 kbase_pm_ca_get_debug_core_mask(struct kbase_device *kbdev);
/**
* kbase_pm_ca_update_core_status - Update core status
*
* @kbdev: The kbase device structure for the device (must be
* a valid pointer)
* @cores_ready: The bit mask of cores ready for job submission
* @cores_transitioning: The bit mask of cores that are transitioning power
* state
*
* Update core availability policy with current core power status
*
* Calls into the core availability policy
*/
void kbase_pm_ca_update_core_status(struct kbase_device *kbdev, u64 cores_ready,
u64 cores_transitioning);
/**
* kbase_pm_ca_get_instr_core_mask - Get the PM state sync-ed shaders core mask
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Returns a mask of the PM state synchronised shader cores for arranging
* HW performance counter dumps
*
* Return: The bit mask of PM state synchronised cores
*/
u64 kbase_pm_ca_get_instr_core_mask(struct kbase_device *kbdev);
#endif /* _KBASE_PM_CA_H_ */

View File

@@ -0,0 +1,58 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2017-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* A core availability policy for use with devfreq, where core masks are
* associated with OPPs.
*/
#ifndef MALI_KBASE_PM_CA_DEVFREQ_H
#define MALI_KBASE_PM_CA_DEVFREQ_H
/**
* struct kbasep_pm_ca_policy_devfreq - Private structure for devfreq ca policy
*
* @cores_desired: Cores that the policy wants to be available
* @cores_enabled: Cores that the policy is currently returning as available
* @cores_used: Cores currently powered or transitioning
*
* This contains data that is private to the devfreq core availability
* policy.
*/
struct kbasep_pm_ca_policy_devfreq {
u64 cores_desired;
u64 cores_enabled;
u64 cores_used;
};
extern const struct kbase_pm_ca_policy kbase_pm_ca_devfreq_policy_ops;
/**
* kbase_devfreq_set_core_mask - Set core mask for policy to use
* @kbdev: Device pointer
* @core_mask: New core mask
*
* The new core mask will have immediate effect if the GPU is powered, or will
* take effect when it is next powered on.
*/
void kbase_devfreq_set_core_mask(struct kbase_device *kbdev, u64 core_mask);
#endif /* MALI_KBASE_PM_CA_DEVFREQ_H */
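
The backend implementation of kbase_devfreq_set_core_mask() makes two decisions from plain bit arithmetic: it rejects a mask that shares no core with the debug core mask, and on CSF GPUs it waits for cores to power down only when the new mask drops cores from the old one. A minimal sketch of those two predicates follows; the helper names are hypothetical and the locking, logging and state-machine update of the real function are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: an OPP core mask is usable only if it shares at
 * least one core with the debug core mask.
 */
static bool demo_opp_mask_is_usable(uint64_t core_mask, uint64_t debug_core_mask)
{
	return (core_mask & debug_core_mask) != 0;
}

/* Hypothetical helper: true if the new mask drops any core that was set in
 * the old mask, meaning some cores have to be powered down.
 */
static bool demo_mask_is_down_scale(uint64_t old_core_mask, uint64_t new_core_mask)
{
	return (new_core_mask & old_core_mask) != old_core_mask;
}
```

Going from 0xF to 0x3 is a down-scale, while going from 0x3 to 0xF or keeping 0x3 is not, so only the first case would need to wait for the undesired cores to power down.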

View File

@@ -0,0 +1,67 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2012-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Coarse Demand" power management policy
*/
#include <mali_kbase.h>
#include <mali_kbase_pm.h>
static bool coarse_demand_shaders_needed(struct kbase_device *kbdev)
{
return kbase_pm_is_active(kbdev);
}
static bool coarse_demand_get_core_active(struct kbase_device *kbdev)
{
return kbase_pm_is_active(kbdev);
}
static void coarse_demand_init(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
static void coarse_demand_term(struct kbase_device *kbdev)
{
CSTD_UNUSED(kbdev);
}
/* The struct kbase_pm_policy structure for the coarse demand power policy.
*
* This is the static structure that defines the coarse demand power policy's
* callbacks and name.
*/
const struct kbase_pm_policy kbase_pm_coarse_demand_policy_ops = {
"coarse_demand", /* name */
coarse_demand_init, /* init */
coarse_demand_term, /* term */
coarse_demand_shaders_needed, /* shaders_needed */
coarse_demand_get_core_active, /* get_core_active */
NULL, /* handle_event */
KBASE_PM_POLICY_ID_COARSE_DEMAND, /* id */
#if MALI_USE_CSF
COARSE_ON_DEMAND_PM_SCHED_FLAGS, /* pm_sched_flags */
#endif
};
KBASE_EXPORT_TEST_API(kbase_pm_coarse_demand_policy_ops);
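
The policy table above is a positional initializer whose field order is documented only by the trailing comments. As an illustration of the same callback-table pattern, here is a cut-down sketch using designated initializers. Everything in it is hypothetical: `struct demo_device`, `struct demo_pm_policy` and `active_count` are simplified stand-ins, and `demo_is_active()` only approximates what kbase_pm_is_active() might check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down device: only tracks whether the GPU is in use. */
struct demo_device {
	int active_count;
};

/* Hypothetical, cut-down policy table modelled on three of the fields named
 * in the initializer comments (name, shaders_needed, get_core_active).
 */
struct demo_pm_policy {
	const char *name;
	bool (*shaders_needed)(struct demo_device *dev);
	bool (*get_core_active)(struct demo_device *dev);
};

static bool demo_is_active(struct demo_device *dev)
{
	return dev->active_count > 0;
}

/* Coarse demand: both the shaders and the GPU follow the active state. */
static const struct demo_pm_policy demo_coarse_demand_ops = {
	.name = "coarse_demand",
	.shaders_needed = demo_is_active,
	.get_core_active = demo_is_active,
};
```

Designated initializers keep the table correct if fields are reordered or added under a config option, at the cost of diverging from the positional style the vendor sources use.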

View File

@@ -0,0 +1,64 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2012-2015, 2018, 2020-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* "Coarse Demand" power management policy
*/
#ifndef MALI_KBASE_PM_COARSE_DEMAND_H
#define MALI_KBASE_PM_COARSE_DEMAND_H
/**
* DOC:
* The "Coarse" demand power management policy has the following
* characteristics:
* - When KBase indicates that the GPU will be powered up, but we don't yet
* know which Job Chains are to be run:
* - Shader Cores are powered up, regardless of whether or not they will be
* needed later.
* - When KBase indicates that Shader Cores are needed to submit the currently
* queued Job Chains:
* - Shader Cores are kept powered, regardless of whether or not they will
* be needed
* - When KBase indicates that the GPU need not be powered:
* - The Shader Cores are powered off, and the GPU itself is powered off too.
*
* @note:
* - KBase indicates the GPU will be powered up when it has a User Process that
* has just started to submit Job Chains.
* - KBase indicates the GPU need not be powered when all the Job Chains from
* User Processes have finished, and it is waiting for a User Process to
* submit some more Job Chains.
*/
/**
* struct kbasep_pm_policy_coarse_demand - Private structure for coarse demand
* policy
* @dummy: Dummy member - no state needed
* This contains data that is private to the coarse demand power policy.
*/
struct kbasep_pm_policy_coarse_demand {
int dummy;
};
extern const struct kbase_pm_policy kbase_pm_coarse_demand_policy_ops;
#endif /* MALI_KBASE_PM_COARSE_DEMAND_H */

View File

@@ -0,0 +1,727 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager definitions
*/
#ifndef _KBASE_PM_HWACCESS_DEFS_H_
#define _KBASE_PM_HWACCESS_DEFS_H_
#include "mali_kbase_pm_always_on.h"
#include "mali_kbase_pm_coarse_demand.h"
#include <hw_access/mali_kbase_hw_access_regmap.h>
#if defined(CONFIG_PM_RUNTIME) || defined(CONFIG_PM)
#define KBASE_PM_RUNTIME 1
#endif
/* Forward declarations - see mali_kbase.h */
struct kbase_device;
struct kbase_jd_atom;
/**
* enum kbase_pm_core_type - The types of core in a GPU.
*
* @KBASE_PM_CORE_L2: The L2 cache
* @KBASE_PM_CORE_SHADER: Shader cores
* @KBASE_PM_CORE_TILER: Tiler cores
* @KBASE_PM_CORE_STACK: Core stacks
*
* These enumerated values are used in calls to
* - kbase_pm_get_present_cores()
* - kbase_pm_get_active_cores()
* - kbase_pm_get_trans_cores()
* - kbase_pm_get_ready_cores()
* - kbase_pm_get_state()
* - core_type_to_reg()
* - pwr_cmd_constructor()
* - valid_to_power_up()
* - valid_to_power_down()
* - kbase_pm_invoke()
*
* They specify which type of core should be acted on.
*/
#if MALI_USE_CSF
enum kbase_pm_core_type {
KBASE_PM_CORE_L2 = HOST_POWER_ENUM(L2_PRESENT),
KBASE_PM_CORE_SHADER = HOST_POWER_ENUM(SHADER_PRESENT),
KBASE_PM_CORE_TILER = HOST_POWER_ENUM(TILER_PRESENT),
KBASE_PM_CORE_STACK = HOST_POWER_ENUM(STACK_PRESENT),
/**
* @KBASE_PM_CORE_NEURAL: Neural engine
*/
KBASE_PM_CORE_NEURAL = HOST_POWER_ENUM(NEURAL_PRESENT),
/**
* @KBASE_PM_CORE_BASE: Shader core base domain
*/
KBASE_PM_CORE_BASE = HOST_POWER_ENUM(BASE_PRESENT)
};
#else
enum kbase_pm_core_type {
KBASE_PM_CORE_L2 = GPU_CONTROL_ENUM(L2_PRESENT),
KBASE_PM_CORE_SHADER = GPU_CONTROL_ENUM(SHADER_PRESENT),
KBASE_PM_CORE_TILER = GPU_CONTROL_ENUM(TILER_PRESENT),
KBASE_PM_CORE_STACK = GPU_CONTROL_ENUM(STACK_PRESENT)
};
#endif
/*
* enum kbase_l2_core_state - The states used for the L2 cache & tiler power
* state machine.
*/
enum kbase_l2_core_state {
#define KBASEP_L2_STATE(n) KBASE_L2_##n,
#include "mali_kbase_pm_l2_states.h"
#undef KBASEP_L2_STATE
};
#if MALI_USE_CSF
/*
* enum kbase_mcu_state - The states used for the MCU state machine.
*/
enum kbase_mcu_state {
#define KBASEP_MCU_STATE(n) KBASE_MCU_##n,
#include "mali_kbase_pm_mcu_states.h"
#undef KBASEP_MCU_STATE
};
#endif
/*
* enum kbase_shader_core_state - The states used for the shaders' state machine.
*/
enum kbase_shader_core_state {
#define KBASEP_SHADER_STATE(n) KBASE_SHADERS_##n,
#include "mali_kbase_pm_shader_states.h"
#undef KBASEP_SHADER_STATE
};
/**
* enum kbase_pm_runtime_suspend_abort_reason - Reason why runtime suspend was aborted
* after the wake up of MCU.
*
* @ABORT_REASON_NONE: Not aborted
* @ABORT_REASON_DB_MIRROR_IRQ: Runtime suspend was aborted due to DB_MIRROR irq.
* @ABORT_REASON_NON_IDLE_CGS: Runtime suspend was aborted as CSGs were detected as non-idle after
* their suspension.
*/
enum kbase_pm_runtime_suspend_abort_reason {
ABORT_REASON_NONE,
ABORT_REASON_DB_MIRROR_IRQ,
ABORT_REASON_NON_IDLE_CGS
};
/* The following indices point to the corresponding bits stored in
* &kbase_pm_backend_data.gpu_sleep_allowed. They denote the conditions that
* would be checked against to determine the level of support for GPU sleep
* and firmware sleep-on-idle.
*/
#define KBASE_GPU_SUPPORTS_GPU_SLEEP ((uint8_t)0)
#define KBASE_GPU_SUPPORTS_FW_SLEEP_ON_IDLE ((uint8_t)1)
#define KBASE_GPU_PERF_COUNTERS_COLLECTION_ENABLED ((uint8_t)2)
#define KBASE_GPU_IGNORE_IDLE_EVENT ((uint8_t)3)
#define KBASE_GPU_NON_IDLE_OFF_SLOT_GROUPS_AVAILABLE ((uint8_t)4)
/* FW sleep-on-idle could be enabled if
* &kbase_pm_backend_data.gpu_sleep_allowed is equal to this value.
*/
#define KBASE_GPU_FW_SLEEP_ON_IDLE_ALLOWED \
((uint8_t)((1 << KBASE_GPU_SUPPORTS_GPU_SLEEP) | \
(1 << KBASE_GPU_SUPPORTS_FW_SLEEP_ON_IDLE) | \
(0 << KBASE_GPU_PERF_COUNTERS_COLLECTION_ENABLED) | \
(0 << KBASE_GPU_IGNORE_IDLE_EVENT) | \
(0 << KBASE_GPU_NON_IDLE_OFF_SLOT_GROUPS_AVAILABLE)))
/**
* struct kbasep_pm_metrics - Metrics data collected for use by the power
* management framework.
*
* @time_busy: the amount of time the GPU was busy executing jobs since the
* @time_period_start timestamp, in units of 256ns. This also includes
* time_in_protm, the time spent in protected mode, since it's assumed
* the GPU was busy 100% during this period.
* @time_idle: the amount of time the GPU was not executing jobs since the
* time_period_start timestamp, measured in units of 256ns.
* @time_in_protm: The amount of time the GPU has spent in protected mode since
* the time_period_start timestamp, measured in units of 256ns.
* @busy_cl: the amount of time the GPU was busy executing CL jobs. Note that
* if two CL jobs were active for 256ns, this value would be updated
* with 2 (2x256ns).
* @busy_gl: the amount of time the GPU was busy executing GL jobs. Note that
* if two GL jobs were active for 256ns, this value would be updated
* with 2 (2x256ns).
*/
struct kbasep_pm_metrics {
u32 time_busy;
u32 time_idle;
#if MALI_USE_CSF
u32 time_in_protm;
#else
u32 busy_cl[2];
u32 busy_gl;
#endif
};
/**
* struct kbasep_pm_metrics_state - State required to collect the metrics in
* struct kbasep_pm_metrics
* @time_period_start: time at which busy/idle measurements started
* @ipa_control_client: Handle returned on registering DVFS as a
* kbase_ipa_control client
* @skip_gpu_active_sanity_check: Decide whether to skip GPU_ACTIVE sanity
* check in DVFS utilisation calculation
* @gpu_active: true when the GPU is executing jobs. false when
* not. Updated when the job scheduler informs us a job is submitted
* to or removed from a GPU slot.
* @active_cl_ctx: number of CL jobs active on the GPU. Array is per-device.
* @active_gl_ctx: number of GL jobs active on the GPU. Array is per-slot.
* @lock: spinlock protecting the kbasep_pm_metrics_state structure
* @platform_data: pointer to data controlled by platform specific code
* @kbdev: pointer to kbase device for which metrics are collected
* @values: The current values of the power management metrics. The
* kbase_pm_get_dvfs_metrics() function is used to compare these
* current values with the saved values from a previous invocation.
* @initialized: tracks whether metrics_state has been initialized or not.
* @timer: timer to regularly make DVFS decisions based on the power
* management metrics.
* @timer_state: atomic indicating current @timer state, on, off, or stopped.
* @dvfs_last: values of the PM metrics from the last DVFS tick
* @dvfs_diff: difference between the current and previous PM metrics.
*/
struct kbasep_pm_metrics_state {
ktime_t time_period_start;
#if MALI_USE_CSF
void *ipa_control_client;
bool skip_gpu_active_sanity_check;
#else
bool gpu_active;
u32 active_cl_ctx[2];
u32 active_gl_ctx[3];
#endif
spinlock_t lock;
void *platform_data;
struct kbase_device *kbdev;
struct kbasep_pm_metrics values;
#ifdef CONFIG_MALI_VALHALL_DVFS
bool initialized;
struct hrtimer timer;
atomic_t timer_state;
struct kbasep_pm_metrics dvfs_last;
struct kbasep_pm_metrics dvfs_diff;
#endif
};
/**
* struct kbasep_pm_tick_timer_state - State for the shader hysteresis timer
* @wq: Work queue to wait for the timer to be stopped
* @work: Work item which cancels the timer
* @timer: Timer for powering off the shader cores
* @configured_interval: Period of GPU poweroff timer
* @default_ticks: User-configured number of ticks to wait after the shader
* power down request is received before turning off the cores
* @configured_ticks: Power-policy configured number of ticks to wait after the
* shader power down request is received before turning off
* the cores. For simple power policies, this is equivalent
* to @default_ticks.
* @remaining_ticks: Number of remaining timer ticks until shaders are powered off
* @cancel_queued: True if the cancellation work item has been queued. This is
* required to ensure that it is not queued twice, e.g. after
* a reset, which could cause the timer to be incorrectly
* cancelled later by a delayed workitem.
* @needed: Whether the timer should restart itself
*/
struct kbasep_pm_tick_timer_state {
struct workqueue_struct *wq;
struct work_struct work;
struct hrtimer timer;
ktime_t configured_interval;
unsigned int default_ticks;
unsigned int configured_ticks;
unsigned int remaining_ticks;
bool cancel_queued;
bool needed;
};
union kbase_pm_policy_data {
struct kbasep_pm_policy_always_on always_on;
struct kbasep_pm_policy_coarse_demand coarse_demand;
};
/**
* struct kbase_pm_backend_data - Data stored per device for power management.
*
* @pm_current_policy: The policy that is currently actively controlling the
* power state.
* @pm_policy_data: Private data for current PM policy. This is automatically
* zeroed when a policy change occurs.
* @reset_done: Flag when a reset is complete
* @reset_done_wait: Wait queue to wait for changes to @reset_done
* @gpu_cycle_counter_requests: The reference count of active gpu cycle counter
* users
* @gpu_cycle_counter_requests_lock: Lock to protect @gpu_cycle_counter_requests
* @gpu_in_desired_state_wait: Wait queue set when the GPU is in the desired
* state according to the L2 and shader power state
* machines
* @gpu_ready: Indicates whether the GPU is in a state in which it is
* safe to perform PM changes. When false, the PM state
* machine needs to wait before making changes to the GPU
* power policy, DevFreq or core_mask, so as to avoid these
* changing while implicit GPU resets are ongoing.
* @pm_shaders_core_mask: Shader PM state synchronised shaders core mask. It
* holds the cores enabled in a hardware counters dump,
* and may differ from @shaders_avail when under different
* states and transitions.
* @cg1_disabled: Set if the policy wants to keep the second core group
* powered off
* @metrics: Structure to hold metrics for the GPU
* @shader_tick_timer: Structure to hold the shader poweroff tick timer state
* @poweroff_wait_in_progress: true if a wait for GPU power off is in progress.
* hwaccess_lock must be held when accessing
* @invoke_poweroff_wait_wq_when_l2_off: flag indicating that the L2 power state
* machine should invoke the poweroff
* worker after the L2 has turned off.
* @poweron_required: true if a GPU power on is required. Should only be set
* when poweroff_wait_in_progress is true, and therefore the
* GPU can not immediately be powered on. pm.lock must be
* held when accessing
* @gpu_poweroff_wait_wq: workqueue for waiting for GPU to power off
* @gpu_poweroff_wait_work: work item for use with @gpu_poweroff_wait_wq
* @poweroff_wait: waitqueue for waiting for @gpu_poweroff_wait_work to complete
* @callback_power_on: Callback when the GPU needs to be turned on. See
* &struct kbase_pm_callback_conf
* @callback_power_off: Callback when the GPU may be turned off. See
* &struct kbase_pm_callback_conf
* @callback_power_suspend: Callback when a suspend occurs and the GPU needs to
* be turned off. See &struct kbase_pm_callback_conf
* @callback_power_resume: Callback when a resume occurs and the GPU needs to
* be turned on. See &struct kbase_pm_callback_conf
* @callback_power_runtime_on: Callback when the GPU needs to be turned on. See
* &struct kbase_pm_callback_conf
* @callback_power_runtime_off: Callback when the GPU may be turned off. See
* &struct kbase_pm_callback_conf
* @callback_power_runtime_idle: Optional callback invoked by runtime PM core
* when the GPU may be idle. See
* &struct kbase_pm_callback_conf
* @callback_soft_reset: Optional callback to software reset the GPU. See
* &struct kbase_pm_callback_conf
* @callback_power_runtime_gpu_idle: Callback invoked by Kbase when GPU has
* become idle.
* See &struct kbase_pm_callback_conf.
* @callback_power_runtime_gpu_active: Callback when GPU has become active and
* @callback_power_runtime_gpu_idle was
* called previously.
* See &struct kbase_pm_callback_conf.
* @ca_cores_enabled: Cores that are currently available
* @apply_hw_issue_TITANHW_2938_wa: Indicates if the workaround for KBASE_HW_ISSUE_TITANHW_2938
* needs to be applied when unmapping memory from GPU.
* @mcu_state: The current state of the micro-control unit, only applicable
* to GPUs that have such a component
* @l2_state: The current state of the L2 cache state machine. See
* &enum kbase_l2_core_state
* @l2_desired: True if the L2 cache should be powered on by the L2 cache state
* machine
* @l2_always_on: If true, disable powering down of l2 cache.
* @shaders_state: The current state of the shader state machine.
* @shaders_avail: This is updated by the state machine when it is in a state
* where it can write to the SHADER_PWRON or PWROFF registers
* to have the same set of available cores as specified by
* @shaders_desired_mask. So would precisely indicate the cores
* that are currently available. This is internal to shader
* state machine of JM GPUs and should *not* be modified
* elsewhere.
* @shaders_desired_mask: This is updated by the state machine when it is in
* a state where it can handle changes to the core
* availability (either by DVFS or sysfs). This is
* internal to the shader state machine and should
* *not* be modified elsewhere.
* @shaders_desired: True if the PM active count or power policy requires the
* shader cores to be on. This is used as an input to the
* shader power state machine. The current state of the
* cores may be different, but there should be transitions in
* progress that will eventually achieve this state (assuming
* that the policy doesn't change its mind in the mean time).
* @mcu_desired: True if the micro-control unit should be powered on by the MCU state
* machine.
* @policy_change_clamp_state_to_off: Signals that the backend is in a PM policy
* change transition and needs the MCU/L2 to be brought back to the
* off state and to remain in that state until the flag is cleared.
* @waiting_for_mmu_fault_handling: Flag set just before the wait for pending MMU faults
* is done inside @gpu_poweroff_wait_wq.
* @csf_pm_sched_flags: CSF Dynamic PM control flags in accordance with the
* current active PM policy. This field is updated whenever a
* new policy is activated.
* @policy_change_lock: Used to serialize the policy change calls. In CSF case,
* the change of policy may involve the scheduler to
* suspend running CSGs and then reconfigure the MCU.
* @core_idle_wq: Workqueue for executing the @core_idle_work.
* @core_idle_work: Work item used to wait for undesired cores to become inactive.
* The work item is enqueued when Host controls the power for
* shader cores and down scaling of cores is performed.
* @gpu_sleep_allowed: Bitmask to indicate the conditions that would be
* used to determine what support for GPU sleep is
* available.
* @gpu_sleep_mode_active: Flag to indicate that the GPU needs to be in sleep
* mode. It is set when the GPU idle notification is
* received and is cleared when HW state has been
* saved in the runtime suspend callback function or
* when the GPU power down is aborted if GPU became
* active whilst it was in sleep mode. The flag is
* guarded with hwaccess_lock spinlock.
* @exit_gpu_sleep_mode: Flag to indicate the GPU can now exit the sleep
* mode due to the submission of work from Userspace.
* The flag is guarded with hwaccess_lock spinlock.
* The @gpu_sleep_mode_active flag is not immediately
* reset when this flag is set. This ensures that the
* MCU doesn't get disabled undesirably without the
* suspend of CSGs, which could happen when
* scheduler_pm_active() and scheduler_pm_idle() get
* called before the Scheduler gets reactivated.
* @gpu_idled: Flag to ensure that the gpu_idle & gpu_active callbacks are
* always called in pair. The flag is guarded with pm.lock mutex.
* @gpu_wakeup_override: Flag to force the power up of L2 cache & reactivation
* of MCU. This is set during the runtime suspend
* callback function, when GPU needs to exit the sleep
* mode to save the HW state before power down.
* @db_mirror_interrupt_enabled: Flag tracking if the Doorbell mirror interrupt
* is enabled or not.
* @runtime_suspend_abort_reason: Tracks if the runtime suspend was aborted,
* after the wake up of MCU, due to the DB_MIRROR irq
* or non-idle CSGs. Tracking is done to avoid
* redundant transition of MCU to sleep state after the
* abort of runtime suspend and before the resumption
* of scheduling.
* @l2_force_off_after_mcu_halt: Flag to indicate that L2 cache power down is
* mandatory after performing the MCU halt. Flag is set
* immediately after the MCU halt and cleared
* after the L2 cache power down. MCU can't be
* re-enabled whilst the flag is set.
* @in_reset: True if a GPU is resetting and normal power manager operation is
* suspended
* @partial_shaderoff: True if only some of the shader cores are to be powered
* off. Such a partial power off needs special handling,
* such as flushing the L2 cache because of
* GPU2017-861
* @protected_entry_transition_override : True if GPU reset is being used
* before entering the protected mode and so
* the reset handling behaviour is being
* overridden.
* @protected_transition_override : True if a protected mode transition is in
* progress and is overriding power manager
* behaviour.
* @protected_l2_override : Non-zero if the L2 cache is required during a
* protected mode transition. Has no effect if not
* transitioning.
* @hwcnt_desired: True if we want GPU hardware counters to be enabled.
* @hwcnt_disabled: True if GPU hardware counters are not enabled.
* @hwcnt_disable_work: Work item to disable GPU hardware counters, used if
* atomic disable is not possible.
* @gpu_clock_suspend_freq: 'opp-mali-errata-1485982' clock in opp table
* for safe L2 power cycle.
* If no opp-mali-errata-1485982 specified,
* the slowest clock will be taken.
* @gpu_clock_slow_down_wa: If true, slow down GPU clock during L2 power cycle.
* @gpu_clock_slow_down_desired: True if we want a lower GPU clock
* for a safe L2 power cycle. False if we want the GPU clock
* restored to its normal frequency. This is updated only
* in the L2 state machine, kbase_pm_l2_update_state.
* @gpu_clock_slowed_down: During L2 power cycle,
* True if gpu clock is set at lower frequency
* for safe L2 power down, False if gpu clock gets
* restored to previous speed. This is updated only in
* work function, kbase_pm_gpu_clock_control_worker.
* @gpu_clock_control_work: work item to set GPU clock during L2 power cycle
* using gpu_clock_control
* @reset_in_progress: Set if reset is ongoing, otherwise set to 0
*
* This structure contains data for the power management framework. There is one
* instance of this structure per device in the system.
*
* Note:
* During an IRQ, @pm_current_policy can be NULL when the policy is being
* changed with kbase_pm_set_policy(). The change is protected under
* kbase_device.pm.power_change_lock. Direct access to this from IRQ context
* must therefore check for NULL. If NULL, then kbase_pm_set_policy() will
* re-issue the policy functions that would have been done under IRQ.
*/
struct kbase_pm_backend_data {
const struct kbase_pm_policy *pm_current_policy;
union kbase_pm_policy_data pm_policy_data;
bool reset_done;
wait_queue_head_t reset_done_wait;
int gpu_cycle_counter_requests;
spinlock_t gpu_cycle_counter_requests_lock;
wait_queue_head_t gpu_in_desired_state_wait;
bool gpu_ready;
u64 pm_shaders_core_mask;
bool cg1_disabled;
struct kbasep_pm_metrics_state metrics;
struct kbasep_pm_tick_timer_state shader_tick_timer;
bool poweroff_wait_in_progress;
bool invoke_poweroff_wait_wq_when_l2_off;
bool poweron_required;
struct workqueue_struct *gpu_poweroff_wait_wq;
struct work_struct gpu_poweroff_wait_work;
wait_queue_head_t poweroff_wait;
int (*callback_power_on)(struct kbase_device *kbdev);
void (*callback_power_off)(struct kbase_device *kbdev);
void (*callback_power_suspend)(struct kbase_device *kbdev);
void (*callback_power_resume)(struct kbase_device *kbdev);
int (*callback_power_runtime_on)(struct kbase_device *kbdev);
void (*callback_power_runtime_off)(struct kbase_device *kbdev);
int (*callback_power_runtime_idle)(struct kbase_device *kbdev);
int (*callback_soft_reset)(struct kbase_device *kbdev);
void (*callback_power_runtime_gpu_idle)(struct kbase_device *kbdev);
void (*callback_power_runtime_gpu_active)(struct kbase_device *kbdev);
u64 ca_cores_enabled;
#if MALI_USE_CSF
bool apply_hw_issue_TITANHW_2938_wa;
enum kbase_mcu_state mcu_state;
#endif
enum kbase_l2_core_state l2_state;
enum kbase_shader_core_state shaders_state;
u64 shaders_avail;
u64 shaders_desired_mask;
#if MALI_USE_CSF
bool mcu_desired;
bool policy_change_clamp_state_to_off;
bool waiting_for_mmu_fault_handling;
unsigned int csf_pm_sched_flags;
struct mutex policy_change_lock;
struct workqueue_struct *core_idle_wq;
struct work_struct core_idle_work;
#ifdef KBASE_PM_RUNTIME
unsigned long gpu_sleep_allowed;
bool gpu_sleep_mode_active;
bool exit_gpu_sleep_mode;
bool gpu_idled;
bool gpu_wakeup_override;
bool db_mirror_interrupt_enabled;
enum kbase_pm_runtime_suspend_abort_reason runtime_suspend_abort_reason;
#endif
/**
* @has_host_pwr_iface: GPU supports the host power control interface.
*/
bool has_host_pwr_iface;
/**
* @pwr_cntl_delegated: Flag indicating that control for PM domains (Tiler,
* Shading engine and Neural engine) has been delegated
* to the MCU firmware.
*/
bool pwr_cntl_delegated;
bool l2_force_off_after_mcu_halt;
#endif
bool l2_desired;
bool l2_always_on;
bool shaders_desired;
bool in_reset;
#if !MALI_USE_CSF
bool partial_shaderoff;
bool protected_entry_transition_override;
bool protected_transition_override;
int protected_l2_override;
#endif
bool hwcnt_desired;
bool hwcnt_disabled;
struct work_struct hwcnt_disable_work;
u64 gpu_clock_suspend_freq;
bool gpu_clock_slow_down_wa;
bool gpu_clock_slow_down_desired;
bool gpu_clock_slowed_down;
struct work_struct gpu_clock_control_work;
atomic_t reset_in_progress;
};
#if MALI_USE_CSF
/* CSF PM flag, signaling that the MCU shader Core should be kept on */
#define CSF_DYNAMIC_PM_CORE_KEEP_ON (1 << 0)
/* CSF PM flag, signaling no scheduler suspension on idle groups */
#define CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE (1 << 1)
/* CSF PM flag, signaling no scheduler suspension on no runnable groups */
#define CSF_DYNAMIC_PM_SCHED_NO_SUSPEND (1 << 2)
/* The following flags correspond to the existing defined PM policies */
#define ALWAYS_ON_PM_SCHED_FLAGS \
(CSF_DYNAMIC_PM_CORE_KEEP_ON | CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE | \
CSF_DYNAMIC_PM_SCHED_NO_SUSPEND)
#define COARSE_ON_DEMAND_PM_SCHED_FLAGS (0)
#if !MALI_CUSTOMER_RELEASE
#define ALWAYS_ON_DEMAND_PM_SCHED_FLAGS (CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE)
#endif
#endif
/* List of policy IDs */
enum kbase_pm_policy_id {
KBASE_PM_POLICY_ID_COARSE_DEMAND,
#if !MALI_CUSTOMER_RELEASE
KBASE_PM_POLICY_ID_ALWAYS_ON_DEMAND,
#endif
KBASE_PM_POLICY_ID_ALWAYS_ON
};
/**
* enum kbase_pm_policy_event - PM Policy event ID
*/
enum kbase_pm_policy_event {
/**
* @KBASE_PM_POLICY_EVENT_IDLE: Indicates that the GPU power state
* model has determined that the GPU has gone idle.
*/
KBASE_PM_POLICY_EVENT_IDLE,
/**
* @KBASE_PM_POLICY_EVENT_POWER_ON: Indicates that the GPU state model
* is preparing to power on the GPU.
*/
KBASE_PM_POLICY_EVENT_POWER_ON,
/**
* @KBASE_PM_POLICY_EVENT_TIMER_HIT: Indicates that the GPU became
* active while the Shader Tick Timer was holding the GPU in a powered
* on state.
*/
KBASE_PM_POLICY_EVENT_TIMER_HIT,
/**
* @KBASE_PM_POLICY_EVENT_TIMER_MISS: Indicates that the GPU did not
* become active before the Shader Tick Timer timeout occurred.
*/
KBASE_PM_POLICY_EVENT_TIMER_MISS
};
/**
* struct kbase_pm_policy - Power policy structure.
*
* @name: The name of this policy
* @init: Function called when the policy is selected
* @term: Function called when the policy is unselected
* @shaders_needed: Function called to find out if shader cores are needed
* @get_core_active: Function called to get the current overall GPU power
* state
* @handle_event: Function called when a PM policy event occurs. Should be
* set to NULL if the power policy doesn't require any
* event notifications.
* @id: Field indicating an ID for this policy. This is not
* necessarily the same as its index in the list returned
* by kbase_pm_list_policies().
* It is used purely for debugging.
* @pm_sched_flags: Policy associated with CSF PM scheduling operational flags.
* Pre-defined required flags exist for each of the
* ARM released policies, such as 'always_on', 'coarse_demand',
* etc.
*
* Each power policy exposes a (static) instance of this structure which
* contains function pointers to the policy's methods.
*/
struct kbase_pm_policy {
char *name;
/*
* Function called when the policy is selected
*
* This should initialize the kbdev->pm.pm_policy_data structure. It
* should not attempt to make any changes to hardware state.
*
* It is undefined what state the cores are in when the function is
* called.
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*/
void (*init)(struct kbase_device *kbdev);
/*
* Function called when the policy is unselected.
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*/
void (*term)(struct kbase_device *kbdev);
/*
* Function called to find out if shader cores are needed
*
* This needs to at least satisfy kbdev->pm.backend.shaders_desired,
* and so must never return false when shaders_desired is true.
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*
* Return: true if shader cores are needed, false otherwise
*/
bool (*shaders_needed)(struct kbase_device *kbdev);
/*
* Function called to get the current overall GPU power state
*
* This function must meet or exceed the requirements for power
* indicated by kbase_pm_is_active().
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
*
* Return: true if the GPU should be powered, false otherwise
*/
bool (*get_core_active)(struct kbase_device *kbdev);
/*
* Function called when a power event occurs
*
* @kbdev: The kbase device structure for the device (must be a
* valid pointer)
* @event: The id of the power event that has occurred
*/
void (*handle_event)(struct kbase_device *kbdev, enum kbase_pm_policy_event event);
enum kbase_pm_policy_id id;
#if MALI_USE_CSF
/* Policy associated with CSF PM scheduling operational flags.
* Pre-defined required flags exist for each of the
* ARM released policies, such as 'always_on', 'coarse_demand',
* etc.
*/
unsigned int pm_sched_flags;
#endif
};
#endif /* _KBASE_PM_HWACCESS_DEFS_H_ */
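As an illustration of the policy interface defined in this header, here is a minimal user-space sketch. It is not driver code: `struct demo_device`, `struct demo_pm_policy`, the two `demo_*` policy instances and `demo_policy_notify()` are hypothetical stand-ins invented for this example. Only the callback contract is taken from the header comments: `shaders_needed` must never return false while `shaders_desired` is true, and `handle_event` may be NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kbase_device; only what the sketch needs. */
struct demo_device {
	bool shaders_desired; /* plays the role of pm.backend.shaders_desired */
	int active_count;     /* plays the role of the state behind kbase_pm_is_active() */
};

enum demo_pm_policy_event { DEMO_EVENT_IDLE, DEMO_EVENT_POWER_ON };

/* Reduced version of the policy interface: a name plus a table of callbacks. */
struct demo_pm_policy {
	const char *name;
	void (*init)(struct demo_device *dev);
	void (*term)(struct demo_device *dev);
	bool (*shaders_needed)(struct demo_device *dev);
	bool (*get_core_active)(struct demo_device *dev);
	/* NULL when the policy does not need event notifications. */
	void (*handle_event)(struct demo_device *dev, enum demo_pm_policy_event event);
};

static void demo_nop(struct demo_device *dev)
{
	(void)dev;
}

/* Demand-style callbacks: report exactly what the rest of the driver asked for. */
static bool demo_demand_shaders_needed(struct demo_device *dev)
{
	return dev->shaders_desired;
}

static bool demo_demand_core_active(struct demo_device *dev)
{
	return dev->active_count > 0;
}

/* Always-on style callback: keep everything powered regardless of demand. */
static bool demo_always_true(struct demo_device *dev)
{
	(void)dev;
	return true;
}

static const struct demo_pm_policy demo_demand_policy = {
	.name = "demo_demand",
	.init = demo_nop,
	.term = demo_nop,
	.shaders_needed = demo_demand_shaders_needed,
	.get_core_active = demo_demand_core_active,
	.handle_event = NULL,
};

static const struct demo_pm_policy demo_always_on_policy = {
	.name = "demo_always_on",
	.init = demo_nop,
	.term = demo_nop,
	.shaders_needed = demo_always_true,
	.get_core_active = demo_always_true,
	.handle_event = NULL,
};

/* Forward an event only when the policy registered a handler. */
static bool demo_policy_notify(const struct demo_pm_policy *policy, struct demo_device *dev,
			       enum demo_pm_policy_event event)
{
	assert(policy != NULL && dev != NULL);
	if (policy->handle_event == NULL)
		return false;
	policy->handle_event(dev, event);
	return true;
}
```

Both sketch policies satisfy the documented constraint on `shaders_needed`: the demand-style one returns `shaders_desired` unchanged, and the always-on one returns true unconditionally.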

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -0,0 +1,50 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager level 2 cache state definitions.
* The function-like macro KBASEP_L2_STATE() must be defined before including
* this header file. This header file can be included multiple times in the
* same compilation unit with different definitions of KBASEP_L2_STATE().
*
* @OFF: The L2 cache and tiler are off
* @PEND_ON: The L2 cache and tiler are powering on
* @RESTORE_CLOCKS: The GPU clock is restored. Conditionally used.
* @ON_HWCNT_ENABLE: The L2 cache and tiler are on, and hwcnt is being enabled
* @ON: The L2 cache and tiler are on, and hwcnt is enabled
* @ON_HWCNT_DISABLE: The L2 cache and tiler are on, and hwcnt is being disabled
* @SLOW_DOWN_CLOCKS: The GPU clock is set to appropriate or lowest clock.
* Conditionally used.
* @POWER_DOWN: The L2 cache and tiler are about to be powered off
* @PEND_OFF: The L2 cache and tiler are powering off
* @RESET_WAIT: The GPU is resetting, L2 cache and tiler power state are
* unknown
*/
KBASEP_L2_STATE(OFF)
KBASEP_L2_STATE(PEND_ON)
KBASEP_L2_STATE(RESTORE_CLOCKS)
KBASEP_L2_STATE(ON_HWCNT_ENABLE)
KBASEP_L2_STATE(ON)
KBASEP_L2_STATE(ON_HWCNT_DISABLE)
KBASEP_L2_STATE(SLOW_DOWN_CLOCKS)
KBASEP_L2_STATE(POWER_DOWN)
KBASEP_L2_STATE(PEND_OFF)
KBASEP_L2_STATE(RESET_WAIT)
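The header comment above says the file may be included several times with different definitions of `KBASEP_L2_STATE()`. One plausible use of that pattern is to generate an enum and a matching name table from the same list. The sketch below shows this under stated assumptions: it is a single-file stand-in, so the list lives in a hypothetical `DEMO_L2_STATE_LIST()` macro rather than in an included header, it carries only four of the states, and every `demo_`/`DEMO_` name is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the states header. In a multi-file build the entries would sit
 * in their own file and be pulled in with #include after defining the macro.
 */
#define DEMO_L2_STATE_LIST(X) \
	X(OFF)                \
	X(PEND_ON)            \
	X(ON)                 \
	X(PEND_OFF)

/* First expansion: one enumerator per state, in list order. */
enum demo_l2_state {
#define DEMO_AS_ENUM(n) DEMO_L2_##n,
	DEMO_L2_STATE_LIST(DEMO_AS_ENUM)
#undef DEMO_AS_ENUM
	DEMO_L2_STATE_COUNT
};

/* Second expansion: one string per state, in the same order as the enum. */
static const char *const demo_l2_state_names[] = {
#define DEMO_AS_STRING(n) #n,
	DEMO_L2_STATE_LIST(DEMO_AS_STRING)
#undef DEMO_AS_STRING
};

/* Map a state to its name; the index is valid because both tables share one list. */
static const char *demo_l2_state_name(enum demo_l2_state state)
{
	assert((size_t)state < sizeof(demo_l2_state_names) / sizeof(demo_l2_state_names[0]));
	return demo_l2_state_names[state];
}

/* Reverse lookup; returns DEMO_L2_STATE_COUNT when the name is unknown. */
static enum demo_l2_state demo_l2_state_from_name(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < (size_t)DEMO_L2_STATE_COUNT; i++)
		if (strcmp(demo_l2_state_names[i], name) == 0)
			return (enum demo_l2_state)i;
	return DEMO_L2_STATE_COUNT;
}
```

Because both expansions come from one list, adding a state in a single place keeps the enum and the debug strings in step.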

@@ -0,0 +1,113 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2020-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager MCU state definitions.
* The function-like macro KBASEP_MCU_STATE() must be defined before including
* this header file. This header file can be included multiple times in the
* same compilation unit with different definitions of KBASEP_MCU_STATE().
*
* @OFF: The MCU is powered off.
* @PEND_ON_RELOAD: The warm boot of MCU or cold boot of MCU (with
* firmware reloading) is in progress.
* @ON_GLB_REINIT_PEND: The MCU is enabled and Global configuration
* requests have been sent to the firmware.
* @ON_HWCNT_ENABLE: The Global requests have completed and MCU is now
* ready for use and hwcnt is being enabled.
* @ON: The MCU is active and hwcnt has been enabled.
* @ON_CORE_ATTR_UPDATE_PEND: The MCU is active and mask of enabled shader cores
* is being updated.
* @ON_HWCNT_DISABLE: The MCU is on and hwcnt is being disabled.
* @ON_HALT: The MCU is on and hwcnt has been disabled, MCU
* halt would be triggered.
* @ON_PEND_HALT: MCU halt in progress, confirmation pending.
* @POWER_DOWN: MCU halted operations, pending being disabled.
* @PEND_OFF: MCU is being disabled, pending on powering off.
* @RESET_WAIT: The GPU is resetting, MCU state is unknown.
* @HCTL_SHADERS_PEND_ON: Global configuration requests sent to the firmware
* have completed and shaders have been requested to
* power on.
* @HCTL_CORES_NOTIFY_PEND: Shader cores have powered up and firmware is being
* notified of the mask of enabled shader cores.
* @HCTL_MCU_ON_RECHECK: MCU is on and hwcnt disabling is triggered
* and checks are done to update the number of
* enabled cores.
* @HCTL_SHADERS_READY_OFF: MCU has halted and cores need to be powered down
* @HCTL_SHADERS_PEND_OFF: Cores are transitioning to power down.
* @HCTL_CORES_DOWN_SCALE_NOTIFY_PEND: Firmware has been informed to stop using
* specific cores, due to core_mask change request.
* After the ACK from FW, the wait will be done for
* undesired cores to become inactive.
* @HCTL_CORE_INACTIVE_PEND: Waiting for specific cores to become inactive.
* Once the cores become inactive their power down
* will be initiated.
* @HCTL_SHADERS_CORE_OFF_PEND: Waiting for specific cores to complete the
* transition to power down. Once powered down,
* HW counters will be re-enabled.
* @ON_SLEEP_INITIATE: MCU is on and hwcnt has been disabled and MCU
* is being put to sleep.
* @ON_PEND_SLEEP: MCU sleep is in progress.
* @IN_SLEEP: Sleep request is completed and MCU has halted.
* @ON_PMODE_ENTER_CORESIGHT_DISABLE: The MCU is on, protected mode enter is about to
* be requested, Coresight is being disabled.
* @ON_PMODE_EXIT_CORESIGHT_ENABLE: The MCU is on, protected mode exit has happened,
* Coresight is being enabled.
* @CORESIGHT_DISABLE: The MCU is on and Coresight is being disabled.
* @CORESIGHT_ENABLE: The MCU is on, host does not have control and
* Coresight is being enabled.
*/
KBASEP_MCU_STATE(OFF)
KBASEP_MCU_STATE(PEND_ON_RELOAD)
KBASEP_MCU_STATE(ON_GLB_REINIT_PEND)
KBASEP_MCU_STATE(ON_HWCNT_ENABLE)
KBASEP_MCU_STATE(ON)
KBASEP_MCU_STATE(ON_CORE_ATTR_UPDATE_PEND)
KBASEP_MCU_STATE(ON_HWCNT_DISABLE)
KBASEP_MCU_STATE(ON_HALT)
KBASEP_MCU_STATE(ON_PEND_HALT)
KBASEP_MCU_STATE(POWER_DOWN)
KBASEP_MCU_STATE(PEND_OFF)
KBASEP_MCU_STATE(RESET_WAIT)
/* Additional MCU states with HOST_CONTROL_SHADERS */
KBASEP_MCU_STATE(HCTL_STACK_PEND_ON)
KBASEP_MCU_STATE(HCTL_BASE_PEND_ON)
KBASEP_MCU_STATE(HCTL_SHADERS_PEND_ON)
KBASEP_MCU_STATE(HCTL_CORES_NOTIFY_PEND)
KBASEP_MCU_STATE(HCTL_MCU_ON_RECHECK)
KBASEP_MCU_STATE(HCTL_SHADERS_READY_OFF)
KBASEP_MCU_STATE(HCTL_SHADERS_PEND_OFF)
KBASEP_MCU_STATE(HCTL_BASE_PEND_OFF)
KBASEP_MCU_STATE(HCTL_STACK_PEND_OFF)
KBASEP_MCU_STATE(HCTL_CORES_DOWN_SCALE_NOTIFY_PEND)
KBASEP_MCU_STATE(HCTL_CORE_INACTIVE_PEND)
KBASEP_MCU_STATE(HCTL_SHADERS_CORE_OFF_PEND)
/* Additional MCU states to support GPU sleep feature */
KBASEP_MCU_STATE(ON_SLEEP_INITIATE)
KBASEP_MCU_STATE(ON_PEND_SLEEP)
KBASEP_MCU_STATE(ON_PEND_SOI_SLEEP)
KBASEP_MCU_STATE(IN_SLEEP)
#if IS_ENABLED(CONFIG_MALI_VALHALL_CORESIGHT)
/* Additional MCU states for Coresight */
KBASEP_MCU_STATE(ON_PMODE_ENTER_CORESIGHT_DISABLE)
KBASEP_MCU_STATE(ON_PMODE_EXIT_CORESIGHT_ENABLE)
KBASEP_MCU_STATE(CORESIGHT_DISABLE)
KBASEP_MCU_STATE(CORESIGHT_ENABLE)
#endif /* IS_ENABLED(CONFIG_MALI_VALHALL_CORESIGHT) */

@@ -0,0 +1,497 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2011-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Metrics for power management
*/
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include <mali_kbase_pm.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#if MALI_USE_CSF
#include "backend/gpu/mali_kbase_clk_rate_trace_mgr.h"
#include <csf/ipa_control/mali_kbase_csf_ipa_control.h>
#else
#include <backend/gpu/mali_kbase_jm_rb.h>
#endif /* !MALI_USE_CSF */
#include <backend/gpu/mali_kbase_pm_defs.h>
#include <mali_linux_trace.h>
#if defined(CONFIG_MALI_VALHALL_DEVFREQ) || defined(CONFIG_MALI_VALHALL_DVFS) || !MALI_USE_CSF
/* Shift used for kbasep_pm_metrics_data.time_busy/idle - units of (1 << 8) ns.
* This gives a maximum period between samples of 2^(32+8)/100 ns = slightly
* under 11s. Exceeding this will cause overflow.
*/
#define KBASE_PM_TIME_SHIFT 8
#endif
#if MALI_USE_CSF
/* Scaling factor to get the GPU_ACTIVE value in nanoseconds */
#define GPU_ACTIVE_SCALING_FACTOR ((u64)1E9)
#endif
/*
* Possible state transitions
* ON -> ON | OFF | STOPPED
* STOPPED -> ON | OFF
* OFF -> ON
*
*
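* +-e-++-------------f--------------+
* |   v|                            v
* +---ON ---a---> STOPPED ---b---> OFF
*     ^^             |              |
*     |+------c------+              |
*     |                             |
*     +--------------d--------------+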
*
* Transition effects:
* a. None
* b. Timer expires without restart
* c. Timer is not stopped, timer period is unaffected
* d. Timer must be restarted
* e. Callback is executed and the timer is restarted
* f. Timer is cancelled, or the callback is waited on if currently executing. This is called during
* tear-down and should not be subject to a race from an OFF->ON transition
*/
enum dvfs_metric_timer_state { TIMER_OFF, TIMER_STOPPED, TIMER_ON };
#ifdef CONFIG_MALI_VALHALL_DVFS
static enum hrtimer_restart dvfs_callback(struct hrtimer *timer)
{
struct kbasep_pm_metrics_state *metrics;
if (WARN_ON(!timer))
return HRTIMER_NORESTART;
metrics = container_of(timer, struct kbasep_pm_metrics_state, timer);
/* Transition (b) to fully off if timer was stopped, don't restart the timer in this case */
if (atomic_cmpxchg(&metrics->timer_state, TIMER_STOPPED, TIMER_OFF) != TIMER_ON)
return HRTIMER_NORESTART;
kbase_pm_get_dvfs_action(metrics->kbdev);
/* Set the new expiration time and restart (transition e) */
hrtimer_forward_now(timer, HR_TIMER_DELAY_MSEC(metrics->kbdev->pm.dvfs_period));
return HRTIMER_RESTART;
}
#endif /* CONFIG_MALI_VALHALL_DVFS */
int kbasep_pm_metrics_init(struct kbase_device *kbdev)
{
#if MALI_USE_CSF
struct kbase_ipa_control_perf_counter perf_counter;
int err;
/* One counter group */
const size_t NUM_PERF_COUNTERS = 1;
KBASE_DEBUG_ASSERT(kbdev != NULL);
kbdev->pm.backend.metrics.kbdev = kbdev;
kbdev->pm.backend.metrics.time_period_start = ktime_get_raw();
perf_counter.scaling_factor = GPU_ACTIVE_SCALING_FACTOR;
/* Normalize values by GPU frequency */
perf_counter.gpu_norm = true;
/* We need the GPU_ACTIVE counter, which is in the CSHW group */
perf_counter.type = KBASE_IPA_CORE_TYPE_CSHW;
/* We need the GPU_ACTIVE counter */
perf_counter.idx = GPU_ACTIVE_CNT_IDX;
err = kbase_ipa_control_register(kbdev, &perf_counter, NUM_PERF_COUNTERS,
&kbdev->pm.backend.metrics.ipa_control_client);
if (err) {
dev_err(kbdev->dev, "Failed to register IPA with kbase_ipa_control: err=%d", err);
return -1;
}
#else
KBASE_DEBUG_ASSERT(kbdev != NULL);
kbdev->pm.backend.metrics.kbdev = kbdev;
kbdev->pm.backend.metrics.time_period_start = ktime_get_raw();
#endif
spin_lock_init(&kbdev->pm.backend.metrics.lock);
#ifdef CONFIG_MALI_VALHALL_DVFS
hrtimer_init(&kbdev->pm.backend.metrics.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
kbdev->pm.backend.metrics.timer.function = dvfs_callback;
kbdev->pm.backend.metrics.initialized = true;
atomic_set(&kbdev->pm.backend.metrics.timer_state, TIMER_OFF);
kbase_pm_metrics_start(kbdev);
#endif /* CONFIG_MALI_VALHALL_DVFS */
#if MALI_USE_CSF
/* The sanity check on the GPU_ACTIVE performance counter
* is skipped for Juno platforms that have timing problems.
*/
kbdev->pm.backend.metrics.skip_gpu_active_sanity_check =
(kbdev->gpu_props.impl_tech >= THREAD_FEATURES_IMPLEMENTATION_TECHNOLOGY_FPGA);
#endif
return 0;
}
KBASE_EXPORT_TEST_API(kbasep_pm_metrics_init);
void kbasep_pm_metrics_term(struct kbase_device *kbdev)
{
#ifdef CONFIG_MALI_VALHALL_DVFS
KBASE_DEBUG_ASSERT(kbdev != NULL);
/* Cancel the timer, and block if the callback is currently executing (transition f) */
kbdev->pm.backend.metrics.initialized = false;
atomic_set(&kbdev->pm.backend.metrics.timer_state, TIMER_OFF);
hrtimer_cancel(&kbdev->pm.backend.metrics.timer);
#endif /* CONFIG_MALI_VALHALL_DVFS */
#if MALI_USE_CSF
kbase_ipa_control_unregister(kbdev, kbdev->pm.backend.metrics.ipa_control_client);
#else
CSTD_UNUSED(kbdev);
#endif
}
KBASE_EXPORT_TEST_API(kbasep_pm_metrics_term);
/* caller needs to hold kbdev->pm.backend.metrics.lock before calling this
* function
*/
#if MALI_USE_CSF
#if defined(CONFIG_MALI_VALHALL_DEVFREQ) || defined(CONFIG_MALI_VALHALL_DVFS)
static void kbase_pm_get_dvfs_utilisation_calc(struct kbase_device *kbdev)
{
int err;
u64 gpu_active_counter;
u64 protected_time;
ktime_t now;
lockdep_assert_held(&kbdev->pm.backend.metrics.lock);
/* Query IPA_CONTROL for the latest GPU-active and protected-time
* info.
*/
err = kbase_ipa_control_query(kbdev, kbdev->pm.backend.metrics.ipa_control_client,
&gpu_active_counter, 1, &protected_time);
/* Read the timestamp after reading the GPU_ACTIVE counter value.
* This ensures the time gap between the 2 reads is consistent for
* a meaningful comparison between the increment of GPU_ACTIVE and
* elapsed time. The lock taken inside kbase_ipa_control_query()
* function can cause a lot of variation.
*/
now = ktime_get_raw();
if (err) {
dev_err(kbdev->dev, "Failed to query the increment of GPU_ACTIVE counter: err=%d",
err);
} else {
u64 diff_ns;
s64 diff_ns_signed;
u32 ns_time;
ktime_t diff = ktime_sub(now, kbdev->pm.backend.metrics.time_period_start);
diff_ns_signed = ktime_to_ns(diff);
if (diff_ns_signed < 0)
return;
diff_ns = (u64)diff_ns_signed;
#if !IS_ENABLED(CONFIG_MALI_VALHALL_NO_MALI)
/* The GPU_ACTIVE counter shouldn't clock-up more time than has
* actually elapsed - but still some margin needs to be given
* when doing the comparison. There could be some drift between
* the CPU and GPU clock.
*
* Can do the check only in a real driver build, as an arbitrary
* value for GPU_ACTIVE can be fed into dummy model in no_mali
* configuration which may not correspond to the real elapsed
* time.
*/
if (!kbdev->pm.backend.metrics.skip_gpu_active_sanity_check) {
/* The margin is scaled to allow for the worst-case
* scenario where the samples are maximally separated,
* plus a small offset for sampling errors.
*/
u64 const MARGIN_NS =
IPA_CONTROL_TIMER_DEFAULT_VALUE_MS * NSEC_PER_MSEC * 3 / 2;
if (gpu_active_counter > (diff_ns + MARGIN_NS)) {
dev_info(
kbdev->dev,
"GPU activity takes longer than time interval: %llu ns > %llu ns",
(unsigned long long)gpu_active_counter,
(unsigned long long)diff_ns);
}
}
#endif
/* Calculate time difference in units of 256ns */
ns_time = (u32)(diff_ns >> KBASE_PM_TIME_SHIFT);
/* Add protected_time to gpu_active_counter so that time in
* protected mode is included in the apparent GPU active time,
* then convert it from units of 1ns to units of 256ns, to
* match what JM GPUs use. The assumption is made here that the
* GPU is 100% busy while in protected mode, so we should add
* this since the GPU can't (and thus won't) update these
* counters while it's actually in protected mode.
*
* Perform the add after dividing each value down, to reduce
* the chances of overflows.
*/
protected_time >>= KBASE_PM_TIME_SHIFT;
gpu_active_counter >>= KBASE_PM_TIME_SHIFT;
gpu_active_counter += protected_time;
/* Ensure the following equations don't go wrong if ns_time is
* slightly larger than gpu_active_counter somehow
*/
gpu_active_counter = MIN(gpu_active_counter, ns_time);
kbdev->pm.backend.metrics.values.time_busy += gpu_active_counter;
kbdev->pm.backend.metrics.values.time_idle += ns_time - gpu_active_counter;
/* Also make time in protected mode available explicitly,
* so users of this data have this info, too.
*/
kbdev->pm.backend.metrics.values.time_in_protm += protected_time;
}
kbdev->pm.backend.metrics.time_period_start = now;
}
#endif /* defined(CONFIG_MALI_VALHALL_DEVFREQ) || defined(CONFIG_MALI_VALHALL_DVFS) */
#else
static void kbase_pm_get_dvfs_utilisation_calc(struct kbase_device *kbdev, ktime_t now)
{
ktime_t diff;
lockdep_assert_held(&kbdev->pm.backend.metrics.lock);
diff = ktime_sub(now, kbdev->pm.backend.metrics.time_period_start);
if (ktime_to_ns(diff) < 0)
return;
if (kbdev->pm.backend.metrics.gpu_active) {
u32 ns_time = (u32)(ktime_to_ns(diff) >> KBASE_PM_TIME_SHIFT);
kbdev->pm.backend.metrics.values.time_busy += ns_time;
if (kbdev->pm.backend.metrics.active_cl_ctx[0])
kbdev->pm.backend.metrics.values.busy_cl[0] += ns_time;
if (kbdev->pm.backend.metrics.active_cl_ctx[1])
kbdev->pm.backend.metrics.values.busy_cl[1] += ns_time;
if (kbdev->pm.backend.metrics.active_gl_ctx[0])
kbdev->pm.backend.metrics.values.busy_gl += ns_time;
if (kbdev->pm.backend.metrics.active_gl_ctx[1])
kbdev->pm.backend.metrics.values.busy_gl += ns_time;
if (kbdev->pm.backend.metrics.active_gl_ctx[2])
kbdev->pm.backend.metrics.values.busy_gl += ns_time;
} else {
kbdev->pm.backend.metrics.values.time_idle +=
(u32)(ktime_to_ns(diff) >> KBASE_PM_TIME_SHIFT);
}
kbdev->pm.backend.metrics.time_period_start = now;
}
#endif /* MALI_USE_CSF */
#if defined(CONFIG_MALI_VALHALL_DEVFREQ) || defined(CONFIG_MALI_VALHALL_DVFS)
void kbase_pm_get_dvfs_metrics(struct kbase_device *kbdev, struct kbasep_pm_metrics *last,
struct kbasep_pm_metrics *diff)
{
struct kbasep_pm_metrics *cur = &kbdev->pm.backend.metrics.values;
unsigned long flags;
spin_lock_irqsave(&kbdev->pm.backend.metrics.lock, flags);
#if MALI_USE_CSF
kbase_pm_get_dvfs_utilisation_calc(kbdev);
#else
kbase_pm_get_dvfs_utilisation_calc(kbdev, ktime_get_raw());
#endif
memset(diff, 0, sizeof(*diff));
diff->time_busy = cur->time_busy - last->time_busy;
diff->time_idle = cur->time_idle - last->time_idle;
#if MALI_USE_CSF
diff->time_in_protm = cur->time_in_protm - last->time_in_protm;
#else
diff->busy_cl[0] = cur->busy_cl[0] - last->busy_cl[0];
diff->busy_cl[1] = cur->busy_cl[1] - last->busy_cl[1];
diff->busy_gl = cur->busy_gl - last->busy_gl;
#endif
*last = *cur;
spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);
}
KBASE_EXPORT_TEST_API(kbase_pm_get_dvfs_metrics);
#endif
#ifdef CONFIG_MALI_VALHALL_DVFS
void kbase_pm_get_dvfs_action(struct kbase_device *kbdev)
{
int utilisation;
struct kbasep_pm_metrics *diff;
#if !MALI_USE_CSF
int busy;
int util_gl_share;
int util_cl_share[2];
#endif
KBASE_DEBUG_ASSERT(kbdev != NULL);
diff = &kbdev->pm.backend.metrics.dvfs_diff;
kbase_pm_get_dvfs_metrics(kbdev, &kbdev->pm.backend.metrics.dvfs_last, diff);
utilisation = (100 * diff->time_busy) / max(diff->time_busy + diff->time_idle, 1u);
#if !MALI_USE_CSF
busy = max(diff->busy_gl + diff->busy_cl[0] + diff->busy_cl[1], 1u);
util_gl_share = (100 * diff->busy_gl) / busy;
util_cl_share[0] = (100 * diff->busy_cl[0]) / busy;
util_cl_share[1] = (100 * diff->busy_cl[1]) / busy;
kbase_platform_dvfs_event(kbdev, utilisation, util_gl_share, util_cl_share);
#else
/* Note that, at present, we don't pass protected-mode time to the
* platform here. It's unlikely to be useful, however, as the platform
* probably just cares whether the GPU is busy or not; time in
* protected mode is already added to busy-time at this point, though,
* so we should be good.
*/
kbase_platform_dvfs_event(kbdev, utilisation);
#endif
}
bool kbase_pm_metrics_is_active(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev != NULL);
return atomic_read(&kbdev->pm.backend.metrics.timer_state) == TIMER_ON;
}
KBASE_EXPORT_TEST_API(kbase_pm_metrics_is_active);
void kbase_pm_metrics_start(struct kbase_device *kbdev)
{
struct kbasep_pm_metrics_state *metrics = &kbdev->pm.backend.metrics;
if (unlikely(!metrics->initialized))
return;
/* Transition to ON, from a stopped state (transition c) */
if (atomic_xchg(&metrics->timer_state, TIMER_ON) == TIMER_OFF)
/* Start the timer only if it's been fully stopped (transition d)*/
hrtimer_start(&metrics->timer, HR_TIMER_DELAY_MSEC(kbdev->pm.dvfs_period),
HRTIMER_MODE_REL);
}
void kbase_pm_metrics_stop(struct kbase_device *kbdev)
{
if (unlikely(!kbdev->pm.backend.metrics.initialized))
return;
/* The timer is stopped if it's currently on (transition a) */
atomic_cmpxchg(&kbdev->pm.backend.metrics.timer_state, TIMER_ON, TIMER_STOPPED);
}
#endif /* CONFIG_MALI_VALHALL_DVFS */
#if !MALI_USE_CSF
/**
* kbase_pm_metrics_active_calc - Update PM active counts based on currently
* running atoms
* @kbdev: Device pointer
*
* The caller must hold kbdev->pm.backend.metrics.lock
*/
static void kbase_pm_metrics_active_calc(struct kbase_device *kbdev)
{
unsigned int js;
lockdep_assert_held(&kbdev->pm.backend.metrics.lock);
kbdev->pm.backend.metrics.active_gl_ctx[0] = 0;
kbdev->pm.backend.metrics.active_gl_ctx[1] = 0;
kbdev->pm.backend.metrics.active_gl_ctx[2] = 0;
kbdev->pm.backend.metrics.active_cl_ctx[0] = 0;
kbdev->pm.backend.metrics.active_cl_ctx[1] = 0;
kbdev->pm.backend.metrics.gpu_active = false;
for (js = 0; js < BASE_JM_MAX_NR_SLOTS; js++) {
struct kbase_jd_atom *katom = kbase_gpu_inspect(kbdev, js, 0);
/* Head atom may have just completed, so if it isn't running
* then try the next atom
*/
if (katom && katom->gpu_rb_state != KBASE_ATOM_GPU_RB_SUBMITTED)
katom = kbase_gpu_inspect(kbdev, js, 1);
if (katom && katom->gpu_rb_state == KBASE_ATOM_GPU_RB_SUBMITTED) {
if (katom->core_req & BASE_JD_REQ_ONLY_COMPUTE) {
u32 device_nr =
(katom->core_req & BASE_JD_REQ_SPECIFIC_COHERENT_GROUP) ?
katom->device_nr :
0;
if (!WARN_ON(device_nr >= 2))
kbdev->pm.backend.metrics.active_cl_ctx[device_nr] = 1;
} else {
kbdev->pm.backend.metrics.active_gl_ctx[js] = 1;
trace_sysgraph(SGR_ACTIVE, 0, js);
}
kbdev->pm.backend.metrics.gpu_active = true;
} else {
trace_sysgraph(SGR_INACTIVE, 0, js);
}
}
}
/* Called when a job is submitted to or removed from a GPU slot */
void kbase_pm_metrics_update(struct kbase_device *kbdev, ktime_t *timestamp)
{
unsigned long flags;
ktime_t now;
lockdep_assert_held(&kbdev->hwaccess_lock);
spin_lock_irqsave(&kbdev->pm.backend.metrics.lock, flags);
if (!timestamp) {
now = ktime_get_raw();
timestamp = &now;
}
/* Track how much time has been spent busy or idle. For JM GPUs,
* this also evaluates how long CL and/or GL jobs have been busy for.
*/
kbase_pm_get_dvfs_utilisation_calc(kbdev, *timestamp);
kbase_pm_metrics_active_calc(kbdev);
spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);
}
#endif /* !MALI_USE_CSF */
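The timer state transitions (a) to (f) described near the top of this file can be modelled outside the kernel. The following is a rough single-threaded sketch, assuming C11 `<stdatomic.h>` in place of the kernel's `atomic_t` helpers and a plain `timer_armed` flag in place of the hrtimer. All `demo_*` names are hypothetical, and the flag is not itself thread-safe, so this illustrates the state logic only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_timer_state { DEMO_TIMER_OFF, DEMO_TIMER_STOPPED, DEMO_TIMER_ON };

struct demo_metrics {
	atomic_int timer_state;
	bool timer_armed; /* stands in for "the hrtimer is queued" */
};

/* Kernel-style cmpxchg: returns the value observed before the operation. */
static int demo_cmpxchg(atomic_int *v, int expected, int desired)
{
	int old = expected;

	/* On failure, 'old' is overwritten with the value actually found. */
	atomic_compare_exchange_strong(v, &old, desired);
	return old;
}

static void demo_metrics_init(struct demo_metrics *m)
{
	assert(m != NULL);
	atomic_init(&m->timer_state, DEMO_TIMER_OFF);
	m->timer_armed = false;
}

/* Transitions c and d: always end in ON, arm the timer only when coming from OFF. */
static void demo_metrics_start(struct demo_metrics *m)
{
	if (atomic_exchange(&m->timer_state, DEMO_TIMER_ON) == DEMO_TIMER_OFF)
		m->timer_armed = true;
}

/* Transition a: ON -> STOPPED. The timer stays armed and lapses on its own. */
static void demo_metrics_stop(struct demo_metrics *m)
{
	demo_cmpxchg(&m->timer_state, DEMO_TIMER_ON, DEMO_TIMER_STOPPED);
}

/* Timer expiry: true means re-arm (transition e), false means lapse (transition b). */
static bool demo_timer_expired(struct demo_metrics *m)
{
	if (demo_cmpxchg(&m->timer_state, DEMO_TIMER_STOPPED, DEMO_TIMER_OFF) != DEMO_TIMER_ON) {
		m->timer_armed = false;
		return false;
	}
	/* The periodic sampling work would run here before re-arming. */
	return true;
}

/* Transition f: tear-down forces OFF and cancels the timer. */
static void demo_metrics_term(struct demo_metrics *m)
{
	atomic_store(&m->timer_state, DEMO_TIMER_OFF);
	m->timer_armed = false;
}
```

The point of the intermediate STOPPED state in this model is that a stop followed quickly by a start never needs to touch the timer: the state simply flips back to ON before the next expiry.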

@@ -0,0 +1,475 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2010-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Power policy API implementations
*/
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase_pm.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <mali_kbase_reset_gpu.h>
#include <mali_kbase_io.h>
#if MALI_USE_CSF && defined CONFIG_MALI_VALHALL_DEBUG
#include <csf/mali_kbase_csf_firmware.h>
#endif
#include <linux/of.h>
static const struct kbase_pm_policy *const all_policy_list[] = {
#if IS_ENABLED(CONFIG_MALI_VALHALL_NO_MALI)
&kbase_pm_always_on_policy_ops,
&kbase_pm_coarse_demand_policy_ops,
#else /* CONFIG_MALI_VALHALL_NO_MALI */
&kbase_pm_coarse_demand_policy_ops,
&kbase_pm_always_on_policy_ops,
#endif /* CONFIG_MALI_VALHALL_NO_MALI */
};
void kbase_pm_policy_init(struct kbase_device *kbdev)
{
const struct kbase_pm_policy *default_policy = all_policy_list[0];
struct device_node *np = kbdev->dev->of_node;
const char *power_policy_name;
unsigned long flags;
unsigned int i;
/* Read "power-policy" property and fall back to "power_policy" if not found */
if ((of_property_read_string(np, "power-policy", &power_policy_name) == 0) ||
(of_property_read_string(np, "power_policy", &power_policy_name) == 0)) {
for (i = 0; i < ARRAY_SIZE(all_policy_list); i++)
if (sysfs_streq(all_policy_list[i]->name, power_policy_name)) {
default_policy = all_policy_list[i];
break;
}
}
#if MALI_USE_CSF && defined(CONFIG_MALI_VALHALL_DEBUG)
/* Use always_on policy if module param fw_debug=1 is
* passed, to aid firmware debugging.
*/
if (fw_debug)
default_policy = &kbase_pm_always_on_policy_ops;
#endif
default_policy->init(kbdev);
#if MALI_USE_CSF
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbdev->pm.backend.pm_current_policy = default_policy;
kbdev->pm.backend.csf_pm_sched_flags = default_policy->pm_sched_flags;
#ifdef KBASE_PM_RUNTIME
if (kbase_pm_idle_groups_sched_suspendable(kbdev))
clear_bit(KBASE_GPU_IGNORE_IDLE_EVENT, &kbdev->pm.backend.gpu_sleep_allowed);
else
set_bit(KBASE_GPU_IGNORE_IDLE_EVENT, &kbdev->pm.backend.gpu_sleep_allowed);
#endif /* KBASE_PM_RUNTIME */
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
#else
CSTD_UNUSED(flags);
kbdev->pm.backend.pm_current_policy = default_policy;
#endif
}
void kbase_pm_policy_term(struct kbase_device *kbdev)
{
kbdev->pm.backend.pm_current_policy->term(kbdev);
}
void kbase_pm_update_active(struct kbase_device *kbdev)
{
struct kbase_pm_device_data *pm = &kbdev->pm;
struct kbase_pm_backend_data *backend = &pm->backend;
unsigned long flags;
bool active;
lockdep_assert_held(&pm->lock);
/* pm_current_policy will never be NULL while pm.lock is held */
KBASE_DEBUG_ASSERT(backend->pm_current_policy);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
active = backend->pm_current_policy->get_core_active(kbdev);
WARN((kbase_pm_is_active(kbdev) && !active),
"GPU is active but policy '%s' is indicating that it can be powered off",
kbdev->pm.backend.pm_current_policy->name);
if (active) {
/* Power on the GPU and any cores requested by the policy */
if (!pm->backend.invoke_poweroff_wait_wq_when_l2_off &&
pm->backend.poweroff_wait_in_progress) {
KBASE_DEBUG_ASSERT(kbase_io_is_gpu_powered(kbdev));
#if MALI_USE_CSF
if (likely(!pm->backend.waiting_for_mmu_fault_handling)) {
/* L2 has been powered off. Invoke the state machine to power
* up the L2 cache and also effectively cancel the GPU power off
* work item.
*/
pm->backend.poweroff_wait_in_progress = false;
pm->backend.l2_desired = true;
pm->backend.mcu_desired = true;
kbase_pm_update_state(kbdev);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
wake_up(&kbdev->pm.backend.poweroff_wait);
return;
}
#endif
pm->backend.poweron_required = true;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
} else {
/* Cancel the invocation of
* kbase_pm_gpu_poweroff_wait_wq() from the L2 state
* machine. This is safe - if
* invoke_poweroff_wait_wq_when_l2_off is true, then
* the poweroff work hasn't even been queued yet,
* meaning we can go straight to powering on.
*/
pm->backend.invoke_poweroff_wait_wq_when_l2_off = false;
pm->backend.poweroff_wait_in_progress = false;
pm->backend.l2_desired = true;
#if MALI_USE_CSF
pm->backend.mcu_desired = true;
#endif
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
kbase_pm_do_poweron(kbdev, false);
}
} else {
/* It is an error for the power policy to power off the GPU
* when there are contexts active
*/
KBASE_DEBUG_ASSERT(pm->active_count == 0);
pm->backend.poweron_required = false;
/* Request power off */
if (kbase_io_is_gpu_powered(kbdev)) {
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
/* Power off the GPU immediately */
kbase_pm_do_poweroff(kbdev);
} else {
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
}
}
void kbase_pm_update_dynamic_cores_onoff(struct kbase_device *kbdev)
{
bool shaders_desired;
lockdep_assert_held(&kbdev->hwaccess_lock);
lockdep_assert_held(&kbdev->pm.lock);
if (kbdev->pm.backend.pm_current_policy == NULL)
return;
if (kbdev->pm.backend.poweroff_wait_in_progress)
return;
#if MALI_USE_CSF
CSTD_UNUSED(shaders_desired);
/* Invoke the MCU state machine to send a request to FW for updating
* the mask of shader cores that can be used for allocation of
* endpoints requested by CSGs.
*/
if (kbase_pm_is_mcu_desired(kbdev))
kbase_pm_update_state(kbdev);
#else
/* During a protected mode transition, don't let an outside shader
* core request affect the transition; return directly.
*/
if (kbdev->pm.backend.protected_transition_override)
return;
shaders_desired = kbdev->pm.backend.pm_current_policy->shaders_needed(kbdev);
if (shaders_desired && kbase_pm_is_l2_desired(kbdev))
kbase_pm_update_state(kbdev);
#endif
}
void kbase_pm_update_cores_state_nolock(struct kbase_device *kbdev)
{
bool shaders_desired = false;
lockdep_assert_held(&kbdev->hwaccess_lock);
if (kbdev->pm.backend.pm_current_policy == NULL)
return;
if (kbdev->pm.backend.poweroff_wait_in_progress)
return;
#if !MALI_USE_CSF
if (kbdev->pm.backend.protected_transition_override)
/* We are trying to change in/out of protected mode - force all
* cores off so that the L2 powers down
*/
shaders_desired = false;
else
shaders_desired = kbdev->pm.backend.pm_current_policy->shaders_needed(kbdev);
#endif
if (kbdev->pm.backend.shaders_desired != shaders_desired) {
KBASE_KTRACE_ADD(kbdev, PM_CORES_CHANGE_DESIRED, NULL,
kbdev->pm.backend.shaders_desired);
kbdev->pm.backend.shaders_desired = shaders_desired;
kbase_pm_update_state(kbdev);
}
}
void kbase_pm_update_cores_state(struct kbase_device *kbdev)
{
unsigned long flags;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbase_pm_update_cores_state_nolock(kbdev);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
size_t kbase_pm_list_policies(struct kbase_device *kbdev,
const struct kbase_pm_policy *const **list)
{
CSTD_UNUSED(kbdev);
if (list)
*list = all_policy_list;
return ARRAY_SIZE(all_policy_list);
}
KBASE_EXPORT_TEST_API(kbase_pm_list_policies);
const struct kbase_pm_policy *kbase_pm_get_policy(struct kbase_device *kbdev)
{
KBASE_DEBUG_ASSERT(kbdev != NULL);
return kbdev->pm.backend.pm_current_policy;
}
KBASE_EXPORT_TEST_API(kbase_pm_get_policy);
#if MALI_USE_CSF
static int policy_change_wait_for_L2_off(struct kbase_device *kbdev)
{
long remaining;
long timeout = kbase_csf_timeout_in_jiffies(kbase_get_timeout_ms(kbdev, CSF_PM_TIMEOUT));
int err = 0;
/* Wait for the L2 to power off, by which point the MCU is also implicitly off
* since the L2 state machine would only start its power-down
* sequence when the MCU is in off state. The L2 off is required
* as the tiler may need to be power cycled for MCU reconfiguration
* for host control of shader cores.
*/
#if KERNEL_VERSION(4, 13, 1) <= LINUX_VERSION_CODE
remaining = wait_event_killable_timeout(kbdev->pm.backend.gpu_in_desired_state_wait,
kbdev->pm.backend.l2_state == KBASE_L2_OFF,
timeout);
#else
remaining = wait_event_timeout(kbdev->pm.backend.gpu_in_desired_state_wait,
kbdev->pm.backend.l2_state == KBASE_L2_OFF, timeout);
#endif
if (!remaining) {
err = -ETIMEDOUT;
} else if (remaining < 0) {
dev_info(kbdev->dev, "Wait for L2_off got interrupted");
err = (int)remaining;
}
dev_dbg(kbdev->dev, "%s: err=%d mcu_state=%d, L2_state=%d\n", __func__, err,
kbdev->pm.backend.mcu_state, kbdev->pm.backend.l2_state);
return err;
}
#endif
void kbase_pm_set_policy(struct kbase_device *kbdev, const struct kbase_pm_policy *new_policy)
{
const struct kbase_pm_policy *old_policy;
unsigned long flags;
#if MALI_USE_CSF
unsigned int new_policy_csf_pm_sched_flags;
bool sched_suspend;
bool reset_gpu = false;
bool reset_op_prevented = true;
struct kbase_csf_scheduler *scheduler = NULL;
u64 pwroff_ns;
bool switching_to_always_on;
#endif
KBASE_DEBUG_ASSERT(kbdev != NULL);
KBASE_DEBUG_ASSERT(new_policy != NULL);
KBASE_KTRACE_ADD(kbdev, PM_SET_POLICY, NULL, new_policy->id);
#if MALI_USE_CSF
pwroff_ns = kbase_csf_firmware_get_mcu_core_pwroff_time(kbdev);
switching_to_always_on = new_policy == &kbase_pm_always_on_policy_ops;
if (pwroff_ns == 0 && !switching_to_always_on) {
dev_warn(
kbdev->dev,
"power_policy: cannot switch away from always_on with mcu_shader_pwroff_timeout set to 0\n");
dev_warn(
kbdev->dev,
"power_policy: resetting mcu_shader_pwroff_timeout to default value to switch policy from always_on\n");
kbase_csf_firmware_reset_mcu_core_pwroff_time(kbdev);
}
scheduler = &kbdev->csf.scheduler;
KBASE_DEBUG_ASSERT(scheduler != NULL);
/* Serialize calls on kbase_pm_set_policy() */
mutex_lock(&kbdev->pm.backend.policy_change_lock);
if (kbase_reset_gpu_prevent_and_wait(kbdev)) {
dev_warn(kbdev->dev, "Set PM policy failing to prevent gpu reset");
reset_op_prevented = false;
}
/* In case of CSF, the scheduler may be invoked to suspend. In that
* case, there is a risk that the L2 may be turned on by the time we
* check it here. So we hold the scheduler lock to avoid other operations
* interfering with the policy change and vice versa.
*/
mutex_lock(&scheduler->lock);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
/* policy_change_clamp_state_to_off, when needed, is set/cleared in
* this function, a very limited temporal scope for covering the
* change transition.
*/
WARN_ON(kbdev->pm.backend.policy_change_clamp_state_to_off);
new_policy_csf_pm_sched_flags = new_policy->pm_sched_flags;
/* The scheduler PM suspend operation is required when the change
* involves the always_on policy, as reflected by the
* CSF_DYNAMIC_PM_CORE_KEEP_ON flag bit.
*/
sched_suspend = reset_op_prevented &&
(CSF_DYNAMIC_PM_CORE_KEEP_ON &
(new_policy_csf_pm_sched_flags | kbdev->pm.backend.csf_pm_sched_flags));
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (sched_suspend) {
/* Update the suspend flag to reflect whether the suspend was actually done */
sched_suspend = !kbase_csf_scheduler_pm_suspend_no_lock(kbdev);
/* Set the reset recovery flag if the required suspend failed */
reset_gpu = !sched_suspend;
}
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbdev->pm.backend.policy_change_clamp_state_to_off = sched_suspend;
kbase_pm_update_state(kbdev);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
if (sched_suspend)
reset_gpu = policy_change_wait_for_L2_off(kbdev);
#endif
kbase_pm_lock(kbdev);
/* During a policy change we pretend the GPU is active */
/* A suspend won't happen here, because we're in a syscall from a
* userspace thread
*/
if (kbase_pm_context_active_handle_suspend_locked(kbdev,
KBASE_PM_SUSPEND_HANDLER_NOT_POSSIBLE))
dev_warn_once(
kbdev->dev,
"Error shouldn't be returned with SUSPEND_HANDLER_NOT_POSSIBLE flag.");
/* Remove the policy to prevent IRQ handlers from working on it */
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
old_policy = kbdev->pm.backend.pm_current_policy;
kbdev->pm.backend.pm_current_policy = NULL;
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
KBASE_KTRACE_ADD(kbdev, PM_CURRENT_POLICY_TERM, NULL, old_policy->id);
if (old_policy->term)
old_policy->term(kbdev);
memset(&kbdev->pm.backend.pm_policy_data, 0, sizeof(union kbase_pm_policy_data));
KBASE_KTRACE_ADD(kbdev, PM_CURRENT_POLICY_INIT, NULL, new_policy->id);
if (new_policy->init)
new_policy->init(kbdev);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
kbdev->pm.backend.pm_current_policy = new_policy;
#if MALI_USE_CSF
kbdev->pm.backend.csf_pm_sched_flags = new_policy_csf_pm_sched_flags;
/* New policy in place, release the clamping on mcu/L2 off state */
kbdev->pm.backend.policy_change_clamp_state_to_off = false;
kbase_pm_update_state(kbdev);
#ifdef KBASE_PM_RUNTIME
if (kbase_pm_idle_groups_sched_suspendable(kbdev))
clear_bit(KBASE_GPU_IGNORE_IDLE_EVENT, &kbdev->pm.backend.gpu_sleep_allowed);
else
set_bit(KBASE_GPU_IGNORE_IDLE_EVENT, &kbdev->pm.backend.gpu_sleep_allowed);
#endif /* KBASE_PM_RUNTIME */
#endif
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
/* If any core power state changes were previously attempted, but
* couldn't be made because the policy was changing (current_policy was
* NULL), then re-try them here.
*/
kbase_pm_update_active(kbdev);
kbase_pm_update_cores_state(kbdev);
/* Now the policy change is finished, we release our fake context active
* reference
*/
kbase_pm_context_idle_locked(kbdev);
kbase_pm_unlock(kbdev);
#if MALI_USE_CSF
/* Reverse the suspension done */
if (sched_suspend)
kbase_csf_scheduler_pm_resume_no_lock(kbdev);
mutex_unlock(&scheduler->lock);
if (reset_op_prevented)
kbase_reset_gpu_allow(kbdev);
if (reset_gpu) {
dev_warn(kbdev->dev, "Resorting to GPU reset for policy change\n");
if (kbase_prepare_to_reset_gpu(kbdev, RESET_FLAGS_NONE))
kbase_reset_gpu(kbdev);
kbase_reset_gpu_wait(kbdev);
}
mutex_unlock(&kbdev->pm.backend.policy_change_lock);
#endif
}
KBASE_EXPORT_TEST_API(kbase_pm_set_policy);

@@ -0,0 +1,104 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2010-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Power policy API definitions
*/
#ifndef _KBASE_PM_POLICY_H_
#define _KBASE_PM_POLICY_H_
/**
* kbase_pm_policy_init - Initialize power policy framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Must be called before calling any other policy function
*/
void kbase_pm_policy_init(struct kbase_device *kbdev);
/**
* kbase_pm_policy_term - Terminate power policy framework
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*/
void kbase_pm_policy_term(struct kbase_device *kbdev);
/**
* kbase_pm_update_active - Update the active power state of the GPU
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Calls into the current power policy
*/
void kbase_pm_update_active(struct kbase_device *kbdev);
/**
* kbase_pm_update_cores - Update the desired core state of the GPU
*
* @kbdev: The kbase device structure for the device (must be a valid pointer)
*
* Calls into the current power policy
*/
void kbase_pm_update_cores(struct kbase_device *kbdev);
/**
* kbase_pm_cores_requested - Check that a power request has been locked into
* the HW.
* @kbdev: Kbase device
* @shader_required: true if shaders are required
*
* Called by the scheduler to check if a power on request has been locked into
* the HW.
*
* Note that there is no guarantee that the cores are actually ready, however
* when the request has been locked into the HW, then it is safe to submit work
* since the HW will wait for the transition to ready.
*
* A reference must first be taken prior to making this call.
*
* Caller must hold the hwaccess_lock.
*
* Return: true if the request to the HW was successfully made else false if the
* request is still pending.
*/
static inline bool kbase_pm_cores_requested(struct kbase_device *kbdev, bool shader_required)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
/* If the L2 & tiler are not on or pending, then the tiler is not yet
* available, and shaders are definitely not powered.
*/
if (kbdev->pm.backend.l2_state != KBASE_L2_PEND_ON &&
kbdev->pm.backend.l2_state != KBASE_L2_ON &&
kbdev->pm.backend.l2_state != KBASE_L2_ON_HWCNT_ENABLE)
return false;
if (shader_required &&
kbdev->pm.backend.shaders_state != KBASE_SHADERS_PEND_ON_CORESTACK_ON &&
kbdev->pm.backend.shaders_state != KBASE_SHADERS_ON_CORESTACK_ON &&
kbdev->pm.backend.shaders_state != KBASE_SHADERS_ON_CORESTACK_ON_RECHECK)
return false;
return true;
}
#endif /* _KBASE_PM_POLICY_H_ */

@@ -0,0 +1,79 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Backend-specific Power Manager shader core state definitions.
* The function-like macro KBASEP_SHADER_STATE() must be defined before
* including this header file. This header file can be included multiple
* times in the same compilation unit with different definitions of
* KBASEP_SHADER_STATE().
*
* @OFF_CORESTACK_OFF: The shaders and core stacks are off
* @OFF_CORESTACK_PEND_ON: The shaders are off, core stacks have been
* requested to power on and hwcnt is being
* disabled
* @PEND_ON_CORESTACK_ON: Core stacks are on, shaders have been
* requested to power on. Or after doing
* partial shader on/off, checking whether
* it's the desired state.
* @ON_CORESTACK_ON: The shaders and core stacks are on, and
* hwcnt already enabled.
* @ON_CORESTACK_ON_RECHECK: The shaders and core stacks are on, hwcnt
* disabled, and a check is made whether to
* power down or re-enable hwcnt.
* @WAIT_OFF_CORESTACK_ON: The shaders have been requested to power
* off, but they remain on for the duration
* of the hysteresis timer
* @WAIT_GPU_IDLE: The shaders partial poweroff needs to
* reach a state where jobs on the GPU are
* finished including jobs currently running
* and in the GPU queue because of
* GPU2017-861
* @WAIT_FINISHED_CORESTACK_ON: The hysteresis timer has expired
* @L2_FLUSHING_CORESTACK_ON: The core stacks are on and the level 2
* cache is being flushed.
* @READY_OFF_CORESTACK_ON: The core stacks are on and the shaders are
* ready to be powered off.
* @PEND_OFF_CORESTACK_ON: The core stacks are on, and the shaders
* have been requested to power off
* @OFF_CORESTACK_PEND_OFF: The shaders are off, and the core stacks
* have been requested to power off
* @OFF_CORESTACK_OFF_TIMER_PEND_OFF: Shaders and corestacks are off, but the
* tick timer cancellation is still pending.
* @RESET_WAIT: The GPU is resetting, shader and core
* stack power states are unknown
*/
KBASEP_SHADER_STATE(OFF_CORESTACK_OFF)
KBASEP_SHADER_STATE(OFF_CORESTACK_PEND_ON)
KBASEP_SHADER_STATE(PEND_ON_CORESTACK_ON)
KBASEP_SHADER_STATE(ON_CORESTACK_ON)
KBASEP_SHADER_STATE(ON_CORESTACK_ON_RECHECK)
KBASEP_SHADER_STATE(WAIT_OFF_CORESTACK_ON)
#if !MALI_USE_CSF
KBASEP_SHADER_STATE(WAIT_GPU_IDLE)
#endif /* !MALI_USE_CSF */
KBASEP_SHADER_STATE(WAIT_FINISHED_CORESTACK_ON)
KBASEP_SHADER_STATE(L2_FLUSHING_CORESTACK_ON)
KBASEP_SHADER_STATE(READY_OFF_CORESTACK_ON)
KBASEP_SHADER_STATE(PEND_OFF_CORESTACK_ON)
KBASEP_SHADER_STATE(OFF_CORESTACK_PEND_OFF)
KBASEP_SHADER_STATE(OFF_CORESTACK_OFF_TIMER_PEND_OFF)
KBASEP_SHADER_STATE(RESET_WAIT)
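
The header above is meant to be included several times with different definitions of KBASEP_SHADER_STATE(). A minimal userspace sketch of that "X-macro" technique follows. It assumes a list macro (SHADER_STATES, hypothetical) in place of re-including a header, and it uses only a subset of the states; the helper shader_state_name() is also hypothetical and is not part of the driver.

```c
#include <stddef.h>

/* Hypothetical stand-in for the state-list header. In the driver the list
 * lives in its own file and is re-included; a list macro plays that role here.
 */
#define SHADER_STATES(X)         \
	X(OFF_CORESTACK_OFF)     \
	X(OFF_CORESTACK_PEND_ON) \
	X(PEND_ON_CORESTACK_ON)  \
	X(ON_CORESTACK_ON)       \
	X(RESET_WAIT)

/* First expansion: turn every list entry into an enumerator. */
#define AS_ENUM(n) SHADERS_##n,
enum shader_state { SHADER_STATES(AS_ENUM) SHADERS_STATE_COUNT };
#undef AS_ENUM

/* Second expansion: turn the same entries into printable names, so the
 * enum and the name table cannot drift apart.
 */
#define AS_STRING(n) #n,
static const char *const shader_state_names[] = { SHADER_STATES(AS_STRING) };
#undef AS_STRING

/* Compile-time check that both expansions produced the same number of entries. */
_Static_assert(sizeof(shader_state_names) / sizeof(shader_state_names[0]) ==
		       SHADERS_STATE_COUNT,
	       "name table must match the enum");

/* Hypothetical helper: map a state value to its name, or "UNKNOWN". */
static const char *shader_state_name(unsigned int state)
{
	return (state < SHADERS_STATE_COUNT) ? shader_state_names[state] : "UNKNOWN";
}
```

Adding a state then only requires one new line in the list; the enumerator and its name are generated together.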

@@ -0,0 +1,372 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2014-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <mali_kbase_hwaccess_time.h>
#include <linux/gcd.h>
#include <csf/mali_kbase_csf_timeout.h>
#include <device/mali_kbase_device.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
#include <mali_kbase_config_defaults.h>
#include <mali_kbase_io.h>
#include <linux/version_compat_defs_for_valhall.h>
#include <asm/arch_timer.h>
#include <linux/mali_hw_access.h>
struct kbase_timeout_info {
char *selector_str;
u64 timeout_cycles;
};
#define GPU_TIMESTAMP_OFFSET_INVALID S64_MAX
static struct kbase_timeout_info timeout_info[KBASE_TIMEOUT_SELECTOR_COUNT] = {
[CSF_FIRMWARE_TIMEOUT] = { "CSF_FIRMWARE_TIMEOUT", MIN(CSF_FIRMWARE_TIMEOUT_CYCLES,
CSF_FIRMWARE_PING_TIMEOUT_CYCLES) },
[CSF_PM_TIMEOUT] = { "CSF_PM_TIMEOUT", CSF_PM_TIMEOUT_CYCLES },
[CSF_GPU_RESET_TIMEOUT] = { "CSF_GPU_RESET_TIMEOUT", CSF_GPU_RESET_TIMEOUT_CYCLES },
[CSF_CSG_TERM_TIMEOUT] = { "CSF_CSG_TERM_TIMEOUT", CSF_CSG_TERM_TIMEOUT_CYCLES },
[CSF_FIRMWARE_BOOT_TIMEOUT] = { "CSF_FIRMWARE_BOOT_TIMEOUT",
CSF_FIRMWARE_BOOT_TIMEOUT_CYCLES },
[CSF_FIRMWARE_WAKE_UP_TIMEOUT] = { "CSF_FIRMWARE_WAKE_UP_TIMEOUT",
CSF_FIRMWARE_WAKE_UP_TIMEOUT_CYCLES },
[CSF_FIRMWARE_SOI_HALT_TIMEOUT] = { "CSF_FIRMWARE_SOI_HALT_TIMEOUT",
CSF_FIRMWARE_SOI_HALT_TIMEOUT_CYCLES },
[CSF_FIRMWARE_PING_TIMEOUT] = { "CSF_FIRMWARE_PING_TIMEOUT",
CSF_FIRMWARE_PING_TIMEOUT_CYCLES },
[CSF_SCHED_PROTM_PROGRESS_TIMEOUT] = { "CSF_SCHED_PROTM_PROGRESS_TIMEOUT",
DEFAULT_PROGRESS_TIMEOUT_CYCLES },
[MMU_AS_INACTIVE_WAIT_TIMEOUT] = { "MMU_AS_INACTIVE_WAIT_TIMEOUT",
MMU_AS_INACTIVE_WAIT_TIMEOUT_CYCLES },
[KCPU_FENCE_SIGNAL_TIMEOUT] = { "KCPU_FENCE_SIGNAL_TIMEOUT",
KCPU_FENCE_SIGNAL_TIMEOUT_CYCLES },
[KBASE_PRFCNT_ACTIVE_TIMEOUT] = { "KBASE_PRFCNT_ACTIVE_TIMEOUT",
KBASE_PRFCNT_ACTIVE_TIMEOUT_CYCLES },
[KBASE_CLEAN_CACHE_TIMEOUT] = { "KBASE_CLEAN_CACHE_TIMEOUT",
KBASE_CLEAN_CACHE_TIMEOUT_CYCLES },
[KBASE_AS_INACTIVE_TIMEOUT] = { "KBASE_AS_INACTIVE_TIMEOUT",
KBASE_AS_INACTIVE_TIMEOUT_CYCLES },
[IPA_INACTIVE_TIMEOUT] = { "IPA_INACTIVE_TIMEOUT", IPA_INACTIVE_TIMEOUT_CYCLES },
[CSF_FIRMWARE_STOP_TIMEOUT] = { "CSF_FIRMWARE_STOP_TIMEOUT",
CSF_FIRMWARE_STOP_TIMEOUT_CYCLES },
[CSF_PWR_DELEGATE_TIMEOUT] = { "CSF_PWR_DELEGATE_TIMEOUT",
CSF_PWR_DELEGATE_TIMEOUT_CYCLES },
[CSF_PWR_INSPECT_TIMEOUT] = { "CSF_PWR_INSPECT_TIMEOUT", CSF_PWR_INSPECT_TIMEOUT_CYCLES },
[CSF_GPU_SUSPEND_TIMEOUT] = { "CSF_GPU_SUSPEND_TIMEOUT", CSF_GPU_SUSPEND_TIMEOUT_CYCLES },
};
void kbase_backend_invalidate_gpu_timestamp_offset(struct kbase_device *kbdev)
{
kbdev->backend_time.gpu_timestamp_offset = GPU_TIMESTAMP_OFFSET_INVALID;
}
KBASE_EXPORT_TEST_API(kbase_backend_invalidate_gpu_timestamp_offset);
/**
* kbase_backend_compute_gpu_ts_offset() - Compute GPU TS offset.
*
* @kbdev: Kbase device.
*
* This function computes the offset between the GPU and CPU timestamps:
* - set the current TIMESTAMP_OFFSET register to zero
* - read GPU TS
* - read CPU TS and convert it to ticks
* - calculate diff between CPU and GPU ticks
* - cache the diff as the GPU TS offset
*
* To reduce delays, preemption must be disabled during reads of both CPU and GPU TS.
* This function requires access to GPU registers to be enabled.
*/
static inline void kbase_backend_compute_gpu_ts_offset(struct kbase_device *kbdev)
{
s64 cpu_ts_ticks = 0;
s64 gpu_ts_ticks = 0;
if (kbdev->backend_time.gpu_timestamp_offset != GPU_TIMESTAMP_OFFSET_INVALID)
return;
if (kbase_io_is_aw_removed(kbdev))
return;
else {
kbase_reg_write64(kbdev, GPU_CONTROL_ENUM(TIMESTAMP_OFFSET), 0);
gpu_ts_ticks = kbase_reg_read64_coherent(kbdev, GPU_CONTROL_ENUM(TIMESTAMP));
cpu_ts_ticks = ktime_get_raw_ns();
cpu_ts_ticks = div64_u64(cpu_ts_ticks * kbdev->backend_time.divisor,
kbdev->backend_time.multiplier);
kbdev->backend_time.gpu_timestamp_offset = cpu_ts_ticks - gpu_ts_ticks;
}
}
void kbase_backend_update_gpu_timestamp_offset(struct kbase_device *kbdev)
{
lockdep_assert_held(&kbdev->pm.lock);
kbase_backend_compute_gpu_ts_offset(kbdev);
dev_dbg(kbdev->dev, "Setting GPU timestamp offset register to %lld (%lld ns)",
kbdev->backend_time.gpu_timestamp_offset,
div64_s64(kbdev->backend_time.gpu_timestamp_offset *
(s64)kbdev->backend_time.multiplier,
(s64)kbdev->backend_time.divisor));
kbase_reg_write64(kbdev, GPU_CONTROL_ENUM(TIMESTAMP_OFFSET),
kbdev->backend_time.gpu_timestamp_offset);
}
#if MALI_UNIT_TEST
u64 kbase_backend_read_gpu_timestamp_offset_reg(struct kbase_device *kbdev)
{
return kbase_reg_read64_coherent(kbdev, GPU_CONTROL_ENUM(TIMESTAMP_OFFSET));
}
KBASE_EXPORT_TEST_API(kbase_backend_read_gpu_timestamp_offset_reg);
#endif
void kbase_backend_get_gpu_time_norequest(struct kbase_device *kbdev, u64 *cycle_counter,
u64 *system_time, struct timespec64 *ts)
{
if (cycle_counter)
*cycle_counter = kbase_backend_get_cycle_cnt(kbdev);
if (system_time) {
*system_time = kbase_reg_read64_coherent(kbdev, GPU_CONTROL_ENUM(TIMESTAMP));
}
/* Record the CPU's idea of current time */
if (ts != NULL)
#if (KERNEL_VERSION(4, 17, 0) > LINUX_VERSION_CODE)
*ts = ktime_to_timespec64(ktime_get_raw());
#else
ktime_get_raw_ts64(ts);
#endif
}
KBASE_EXPORT_TEST_API(kbase_backend_get_gpu_time_norequest);
void kbase_backend_get_gpu_time(struct kbase_device *kbdev, u64 *cycle_counter, u64 *system_time,
struct timespec64 *ts)
{
kbase_backend_get_gpu_time_norequest(kbdev, cycle_counter, system_time, ts);
}
KBASE_EXPORT_TEST_API(kbase_backend_get_gpu_time);
static u64 kbase_device_get_scaling_frequency(struct kbase_device *kbdev)
{
u64 freq_khz = kbdev->lowest_gpu_freq_khz;
if (!freq_khz) {
dev_dbg(kbdev->dev,
"Lowest frequency uninitialized! Using reference frequency for scaling");
return DEFAULT_REF_TIMEOUT_FREQ_KHZ;
}
return freq_khz;
}
void kbase_device_set_timeout_ms(struct kbase_device *kbdev, enum kbase_timeout_selector selector,
unsigned int timeout_ms)
{
char *selector_str;
if (unlikely(selector >= KBASE_TIMEOUT_SELECTOR_COUNT)) {
selector = KBASE_DEFAULT_TIMEOUT;
selector_str = timeout_info[selector].selector_str;
dev_warn(kbdev->dev,
"Unknown timeout selector passed, falling back to default: %s\n",
timeout_info[selector].selector_str);
}
selector_str = timeout_info[selector].selector_str;
if ((kbdev->gpu_props.impl_tech <= THREAD_FEATURES_IMPLEMENTATION_TECHNOLOGY_SILICON) &&
unlikely(timeout_ms >= MAX_TIMEOUT_MS)) {
dev_warn(kbdev->dev, "%s is capped from %dms to %dms\n",
timeout_info[selector].selector_str, timeout_ms, MAX_TIMEOUT_MS);
timeout_ms = MAX_TIMEOUT_MS;
}
kbdev->backend_time.device_scaled_timeouts[selector] = timeout_ms;
dev_dbg(kbdev->dev, "\t%-35s: %ums\n", selector_str, timeout_ms);
}
void kbase_device_set_timeout(struct kbase_device *kbdev, enum kbase_timeout_selector selector,
u64 timeout_cycles, u32 cycle_multiplier)
{
u64 final_cycles;
u64 timeout;
u64 freq_khz = kbase_device_get_scaling_frequency(kbdev);
if (unlikely(selector >= KBASE_TIMEOUT_SELECTOR_COUNT)) {
selector = KBASE_DEFAULT_TIMEOUT;
dev_warn(kbdev->dev,
"Unknown timeout selector passed, falling back to default: %s\n",
timeout_info[selector].selector_str);
}
/* If the multiplication overflows, we will have unsigned wrap-around, and so might
* end up with a shorter timeout. In those cases, we then want to have the largest
* timeout possible that will not run into these issues. Note that this will not
* wait for U64_MAX/frequency ms, as it will be clamped to a max of UINT_MAX
* milliseconds by subsequent steps.
*/
if (check_mul_overflow(timeout_cycles, (u64)cycle_multiplier, &final_cycles))
final_cycles = U64_MAX;
/* Timeout calculation:
* dividing number of cycles by freq in KHz automatically gives value
* in milliseconds. nr_cycles will have to be multiplied by 1e3 to
* get result in microseconds, and 1e6 to get result in nanoseconds.
*/
timeout = div_u64(final_cycles, freq_khz);
if (unlikely(timeout > UINT_MAX)) {
dev_dbg(kbdev->dev,
"Capping excessive timeout %llums for %s at freq %llukHz to UINT_MAX ms",
timeout, timeout_info[selector].selector_str,
kbase_device_get_scaling_frequency(kbdev));
timeout = UINT_MAX;
}
kbase_device_set_timeout_ms(kbdev, selector, (unsigned int)timeout);
}
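
The cycles-to-milliseconds arithmetic in kbase_device_set_timeout() can be tried outside the kernel. The following is a sketch of the arithmetic only, under stated assumptions: scaled_timeout_ms() is a hypothetical name, the GCC/Clang builtin __builtin_mul_overflow stands in for the kernel's check_mul_overflow(), the caller must pass a non-zero frequency, and the additional MAX_TIMEOUT_MS cap applied by kbase_device_set_timeout_ms() is left out.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical userspace helper mirroring the scaling arithmetic.
 * freq_khz must be non-zero (the driver substitutes a reference
 * frequency before it reaches this point).
 */
static unsigned int scaled_timeout_ms(uint64_t timeout_cycles, uint32_t cycle_multiplier,
				      uint64_t freq_khz)
{
	uint64_t final_cycles;
	uint64_t timeout;

	/* On unsigned wrap-around, saturate instead of ending up with a
	 * shorter timeout than requested.
	 */
	if (__builtin_mul_overflow(timeout_cycles, (uint64_t)cycle_multiplier, &final_cycles))
		final_cycles = UINT64_MAX;

	/* cycles / kHz = cycles / (1000 cycles per second) * 1000 = milliseconds */
	timeout = final_cycles / freq_khz;

	/* The result is stored as an unsigned int, so clamp to UINT_MAX ms. */
	if (timeout > UINT_MAX)
		timeout = UINT_MAX;

	return (unsigned int)timeout;
}
```

For instance, 100,000,000 cycles at 100,000 kHz (100 MHz) is one second, so the helper yields 1000 ms. Scaling by the lowest GPU frequency gives the longest, most conservative wall-clock timeout for a given cycle budget.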
/**
* kbase_timeout_scaling_init - Initialize the table of scaled timeout
* values associated with a @kbase_device.
*
* @kbdev: KBase device pointer.
*
* Return: 0 on success, negative error code otherwise.
*/
static int kbase_timeout_scaling_init(struct kbase_device *kbdev)
{
int err;
enum kbase_timeout_selector selector;
/* First, we initialize the minimum and maximum device frequencies, which
* are used to compute the timeouts.
*/
err = kbase_pm_gpu_freq_init(kbdev);
if (unlikely(err < 0)) {
dev_dbg(kbdev->dev, "Could not initialize GPU frequency\n");
return err;
}
dev_dbg(kbdev->dev, "Scaling kbase timeouts:\n");
for (selector = 0; selector < KBASE_TIMEOUT_SELECTOR_COUNT; selector++) {
u32 cycle_multiplier = 1;
u64 nr_cycles;
nr_cycles = timeout_info[selector].timeout_cycles;
/* Special case: the scheduler progress timeout can be set manually,
* and does not have a canonical length defined in the headers. Hence,
* we query it once upon startup to get a baseline, and change it upon
* every invocation of the appropriate functions
*/
if (selector == CSF_SCHED_PROTM_PROGRESS_TIMEOUT)
nr_cycles = kbase_csf_timeout_get(kbdev);
if (selector == KCPU_FENCE_SIGNAL_TIMEOUT) {
if ((kbdev->gpu_props.impl_tech ==
THREAD_FEATURES_IMPLEMENTATION_TECHNOLOGY_FPGA) ||
(kbdev->gpu_props.impl_tech ==
THREAD_FEATURES_IMPLEMENTATION_TECHNOLOGY_SOFTWARE)) {
nr_cycles = KCPU_FENCE_SIGNAL_TIMEOUT_CYCLES_FPGA;
}
}
/* Since we are in control of the iteration bounds for the selector,
* we don't have to worry about bounds checking when setting the timeout.
*/
kbase_device_set_timeout(kbdev, selector, nr_cycles, cycle_multiplier);
}
return 0;
}
unsigned int kbase_get_timeout_ms(struct kbase_device *kbdev, enum kbase_timeout_selector selector)
{
if (unlikely(selector >= KBASE_TIMEOUT_SELECTOR_COUNT)) {
dev_warn(kbdev->dev, "Querying wrong selector, falling back to default\n");
selector = KBASE_DEFAULT_TIMEOUT;
}
return kbdev->backend_time.device_scaled_timeouts[selector];
}
KBASE_EXPORT_TEST_API(kbase_get_timeout_ms);
u64 kbase_backend_get_cycle_cnt(struct kbase_device *kbdev)
{
return kbase_reg_read64_coherent(kbdev, GPU_CONTROL_ENUM(CYCLE_COUNT));
}
u64 __maybe_unused kbase_backend_time_convert_gpu_to_cpu(struct kbase_device *kbdev, u64 gpu_ts)
{
if (WARN_ON(!kbdev))
return 0;
return div64_u64(gpu_ts * kbdev->backend_time.multiplier, kbdev->backend_time.divisor);
}
KBASE_EXPORT_TEST_API(kbase_backend_time_convert_gpu_to_cpu);
u64 kbase_arch_timer_get_cntfrq(struct kbase_device *kbdev)
{
u64 freq = mali_arch_timer_get_cntfrq();
dev_dbg(kbdev->dev, "System Timer Freq = %lluHz", freq);
return freq;
}
int kbase_backend_time_init(struct kbase_device *kbdev)
{
int err = 0;
u64 freq;
u64 common_factor;
kbase_pm_register_access_enable(kbdev);
freq = kbase_arch_timer_get_cntfrq(kbdev);
if (!freq) {
dev_warn(kbdev->dev, "System timer frequency is zero!");
err = -EINVAL;
goto disable_registers;
}
common_factor = gcd(NSEC_PER_SEC, freq);
kbdev->backend_time.multiplier = div64_u64(NSEC_PER_SEC, common_factor);
kbdev->backend_time.divisor = div64_u64(freq, common_factor);
if (!kbdev->backend_time.divisor) {
dev_warn(kbdev->dev, "CPU to GPU divisor is zero!");
err = -EINVAL;
goto disable_registers;
}
/* Force computation of the GPU timestamp offset */
kbase_backend_invalidate_gpu_timestamp_offset(kbdev);
if (kbase_timeout_scaling_init(kbdev)) {
dev_warn(kbdev->dev, "Could not initialize timeout scaling");
err = -EINVAL;
}
disable_registers:
kbase_pm_register_access_disable(kbdev);
return err;
}
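
kbase_backend_time_init() reduces NSEC_PER_SEC and the timer frequency by their greatest common divisor to obtain the multiplier/divisor pair used for tick/nanosecond conversion. The sketch below reproduces that reduction in userspace. All names in it (struct ts_scale, ts_scale_init(), ticks_to_ns(), ns_to_ticks(), gcd_u64()) are hypothetical, and it assumes, as the driver code appears to, that the GPU timestamp ticks at the system timer frequency.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Hypothetical container for the reduced conversion ratio. */
struct ts_scale {
	uint64_t multiplier; /* NSEC_PER_SEC / gcd */
	uint64_t divisor; /* timer frequency / gcd */
};

/* Euclid's algorithm; the kernel provides gcd() in <linux/gcd.h>. */
static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b != 0) {
		uint64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Returns 0 on success, -1 if the frequency is zero. */
static int ts_scale_init(struct ts_scale *s, uint64_t freq_hz)
{
	uint64_t common;

	if (freq_hz == 0)
		return -1;

	common = gcd_u64(SKETCH_NSEC_PER_SEC, freq_hz);
	s->multiplier = SKETCH_NSEC_PER_SEC / common;
	s->divisor = freq_hz / common;
	return 0;
}

/* ticks -> ns, as in kbase_backend_time_convert_gpu_to_cpu() */
static uint64_t ticks_to_ns(const struct ts_scale *s, uint64_t ticks)
{
	return ticks * s->multiplier / s->divisor;
}

/* ns -> ticks, the inverse used when computing the timestamp offset */
static uint64_t ns_to_ticks(const struct ts_scale *s, uint64_t ns)
{
	return ns * s->divisor / s->multiplier;
}
```

Reducing the ratio first keeps the intermediate product small: for a 24 MHz timer the pair becomes 125/3 rather than 1000000000/24000000, which delays overflow of the 64-bit multiplication.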

@@ -0,0 +1,251 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2017-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/* Kernel-side tests may include mali_kbase's headers. Therefore any config
* options which affect the sizes of any structs (e.g. adding extra members)
* must be included in these defaults, so that the structs are consistent in
* both mali_kbase and the test modules. */
bob_defaults {
name: "mali_kbase_shared_config_defaults",
defaults: [
"kernel_defaults",
],
mali_no_mali: {
kbuild_options: [
"CONFIG_MALI_BIFROST_NO_MALI=y",
"CONFIG_MALI_VALHALL_NO_MALI_DEFAULT_GPU={{.gpu}}",
"CONFIG_GPU_HWVER={{.hwver}}",
],
},
gpu_has_csf: {
kbuild_options: ["CONFIG_MALI_VALHALL_CSF_SUPPORT=y"],
},
mali_devfreq: {
kbuild_options: ["CONFIG_MALI_BIFROST_DEVFREQ=y"],
},
mali_midgard_dvfs: {
kbuild_options: ["CONFIG_MALI_BIFROST_DVFS=y"],
},
mali_gator_support: {
kbuild_options: ["CONFIG_MALI_BIFROST_GATOR_SUPPORT=y"],
},
mali_midgard_enable_trace: {
kbuild_options: ["CONFIG_MALI_BIFROST_ENABLE_TRACE=y"],
},
mali_dma_buf_map_on_demand: {
kbuild_options: ["CONFIG_MALI_VALHALL_DMA_BUF_MAP_ON_DEMAND=y"],
},
mali_dma_buf_legacy_compat: {
kbuild_options: ["CONFIG_MALI_VALHALL_DMA_BUF_LEGACY_COMPAT=y"],
},
large_page_support: {
kbuild_options: ["CONFIG_VALHALL_LARGE_PAGE_SUPPORT=y"],
},
mali_corestack: {
kbuild_options: ["CONFIG_MALI_VALHALL_CORESTACK=y"],
},
mali_real_hw: {
kbuild_options: ["CONFIG_MALI_VALHALL_REAL_HW=y"],
},
mali_debug: {
kbuild_options: [
"CONFIG_MALI_BIFROST_DEBUG=y",
"MALI_KERNEL_TEST_API={{.debug}}",
],
},
mali_fence_debug: {
kbuild_options: ["CONFIG_MALI_BIFROST_FENCE_DEBUG=y"],
},
mali_system_trace: {
kbuild_options: ["CONFIG_MALI_BIFROST_SYSTEM_TRACE=y"],
},
cinstr_vector_dump: {
kbuild_options: ["CONFIG_MALI_VECTOR_DUMP=y"],
},
cinstr_primary_hwc: {
kbuild_options: ["CONFIG_MALI_VALHALL_PRFCNT_SET_PRIMARY=y"],
},
cinstr_secondary_hwc: {
kbuild_options: ["CONFIG_MALI_BIFROST_PRFCNT_SET_SECONDARY=y"],
},
cinstr_tertiary_hwc: {
kbuild_options: ["CONFIG_MALI_VALHALL_PRFCNT_SET_TERTIARY=y"],
},
cinstr_hwc_set_select_via_debug_fs: {
kbuild_options: ["CONFIG_MALI_VALHALL_PRFCNT_SET_SELECT_VIA_DEBUG_FS=y"],
},
mali_job_dump: {
kbuild_options: ["CONFIG_MALI_VALHALL_JOB_DUMP"],
},
mali_hw_errata_1485982_not_affected: {
kbuild_options: ["CONFIG_MALI_VALHALL_HW_ERRATA_1485982_NOT_AFFECTED=y"],
},
mali_hw_errata_1485982_use_clock_alternative: {
kbuild_options: ["CONFIG_MALI_VALHALL_HW_ERRATA_1485982_USE_CLOCK_ALTERNATIVE=y"],
},
mali_coresight: {
kbuild_options: ["CONFIG_MALI_VALHALL_CORESIGHT=y"],
},
mali_fw_trace_mode_manual: {
kbuild_options: ["CONFIG_MALI_FW_TRACE_MODE_MANUAL=y"],
},
mali_fw_trace_mode_auto_print: {
kbuild_options: ["CONFIG_MALI_FW_TRACE_MODE_AUTO_PRINT=y"],
},
mali_fw_trace_mode_auto_discard: {
kbuild_options: ["CONFIG_MALI_FW_TRACE_MODE_AUTO_DISCARD=y"],
},
kbuild_options: [
"CONFIG_MALI_VALHALL_PLATFORM_NAME={{.mali_platform_name}}",
"MALI_CUSTOMER_RELEASE={{.release}}",
"MALI_UNIT_TEST={{.unit_test_code}}",
"MALI_USE_CSF={{.gpu_has_csf}}",
"MALI_JIT_PRESSURE_LIMIT_BASE={{.jit_pressure_limit_base}}",
// Start of CS experimental features definitions.
// If there is nothing below, definition should be added as follows:
// "MALI_EXPERIMENTAL_FEATURE={{.experimental_feature}}"
// experimental_feature above comes from Mconfig in
// <ddk_root>/product/base/
// However, in Mconfig, experimental_feature should be looked up (for
// similar explanation to this one) as ALLCAPS, i.e.
// EXPERIMENTAL_FEATURE.
//
// IMPORTANT: MALI_CS_EXPERIMENTAL should NEVER be defined below as it
// is an umbrella feature that would be open for inappropriate use
// (catch-all for experimental CS code without separating it into
// different features).
"MALI_BASE_CSF_PERFORMANCE_TESTS={{.base_csf_performance_tests}}",
],
}
bob_kernel_module {
name: "mali_kbase",
defaults: [
"mali_kbase_shared_config_defaults",
],
srcs: [
"*.c",
"*.h",
"Kbuild",
"arbiter/*.c",
"arbiter/*.h",
"arbiter/Kbuild",
"backend/gpu/*.c",
"backend/gpu/*.h",
"backend/gpu/Kbuild",
"context/*.c",
"context/*.h",
"context/Kbuild",
"hwcnt/*.c",
"hwcnt/*.h",
"hwcnt/backend/*.h",
"hwcnt/Kbuild",
"ipa/*.c",
"ipa/*.h",
"ipa/Kbuild",
"platform/*.h",
"platform/*/*.c",
"platform/*/*.h",
"platform/*/Kbuild",
"platform/*/*/*.c",
"platform/*/*/*.h",
"platform/*/*/Kbuild",
"platform/*/*/*/*.c",
"platform/*/*/*/*.h",
"platform/*/*/*/Kbuild",
"thirdparty/*.c",
"thirdparty/*.h",
"thirdparty/Kbuild",
"debug/*.c",
"debug/*.h",
"debug/Kbuild",
"device/*.c",
"device/*.h",
"device/Kbuild",
"gpu/*.c",
"gpu/*.h",
"gpu/Kbuild",
"hw_access/*.c",
"hw_access/*.h",
"hw_access/*/*.c",
"hw_access/*/*.h",
"hw_access/Kbuild",
"tl/*.c",
"tl/*.h",
"tl/Kbuild",
"mmu/*.c",
"mmu/*.h",
"mmu/Kbuild",
],
gpu_has_job_manager: {
srcs: [
"context/backend/*_jm.c",
"debug/backend/*_jm.c",
"debug/backend/*_jm.h",
"device/backend/*_jm.c",
"gpu/backend/*_jm.c",
"gpu/backend/*_jm.h",
"hwcnt/backend/*_jm.c",
"hwcnt/backend/*_jm.h",
"hwcnt/backend/*_jm_*.c",
"hwcnt/backend/*_jm_*.h",
"jm/*.h",
"tl/backend/*_jm.c",
"mmu/backend/*_jm.c",
"mmu/backend/*_jm.h",
"ipa/backend/*_jm.c",
"ipa/backend/*_jm.h",
],
},
gpu_has_csf: {
srcs: [
"context/backend/*_csf.c",
"csf/*.c",
"csf/*.h",
"csf/Kbuild",
"csf/ipa_control/*.c",
"csf/ipa_control/*.h",
"csf/ipa_control/Kbuild",
"debug/backend/*_csf.c",
"debug/backend/*_csf.h",
"device/backend/*_csf.c",
"gpu/backend/*_csf.c",
"gpu/backend/*_csf.h",
"hwcnt/backend/*_csf.c",
"hwcnt/backend/*_csf.h",
"hwcnt/backend/*_csf_*.c",
"hwcnt/backend/*_csf_*.h",
"tl/backend/*_csf.c",
"mmu/backend/*_csf.c",
"mmu/backend/*_csf.h",
"ipa/backend/*_csf.c",
"ipa/backend/*_csf.h",
],
},
kbuild_options: [
"CONFIG_MALI_BIFROST=m",
"CONFIG_MALI_VALHALL_KUTF=n",
],
}

@@ -0,0 +1,27 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2012-2013, 2016-2017, 2020-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
valhall_kbase-y += context/mali_kbase_context.o
ifeq ($(CONFIG_MALI_VALHALL_CSF_SUPPORT),y)
valhall_kbase-y += context/backend/mali_kbase_context_csf.o
else
valhall_kbase-y += context/backend/mali_kbase_context_jm.o
endif

@@ -0,0 +1,215 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel context APIs for CSF GPUs
*/
#include <context/mali_kbase_context_internal.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase.h>
#include <mali_kbase_mem_linux.h>
#include <mali_kbase_mem_pool_group.h>
#include <mmu/mali_kbase_mmu.h>
#include <tl/mali_kbase_timeline.h>
#include <mali_kbase_ctx_sched.h>
#if IS_ENABLED(CONFIG_DEBUG_FS)
#include <csf/mali_kbase_csf_csg_debugfs.h>
#include <csf/mali_kbase_csf_kcpu_debugfs.h>
#include <csf/mali_kbase_csf_sync_debugfs.h>
#include <csf/mali_kbase_csf_tiler_heap_debugfs.h>
#include <csf/mali_kbase_csf_cpu_queue_debugfs.h>
#include <mali_kbase_debug_mem_view.h>
#include <mali_kbase_debug_mem_zones.h>
#include <mali_kbase_debug_mem_allocs.h>
#include <mali_kbase_mem_pool_debugfs.h>
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
kbase_debug_mem_view_init(kctx);
kbase_debug_mem_zones_init(kctx);
kbase_debug_mem_allocs_init(kctx);
kbase_mem_pool_debugfs_init(kctx->kctx_dentry, kctx);
kbase_jit_debugfs_init(kctx);
kbase_csf_queue_group_debugfs_init(kctx);
kbase_csf_kcpu_debugfs_init(kctx);
kbase_csf_sync_debugfs_init(kctx);
kbase_csf_tiler_heap_debugfs_init(kctx);
kbase_csf_tiler_heap_total_debugfs_init(kctx);
kbase_csf_cpu_queue_debugfs_init(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
debugfs_remove_recursive(kctx->kctx_dentry);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#else
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#endif /* CONFIG_DEBUG_FS */
static void kbase_context_free(struct kbase_context *kctx)
{
kbase_timeline_post_kbase_context_destroy(kctx);
vfree(kctx);
}
static const struct kbase_context_init context_init[] = {
{ NULL, kbase_context_free, NULL },
{ kbase_context_common_init, kbase_context_common_term,
"Common context initialization failed" },
{ kbase_context_mem_pool_group_init, kbase_context_mem_pool_group_term,
"Memory pool group initialization failed" },
{ kbase_mem_evictable_init, kbase_mem_evictable_deinit,
"Memory evictable initialization failed" },
{ kbase_ctx_sched_init_ctx, NULL, NULL },
{ kbase_context_mmu_init, kbase_context_mmu_term, "MMU initialization failed" },
{ kbase_context_mem_alloc_page, kbase_context_mem_pool_free, "Memory alloc page failed" },
{ kbase_region_tracker_init, kbase_region_tracker_term,
"Region tracker initialization failed" },
{ kbase_sticky_resource_init, kbase_context_sticky_resource_term,
"Sticky resource initialization failed" },
{ kbase_jit_init, kbase_jit_term, "JIT initialization failed" },
{ kbase_csf_ctx_init, kbase_csf_ctx_term, "CSF context initialization failed" },
{ kbase_context_add_to_dev_list, kbase_context_remove_from_dev_list,
"Adding kctx to device failed" },
};
static void kbase_context_term_partial(struct kbase_context *kctx, unsigned int i)
{
while (i-- > 0) {
if (context_init[i].term)
context_init[i].term(kctx);
}
}
struct kbase_context *kbase_create_context(struct kbase_device *kbdev, bool is_compat,
base_context_create_flags const flags,
unsigned long const api_version, struct file *const filp)
{
struct kbase_context *kctx;
unsigned int i = 0;
if (WARN_ON(!kbdev))
return NULL;
/* Validate flags */
if (WARN_ON(flags != (flags & BASEP_CONTEXT_CREATE_KERNEL_FLAGS)))
return NULL;
/* Zero-initialized because a lot of code assumes the context is zeroed on creation. */
kctx = vzalloc(sizeof(*kctx));
if (WARN_ON(!kctx))
return NULL;
kctx->kbdev = kbdev;
kctx->api_version = api_version;
kctx->filp = filp;
kctx->create_flags = flags;
memcpy(kctx->comm, current->comm, sizeof(current->comm));
if (is_compat)
kbase_ctx_flag_set(kctx, KCTX_COMPAT);
#if defined(CONFIG_64BIT)
else
kbase_ctx_flag_set(kctx, KCTX_FORCE_SAME_VA);
#endif /* defined(CONFIG_64BIT) */
for (i = 0; i < ARRAY_SIZE(context_init); i++) {
int err = 0;
if (context_init[i].init)
err = context_init[i].init(kctx);
if (err) {
dev_err(kbdev->dev, "%s error = %d\n", context_init[i].err_mes, err);
/* kctx should be freed by kbase_context_free().
* Otherwise it will result in memory leak.
*/
WARN_ON(i == 0);
kbase_context_term_partial(kctx, i);
return NULL;
}
}
return kctx;
}
KBASE_EXPORT_SYMBOL(kbase_create_context);
void kbase_destroy_context(struct kbase_context *kctx)
{
struct kbase_device *kbdev;
if (WARN_ON(!kctx))
return;
kbdev = kctx->kbdev;
if (WARN_ON(!kbdev))
return;
/* Context termination can happen while the system suspend of the GPU
* device is ongoing or has completed. Hangs have been observed on
* customer systems when context termination is not blocked until the
* GPU device has resumed.
*/
if (kbase_has_arbiter(kbdev))
atomic_inc(&kbdev->pm.gpu_users_waiting);
while (kbase_pm_context_active_handle_suspend(kbdev,
KBASE_PM_SUSPEND_HANDLER_DONT_INCREASE)) {
dev_dbg(kbdev->dev, "Suspend in progress when destroying context");
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_suspending(kbdev));
}
if (kbase_has_arbiter(kbdev))
atomic_dec(&kbdev->pm.gpu_users_waiting);
/* Have synchronized against the System suspend and incremented the
* pm.active_count. So any subsequent invocation of System suspend
* callback would get blocked.
* If System suspend callback was already in progress then the above loop
* would have waited till the System resume callback has begun.
* So wait for the System resume callback to also complete as we want to
* avoid context termination during System resume also.
*/
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_resuming(kbdev));
kbase_mem_pool_group_mark_dying(&kctx->mem_pools);
kbase_context_term_partial(kctx, ARRAY_SIZE(context_init));
kbase_pm_context_idle(kbdev);
}
KBASE_EXPORT_SYMBOL(kbase_destroy_context);
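
Note on the context_init[] table above: each entry pairs an init callback with a term callback, and kbase_context_term_partial() walks the table backwards from the failing index so that only the steps that already succeeded are undone. The following is a minimal userspace sketch of that pattern, not driver code. The names demo_ctx, demo_step, demo_create and the three steps A, B and C are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct kbase_context. */
struct demo_ctx {
	int log[8];     /* +n records init of step n, -n records its term */
	size_t log_len;
	int fail_c;     /* non-zero makes step C fail */
};

/* Same shape as the init/term/error-message triples in context_init[]. */
struct demo_step {
	int (*init)(struct demo_ctx *ctx);
	void (*term)(struct demo_ctx *ctx);
	const char *err_mes;
};

static void demo_log(struct demo_ctx *ctx, int v)
{
	assert(ctx->log_len < sizeof(ctx->log) / sizeof(ctx->log[0]));
	ctx->log[ctx->log_len++] = v;
}

static int init_a(struct demo_ctx *ctx) { demo_log(ctx, 1); return 0; }
static void term_a(struct demo_ctx *ctx) { demo_log(ctx, -1); }
static int init_b(struct demo_ctx *ctx) { demo_log(ctx, 2); return 0; }
static void term_b(struct demo_ctx *ctx) { demo_log(ctx, -2); }
static void term_c(struct demo_ctx *ctx) { demo_log(ctx, -3); }

static int init_c(struct demo_ctx *ctx)
{
	if (ctx->fail_c)
		return -1;
	demo_log(ctx, 3);
	return 0;
}

static const struct demo_step demo_steps[] = {
	{ init_a, term_a, "Step A failed" },
	{ init_b, term_b, "Step B failed" },
	{ init_c, term_c, "Step C failed" },
};

#define DEMO_NSTEPS (sizeof(demo_steps) / sizeof(demo_steps[0]))

/* Undo steps [0, i) in reverse order, skipping entries without a term. */
static void demo_term_partial(struct demo_ctx *ctx, size_t i)
{
	while (i-- > 0) {
		if (demo_steps[i].term)
			demo_steps[i].term(ctx);
	}
}

/* Run every init in order; on the first failure, roll back and stop. */
static int demo_create(struct demo_ctx *ctx)
{
	size_t i;

	for (i = 0; i < DEMO_NSTEPS; i++) {
		int err = 0;

		if (demo_steps[i].init)
			err = demo_steps[i].init(ctx);
		if (err) {
			fprintf(stderr, "%s error = %d\n", demo_steps[i].err_mes, err);
			demo_term_partial(ctx, i);
			return err;
		}
	}
	return 0;
}
```

With fail_c set, the sketch runs A and B, fails at C, and then terminates B followed by A. A full teardown corresponds to calling demo_term_partial(ctx, DEMO_NSTEPS), which is how kbase_destroy_context() uses ARRAY_SIZE(context_init).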

@@ -0,0 +1,266 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel context APIs for Job Manager GPUs
*/
#include <context/mali_kbase_context_internal.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase.h>
#include <mali_kbase_ctx_sched.h>
#include <mali_kbase_kinstr_jm.h>
#include <mali_kbase_mem_linux.h>
#include <mali_kbase_mem_pool_group.h>
#include <mmu/mali_kbase_mmu.h>
#include <tl/mali_kbase_timeline.h>
#if IS_ENABLED(CONFIG_DEBUG_FS)
#include <mali_kbase_debug_mem_view.h>
#include <mali_kbase_debug_mem_zones.h>
#include <mali_kbase_debug_mem_allocs.h>
#include <mali_kbase_mem_pool_debugfs.h>
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
kbase_debug_mem_view_init(kctx);
kbase_debug_mem_zones_init(kctx);
kbase_debug_mem_allocs_init(kctx);
kbase_mem_pool_debugfs_init(kctx->kctx_dentry, kctx);
kbase_jit_debugfs_init(kctx);
kbasep_jd_debugfs_ctx_init(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
debugfs_remove_recursive(kctx->kctx_dentry);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#else
void kbase_context_debugfs_init(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);
void kbase_context_debugfs_term(struct kbase_context *const kctx)
{
CSTD_UNUSED(kctx);
}
KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
#endif /* CONFIG_DEBUG_FS */
static int kbase_context_kbase_kinstr_jm_init(struct kbase_context *kctx)
{
return kbase_kinstr_jm_init(&kctx->kinstr_jm);
}
static void kbase_context_kbase_kinstr_jm_term(struct kbase_context *kctx)
{
kbase_kinstr_jm_term(kctx->kinstr_jm);
}
static int kbase_context_kbase_timer_setup(struct kbase_context *kctx)
{
kbase_timer_setup(&kctx->soft_job_timeout, kbasep_soft_job_timeout_worker);
return 0;
}
static int kbase_context_submit_check(struct kbase_context *kctx)
{
struct kbasep_js_kctx_info *js_kctx_info = &kctx->jctx.sched_info;
unsigned long irq_flags = 0;
base_context_create_flags const flags = kctx->create_flags;
mutex_lock(&js_kctx_info->ctx.jsctx_mutex);
spin_lock_irqsave(&kctx->kbdev->hwaccess_lock, irq_flags);
/* Translate the flags */
if ((flags & BASE_CONTEXT_SYSTEM_MONITOR_SUBMIT_DISABLED) == 0)
kbase_ctx_flag_clear(kctx, KCTX_SUBMIT_DISABLED);
spin_unlock_irqrestore(&kctx->kbdev->hwaccess_lock, irq_flags);
mutex_unlock(&js_kctx_info->ctx.jsctx_mutex);
return 0;
}
static void kbase_context_flush_jobs(struct kbase_context *kctx)
{
kbase_jd_zap_context(kctx);
flush_workqueue(kctx->jctx.job_done_wq);
}
/**
* kbase_context_free - Free kcontext at its destruction
*
* @kctx: kcontext to be freed
*/
static void kbase_context_free(struct kbase_context *kctx)
{
kbase_timeline_post_kbase_context_destroy(kctx);
vfree(kctx);
}
static const struct kbase_context_init context_init[] = {
{ NULL, kbase_context_free, NULL },
{ kbase_context_common_init, kbase_context_common_term,
"Common context initialization failed" },
{ kbase_context_mem_pool_group_init, kbase_context_mem_pool_group_term,
"Memory pool group initialization failed" },
{ kbase_mem_evictable_init, kbase_mem_evictable_deinit,
"Memory evictable initialization failed" },
{ kbase_ctx_sched_init_ctx, NULL, NULL },
{ kbase_context_mmu_init, kbase_context_mmu_term, "MMU initialization failed" },
{ kbase_context_mem_alloc_page, kbase_context_mem_pool_free, "Memory alloc page failed" },
{ kbase_region_tracker_init, kbase_region_tracker_term,
"Region tracker initialization failed" },
{ kbase_sticky_resource_init, kbase_context_sticky_resource_term,
"Sticky resource initialization failed" },
{ kbase_jit_init, kbase_jit_term, "JIT initialization failed" },
{ kbase_context_kbase_kinstr_jm_init, kbase_context_kbase_kinstr_jm_term,
"JM instrumentation initialization failed" },
{ kbase_context_kbase_timer_setup, NULL, "Timers initialization failed" },
{ kbase_event_init, kbase_event_cleanup, "Event initialization failed" },
{ kbasep_js_kctx_init, kbasep_js_kctx_term, "JS kctx initialization failed" },
{ kbase_jd_init, kbase_jd_exit, "JD initialization failed" },
{ kbase_context_submit_check, NULL, "Enabling job submission failed" },
#if IS_ENABLED(CONFIG_DEBUG_FS)
{ kbase_debug_job_fault_context_init, kbase_debug_job_fault_context_term,
"Job fault context initialization failed" },
#endif
{ kbasep_platform_context_init, kbasep_platform_context_term,
"Platform callback for kctx initialization failed" },
{ NULL, kbase_context_flush_jobs, NULL },
{ kbase_context_add_to_dev_list, kbase_context_remove_from_dev_list,
"Adding kctx to device failed" },
};
static void kbase_context_term_partial(struct kbase_context *kctx, unsigned int i)
{
while (i-- > 0) {
if (context_init[i].term)
context_init[i].term(kctx);
}
}
struct kbase_context *kbase_create_context(struct kbase_device *kbdev, bool is_compat,
base_context_create_flags const flags,
unsigned long const api_version, struct file *const filp)
{
struct kbase_context *kctx;
unsigned int i = 0;
if (WARN_ON(!kbdev))
return NULL;
/* Validate flags */
if (WARN_ON(flags != (flags & BASEP_CONTEXT_CREATE_KERNEL_FLAGS)))
return NULL;
/* Zero-initialized because a lot of code assumes the context is zeroed on creation. */
kctx = vzalloc(sizeof(*kctx));
if (WARN_ON(!kctx))
return NULL;
kctx->kbdev = kbdev;
kctx->api_version = api_version;
kctx->filp = filp;
kctx->create_flags = flags;
if (is_compat)
kbase_ctx_flag_set(kctx, KCTX_COMPAT);
#if defined(CONFIG_64BIT)
else
kbase_ctx_flag_set(kctx, KCTX_FORCE_SAME_VA);
#endif /* defined(CONFIG_64BIT) */
for (i = 0; i < ARRAY_SIZE(context_init); i++) {
int err = 0;
if (context_init[i].init)
err = context_init[i].init(kctx);
if (err) {
dev_err(kbdev->dev, "%s error = %d\n", context_init[i].err_mes, err);
/* kctx should be freed by kbase_context_free().
* Otherwise it will result in memory leak.
*/
WARN_ON(i == 0);
kbase_context_term_partial(kctx, i);
return NULL;
}
}
return kctx;
}
KBASE_EXPORT_SYMBOL(kbase_create_context);
void kbase_destroy_context(struct kbase_context *kctx)
{
struct kbase_device *kbdev;
if (WARN_ON(!kctx))
return;
kbdev = kctx->kbdev;
if (WARN_ON(!kbdev))
return;
/* Context termination can happen while the system suspend of the GPU
* device is ongoing or has completed. Hangs have been observed on
* customer systems when context termination is not blocked until the
* GPU device has resumed.
*/
if (kbase_has_arbiter(kbdev))
atomic_inc(&kbdev->pm.gpu_users_waiting);
while (kbase_pm_context_active_handle_suspend(kbdev,
KBASE_PM_SUSPEND_HANDLER_DONT_INCREASE)) {
dev_dbg(kbdev->dev, "Suspend in progress when destroying context");
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_suspending(kbdev));
}
/* Have synchronized against the System suspend and incremented the
* pm.active_count. So any subsequent invocation of System suspend
* callback would get blocked.
* If System suspend callback was already in progress then the above loop
* would have waited till the System resume callback has begun.
* So wait for the System resume callback to also complete as we want to
* avoid context termination during System resume also.
*/
wait_event(kbdev->pm.resume_wait, !kbase_pm_is_resuming(kbdev));
if (kbase_has_arbiter(kbdev))
atomic_dec(&kbdev->pm.gpu_users_waiting);
kbase_mem_pool_group_mark_dying(&kctx->mem_pools);
kbase_context_term_partial(kctx, ARRAY_SIZE(context_init));
kbase_pm_context_idle(kbdev);
}
KBASE_EXPORT_SYMBOL(kbase_destroy_context);

@@ -0,0 +1,372 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2019-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
/*
* Base kernel context APIs
*/
#include <linux/version.h>
#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
#include <linux/sched/task.h>
#endif
#if KERNEL_VERSION(4, 19, 0) <= LINUX_VERSION_CODE
#include <linux/sched/signal.h>
#else
#include <linux/sched.h>
#endif
#include <mali_kbase.h>
#include <hw_access/mali_kbase_hw_access_regmap.h>
#include <mali_kbase_mem_linux.h>
#include <mali_kbase_ctx_sched.h>
#include <mali_kbase_mem_pool_group.h>
#include <tl/mali_kbase_timeline.h>
#include <mmu/mali_kbase_mmu.h>
#include <context/mali_kbase_context_internal.h>
/**
* find_process_node - Search the process rb_tree for an existing process.
*
* @node: Pointer to root node to start search.
* @tgid: Thread group ID to search for.
*
* Return: Pointer to the matching kbase_process, or NULL if none exists.
*/
static struct kbase_process *find_process_node(struct rb_node *node, pid_t tgid)
{
struct kbase_process *kprcs = NULL;
/* Check if the kctx creation request is from an existing process. */
while (node) {
struct kbase_process *prcs_node = rb_entry(node, struct kbase_process, kprcs_node);
if (prcs_node->tgid == tgid) {
kprcs = prcs_node;
break;
}
if (tgid < prcs_node->tgid)
node = node->rb_left;
else
node = node->rb_right;
}
return kprcs;
}
/**
* kbase_insert_kctx_to_process - Initialise kbase process context.
*
* @kctx: Pointer to kbase context.
*
* Adds the context to the per-process rb_tree managed by kbase_device.
* The tree holds one node for each unique process that has created a
* context, and each process node maintains a list of its kbase contexts.
* This setup is currently used by the kernel trace functionality
* to trace and visualise GPU memory consumption.
*
* Return: 0 on success and error number on failure.
*/
static int kbase_insert_kctx_to_process(struct kbase_context *kctx)
{
struct rb_root *const prcs_root = &kctx->kbdev->process_root;
const pid_t tgid = kctx->tgid;
struct kbase_process *kprcs = NULL;
lockdep_assert_held(&kctx->kbdev->kctx_list_lock);
kprcs = find_process_node(prcs_root->rb_node, tgid);
/* If the kctx is from a new process, create a new kbase_process
* and add it to &kbase_device->process_root.
*/
if (!kprcs) {
struct rb_node **new = &prcs_root->rb_node, *parent = NULL;
kprcs = kzalloc(sizeof(*kprcs), GFP_KERNEL);
if (kprcs == NULL)
return -ENOMEM;
kprcs->tgid = tgid;
INIT_LIST_HEAD(&kprcs->kctx_list);
kprcs->dma_buf_root = RB_ROOT;
kprcs->total_gpu_pages = 0;
while (*new) {
struct kbase_process *prcs_node;
parent = *new;
prcs_node = rb_entry(parent, struct kbase_process, kprcs_node);
if (tgid < prcs_node->tgid)
new = &(*new)->rb_left;
else
new = &(*new)->rb_right;
}
rb_link_node(&kprcs->kprcs_node, parent, new);
rb_insert_color(&kprcs->kprcs_node, prcs_root);
}
kctx->kprcs = kprcs;
list_add(&kctx->kprcs_link, &kprcs->kctx_list);
return 0;
}
int kbase_context_common_init(struct kbase_context *kctx)
{
const unsigned long cookies_mask = KBASE_COOKIE_MASK;
int err = 0;
/* creating a context is considered a disjoint event */
kbase_disjoint_event(kctx->kbdev);
kctx->tgid = task_tgid_vnr(current);
kctx->pid = task_pid_vnr(current);
/* Check if this is a Userspace created context */
if (likely(kctx->filp)) {
struct pid *pid_struct;
rcu_read_lock();
pid_struct = get_pid(task_tgid(current));
if (likely(pid_struct)) {
struct task_struct *task = pid_task(pid_struct, PIDTYPE_PID);
if (likely(task)) {
/* Take a reference on the task to avoid slow lookup
* later on from the page allocation loop.
*/
get_task_struct(task);
kctx->task = task;
} else {
dev_err(kctx->kbdev->dev, "Failed to get task pointer for %s/%d",
current->comm, kctx->pid);
err = -ESRCH;
}
put_pid(pid_struct);
} else {
dev_err(kctx->kbdev->dev, "Failed to get pid pointer for %s/%d",
current->comm, kctx->pid);
err = -ESRCH;
}
rcu_read_unlock();
if (unlikely(err))
return err;
kbase_mem_mmgrab();
kctx->process_mm = current->mm;
}
mutex_init(&kctx->reg_lock);
spin_lock_init(&kctx->mem_partials_lock);
INIT_LIST_HEAD(&kctx->mem_partials);
spin_lock_init(&kctx->waiting_soft_jobs_lock);
INIT_LIST_HEAD(&kctx->waiting_soft_jobs);
init_waitqueue_head(&kctx->event_queue);
kbase_gpu_vm_lock(kctx);
bitmap_copy(kctx->cookies, &cookies_mask, BITS_PER_LONG);
kbase_gpu_vm_unlock(kctx);
kctx->id = (u32)atomic_add_return(1, &(kctx->kbdev->ctx_num)) - 1;
mutex_lock(&kctx->kbdev->kctx_list_lock);
err = kbase_insert_kctx_to_process(kctx);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
if (err) {
dev_err(kctx->kbdev->dev, "(err:%d) failed to insert kctx to kbase_process", err);
if (likely(kctx->filp)) {
mmdrop(kctx->process_mm);
put_task_struct(kctx->task);
}
}
return err;
}
int kbase_context_add_to_dev_list(struct kbase_context *kctx)
{
if (WARN_ON(!kctx))
return -EINVAL;
if (WARN_ON(!kctx->kbdev))
return -EINVAL;
mutex_lock(&kctx->kbdev->kctx_list_lock);
list_add(&kctx->kctx_list_link, &kctx->kbdev->kctx_list);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
kbase_timeline_post_kbase_context_create(kctx);
return 0;
}
void kbase_context_remove_from_dev_list(struct kbase_context *kctx)
{
if (WARN_ON(!kctx))
return;
if (WARN_ON(!kctx->kbdev))
return;
kbase_timeline_pre_kbase_context_destroy(kctx);
mutex_lock(&kctx->kbdev->kctx_list_lock);
list_del_init(&kctx->kctx_list_link);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
}
/**
* kbase_remove_kctx_from_process - remove a terminating context from
* the process list.
*
* @kctx: Pointer to kbase context.
*
* Remove the context from the list of contexts maintained under its
* kbase process. If the list is then empty, the process has no outstanding
* contexts and the process node is removed as well.
*/
static void kbase_remove_kctx_from_process(struct kbase_context *kctx)
{
struct kbase_process *kprcs = kctx->kprcs;
lockdep_assert_held(&kctx->kbdev->kctx_list_lock);
list_del(&kctx->kprcs_link);
/* if there are no outstanding contexts in current process node,
* we can remove it from the process rb_tree.
*/
if (list_empty(&kprcs->kctx_list)) {
rb_erase(&kprcs->kprcs_node, &kctx->kbdev->process_root);
/* Check that the terminating process does not still hold
* any GPU memory.
*/
spin_lock(&kctx->kbdev->gpu_mem_usage_lock);
WARN_ON(kprcs->total_gpu_pages);
spin_unlock(&kctx->kbdev->gpu_mem_usage_lock);
WARN_ON(!RB_EMPTY_ROOT(&kprcs->dma_buf_root));
kfree(kprcs);
}
}
void kbase_context_common_term(struct kbase_context *kctx)
{
int pages;
pages = atomic_read(&kctx->used_pages);
if (pages != 0)
dev_warn(kctx->kbdev->dev, "%s: %d pages in use!\n", __func__, pages);
WARN_ON(atomic_read(&kctx->nonmapped_pages) != 0);
mutex_lock(&kctx->kbdev->kctx_list_lock);
kbase_remove_kctx_from_process(kctx);
mutex_unlock(&kctx->kbdev->kctx_list_lock);
if (likely(kctx->filp)) {
mmdrop(kctx->process_mm);
put_task_struct(kctx->task);
}
KBASE_KTRACE_ADD(kctx->kbdev, CORE_CTX_DESTROY, kctx, 0u);
}
int kbase_context_mem_pool_group_init(struct kbase_context *kctx)
{
return kbase_mem_pool_group_init(&kctx->mem_pools, kctx->kbdev,
&kctx->kbdev->mem_pool_defaults, &kctx->kbdev->mem_pools);
}
void kbase_context_mem_pool_group_term(struct kbase_context *kctx)
{
kbase_mem_pool_group_term(&kctx->mem_pools);
}
int kbase_context_mmu_init(struct kbase_context *kctx)
{
return kbase_mmu_init(kctx->kbdev, &kctx->mmu, kctx,
kbase_context_mmu_group_id_get(kctx->create_flags));
}
void kbase_context_mmu_term(struct kbase_context *kctx)
{
kbase_mmu_term(kctx->kbdev, &kctx->mmu);
}
int kbase_context_mem_alloc_page(struct kbase_context *kctx)
{
struct page *p;
p = kbase_mem_alloc_page(&kctx->mem_pools.small[KBASE_MEM_GROUP_SINK]);
if (!p)
return -ENOMEM;
kctx->aliasing_sink_page = as_tagged(page_to_phys(p));
return 0;
}
void kbase_context_mem_pool_free(struct kbase_context *kctx)
{
/* drop the aliasing sink page now that it can't be mapped anymore */
kbase_mem_pool_free(&kctx->mem_pools.small[KBASE_MEM_GROUP_SINK],
as_page(kctx->aliasing_sink_page), false);
}
void kbase_context_sticky_resource_term(struct kbase_context *kctx)
{
unsigned long pending_regions_to_clean;
kbase_gpu_vm_lock(kctx);
kbase_sticky_resource_term(kctx);
/* free pending region setups */
pending_regions_to_clean = KBASE_COOKIE_MASK;
bitmap_andnot(&pending_regions_to_clean, &pending_regions_to_clean, kctx->cookies,
BITS_PER_LONG);
while (pending_regions_to_clean) {
unsigned int cookie = find_first_bit(&pending_regions_to_clean, BITS_PER_LONG);
if (!WARN_ON(!kctx->pending_regions[cookie])) {
dev_dbg(kctx->kbdev->dev, "Freeing pending unmapped region\n");
kbase_mem_phy_alloc_put(kctx->pending_regions[cookie]->cpu_alloc);
kbase_mem_phy_alloc_put(kctx->pending_regions[cookie]->gpu_alloc);
kfree(kctx->pending_regions[cookie]);
kctx->pending_regions[cookie] = NULL;
}
bitmap_clear(&pending_regions_to_clean, cookie, 1);
}
kbase_gpu_vm_unlock(kctx);
}
bool kbase_ctx_compat_mode(struct kbase_context *kctx)
{
return !IS_ENABLED(CONFIG_64BIT) ||
(IS_ENABLED(CONFIG_64BIT) && kbase_ctx_flag(kctx, KCTX_COMPAT));
}
KBASE_EXPORT_TEST_API(kbase_ctx_compat_mode);
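
Note on the per-process tracking above: find_process_node() and kbase_insert_kctx_to_process() descend a tree keyed by tgid, reuse the node if the process is already known, and otherwise link a new node at the leaf position the descent ended on. The sketch below mirrors that find-or-insert logic in userspace C. It deliberately swaps the kernel rb_tree for a plain unbalanced binary search tree and replaces the per-process context list with a counter; all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kbase_process. */
struct demo_process {
	int tgid;                 /* key: thread group ID */
	unsigned int nr_ctx;      /* replaces the kctx_list of the driver */
	struct demo_process *left;
	struct demo_process *right;
};

/* Walk down from node; smaller keys are on the left, others on the right. */
static struct demo_process *demo_find_process(struct demo_process *node, int tgid)
{
	while (node) {
		if (node->tgid == tgid)
			return node;
		node = (tgid < node->tgid) ? node->left : node->right;
	}
	return NULL;
}

/*
 * Attach one context to the process identified by tgid, creating the
 * process node on first use. Returns NULL only if allocation fails.
 */
static struct demo_process *demo_attach_ctx(struct demo_process **root, int tgid)
{
	struct demo_process *proc;

	assert(root != NULL);
	proc = demo_find_process(*root, tgid);

	if (!proc) {
		struct demo_process **link = root;

		proc = calloc(1, sizeof(*proc));
		if (!proc)
			return NULL;
		proc->tgid = tgid;

		/* Find the empty link where the new node belongs. */
		while (*link)
			link = (tgid < (*link)->tgid) ? &(*link)->left : &(*link)->right;
		*link = proc;
	}

	proc->nr_ctx++;
	return proc;
}

/* Number of distinct processes currently tracked. */
static unsigned int demo_count_processes(const struct demo_process *node)
{
	if (!node)
		return 0;
	return 1 + demo_count_processes(node->left) + demo_count_processes(node->right);
}

static void demo_free_tree(struct demo_process *node)
{
	if (!node)
		return;
	demo_free_tree(node->left);
	demo_free_tree(node->right);
	free(node);
}
```

The driver additionally rebalances after linking (rb_insert_color) and holds kctx_list_lock around the whole operation; neither is modelled here.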

@@ -0,0 +1,133 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2011-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CONTEXT_H_
#define _KBASE_CONTEXT_H_
#include <linux/atomic.h>
/**
* kbase_context_debugfs_init - Initialize the kctx platform
* specific debugfs
*
* @kctx: kbase context
*
* This initializes some debugfs interfaces specific to the platform the source
* is compiled for.
*/
void kbase_context_debugfs_init(struct kbase_context *const kctx);
/**
* kbase_context_debugfs_term - Terminate the kctx platform
* specific debugfs
*
* @kctx: kbase context
*
* This terminates some debugfs interfaces specific to the platform the source
* is compiled for.
*/
void kbase_context_debugfs_term(struct kbase_context *const kctx);
/**
* kbase_create_context() - Create a kernel base context.
*
* @kbdev: Object representing an instance of GPU platform device,
* allocated from the probe method of the Mali driver.
* @is_compat: Force creation of a 32-bit context
* @flags: Flags to set, which shall be any combination of
* BASEP_CONTEXT_CREATE_KERNEL_FLAGS.
* @api_version: Application program interface version, as encoded in
* a single integer by the KBASE_API_VERSION macro.
* @filp: Pointer to the struct file corresponding to device file
* /dev/malixx instance, passed to the file's open method.
* Shall be passed as NULL for internally created contexts.
*
* Up to one context can be created for each client that opens the device file
* /dev/malixx. Context creation is deferred until a special ioctl() system call
* is made on the device file. Each context has its own GPU address space.
*
* Return: new kbase context or NULL on failure
*/
struct kbase_context *kbase_create_context(struct kbase_device *kbdev, bool is_compat,
base_context_create_flags const flags,
unsigned long api_version, struct file *filp);
/**
* kbase_destroy_context - Destroy a kernel base context.
* @kctx: Context to destroy
*
* Will release all outstanding regions.
*/
void kbase_destroy_context(struct kbase_context *kctx);
/**
* kbase_ctx_flag - Check if @flag is set on @kctx
* @kctx: Pointer to kbase context to check
* @flag: Flag to check
*
* Return: true if @flag is set on @kctx, false if not.
*/
static inline bool kbase_ctx_flag(struct kbase_context *kctx, enum kbase_context_flags flag)
{
return atomic_read(&kctx->flags) & (int)flag;
}
/**
* kbase_ctx_compat_mode - Indicate whether a kbase context needs to operate
* in compatibility mode for 32-bit userspace.
* @kctx: kbase context
*
* Return: true if the context must operate in compatibility mode, false otherwise.
*/
bool kbase_ctx_compat_mode(struct kbase_context *kctx);
/**
* kbase_ctx_flag_clear - Clear @flag on @kctx
* @kctx: Pointer to kbase context
* @flag: Flag to clear
*
* Clear the @flag on @kctx. This is done atomically, so other flags being
* cleared or set at the same time will be safe.
*
* Some flags have locking requirements; check the documentation for the
* respective flags.
*/
static inline void kbase_ctx_flag_clear(struct kbase_context *kctx, enum kbase_context_flags flag)
{
atomic_andnot(flag, &kctx->flags);
}
/**
* kbase_ctx_flag_set - Set @flag on @kctx
* @kctx: Pointer to kbase context
* @flag: Flag to set
*
* Set the @flag on @kctx. This is done atomically, so other flags being
* cleared or set at the same time will be safe.
*
* Some flags have locking requirements; check the documentation for the
* respective flags.
*/
static inline void kbase_ctx_flag_set(struct kbase_context *kctx, enum kbase_context_flags flag)
{
atomic_or(flag, &kctx->flags);
}
#endif /* _KBASE_CONTEXT_H_ */
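/*
 * Illustration only, not part of the driver: a minimal userspace sketch of
 * the atomic flag helpers above, assuming C11 <stdatomic.h> in place of the
 * kernel's atomic_read(), atomic_or() and atomic_andnot(). The names
 * demo_ctx, demo_ctx_flags and the flag values are hypothetical stand-ins
 * for struct kbase_context and enum kbase_context_flags.
 */

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real ones live in enum kbase_context_flags. */
enum demo_ctx_flags {
	DEMO_CTX_FLAG_COMPAT = 1 << 0,
	DEMO_CTX_FLAG_DYING = 1 << 1,
};

/* Hypothetical stand-in for struct kbase_context. */
struct demo_ctx {
	atomic_int flags;
};

/* Test a flag: a single atomic load followed by a mask. */
static inline bool demo_ctx_flag(struct demo_ctx *ctx, enum demo_ctx_flags flag)
{
	return (atomic_load(&ctx->flags) & (int)flag) != 0;
}

/* Set a flag with one atomic read-modify-write, leaving other bits alone. */
static inline void demo_ctx_flag_set(struct demo_ctx *ctx, enum demo_ctx_flags flag)
{
	atomic_fetch_or(&ctx->flags, (int)flag);
}

/* Clear a flag with one atomic read-modify-write (AND with the inverse). */
static inline void demo_ctx_flag_clear(struct demo_ctx *ctx, enum demo_ctx_flags flag)
{
	atomic_fetch_and(&ctx->flags, ~(int)flag);
}
```

/*
 * Because each update is a single read-modify-write, concurrent set and
 * clear operations on different bits cannot lose each other's changes.
 */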


@@ -0,0 +1,62 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2019-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
typedef int kbase_context_init_method(struct kbase_context *kctx);
typedef void kbase_context_term_method(struct kbase_context *kctx);
/**
* struct kbase_context_init - Context init/term methods.
* @init: Function pointer to an initialise method.
* @term: Function pointer to a terminate method.
* @err_mes: Error message to be printed when init method fails.
*/
struct kbase_context_init {
kbase_context_init_method *init;
kbase_context_term_method *term;
char *err_mes;
};
/**
* kbase_context_common_init() - Initialize kbase context
* @kctx: Pointer to the kbase context to be initialized.
*
* This function must be called only when a kbase context is instantiated.
*
* Return: 0 on success.
*/
int kbase_context_common_init(struct kbase_context *kctx);
void kbase_context_common_term(struct kbase_context *kctx);
int kbase_context_mem_pool_group_init(struct kbase_context *kctx);
void kbase_context_mem_pool_group_term(struct kbase_context *kctx);
int kbase_context_mmu_init(struct kbase_context *kctx);
void kbase_context_mmu_term(struct kbase_context *kctx);
int kbase_context_mem_alloc_page(struct kbase_context *kctx);
void kbase_context_mem_pool_free(struct kbase_context *kctx);
void kbase_context_sticky_resource_term(struct kbase_context *kctx);
int kbase_context_add_to_dev_list(struct kbase_context *kctx);
void kbase_context_remove_from_dev_list(struct kbase_context *kctx);
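/*
 * Illustration only, not part of the driver: a minimal userspace sketch of
 * how a table of init/term pairs like struct kbase_context_init might be
 * consumed. It assumes the steps run in table order and that a failure
 * unwinds the already completed steps in reverse. All demo_* names are
 * hypothetical.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct kbase_context. */
struct demo_ctx {
	int steps_done;
};

typedef int demo_init_method(struct demo_ctx *ctx);
typedef void demo_term_method(struct demo_ctx *ctx);

/* Same shape as struct kbase_context_init. */
struct demo_init {
	demo_init_method *init;
	demo_term_method *term;
	const char *err_mes;
};

/* Example steps used to exercise the table runner. */
static int demo_step_ok(struct demo_ctx *ctx)
{
	ctx->steps_done++;
	return 0;
}

static int demo_step_fail(struct demo_ctx *ctx)
{
	(void)ctx;
	return -1;
}

static void demo_step_term(struct demo_ctx *ctx)
{
	ctx->steps_done--;
}

/* Undo the first n steps of the table, last one first. */
static void demo_term_partial(struct demo_ctx *ctx, const struct demo_init *tbl, size_t n)
{
	while (n-- > 0)
		if (tbl[n].term)
			tbl[n].term(ctx);
}

/* Run every init in order; on failure, report it and unwind. */
static int demo_run_init(struct demo_ctx *ctx, const struct demo_init *tbl, size_t count)
{
	size_t i;

	assert(tbl != NULL);
	for (i = 0; i < count; i++) {
		int err = tbl[i].init ? tbl[i].init(ctx) : 0;

		if (err) {
			fprintf(stderr, "%s: error %d\n", tbl[i].err_mes, err);
			demo_term_partial(ctx, tbl, i);
			return err;
		}
	}
	return 0;
}
```

/*
 * Keeping each init next to its term in one table means the teardown order
 * is derived from the table and cannot drift from the setup order.
 */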


@@ -0,0 +1,66 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2018-2024 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
valhall_kbase-y += \
csf/mali_kbase_csf_util.o \
csf/mali_kbase_csf_firmware_cfg.o \
csf/mali_kbase_csf_trace_buffer.o \
csf/mali_kbase_csf.o \
csf/mali_kbase_csf_scheduler.o \
csf/mali_kbase_csf_kcpu.o \
csf/mali_kbase_csf_tiler_heap.o \
csf/mali_kbase_csf_timeout.o \
csf/mali_kbase_csf_tl_reader.o \
csf/mali_kbase_csf_heap_context_alloc.o \
csf/mali_kbase_csf_reset_gpu.o \
csf/mali_kbase_csf_csg.o \
csf/mali_kbase_csf_csg_debugfs.o \
csf/mali_kbase_csf_kcpu_debugfs.o \
csf/mali_kbase_csf_sync.o \
csf/mali_kbase_csf_sync_debugfs.o \
csf/mali_kbase_csf_kcpu_fence_debugfs.o \
csf/mali_kbase_csf_protected_memory.o \
csf/mali_kbase_csf_tiler_heap_debugfs.o \
csf/mali_kbase_csf_cpu_queue.o \
csf/mali_kbase_csf_cpu_queue_debugfs.o \
csf/mali_kbase_csf_event.o \
csf/mali_kbase_csf_firmware_log.o \
csf/mali_kbase_csf_firmware_core_dump.o \
csf/mali_kbase_csf_tiler_heap_reclaim.o \
csf/mali_kbase_csf_mcu_shared_reg.o \
csf/mali_kbase_csf_ne_debugfs.o
ifeq ($(CONFIG_MALI_VALHALL_NO_MALI),y)
valhall_kbase-y += csf/mali_kbase_csf_firmware_no_mali.o
valhall_kbase-y += csf/mali_kbase_csf_fw_io_no_mali.o
else
valhall_kbase-y += csf/mali_kbase_csf_firmware.o
valhall_kbase-y += csf/mali_kbase_csf_fw_io.o
endif
valhall_kbase-$(CONFIG_DEBUG_FS) += csf/mali_kbase_debug_csf_fault.o
ifeq ($(KBUILD_EXTMOD),)
# in-tree
-include $(src)/csf/ipa_control/Kbuild
else
# out-of-tree
include $(src)/csf/ipa_control/Kbuild
endif


@@ -0,0 +1,22 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
#
# This program is free software and is provided to you under the terms of the
# GNU General Public License version 2 as published by the Free Software
# Foundation, and any use by you of this program is subject to the terms
# of such GNU license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
#
valhall_kbase-y += \
csf/ipa_control/mali_kbase_csf_ipa_control.o


@@ -0,0 +1,995 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2020-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include <mali_kbase.h>
#include <mali_kbase_config_defaults.h>
#include "backend/gpu/mali_kbase_clk_rate_trace_mgr.h"
#include "mali_kbase_csf_ipa_control.h"
#include <mali_kbase_io.h>
/*
* Status flags from the STATUS register of the IPA Control interface.
*/
#define STATUS_COMMAND_ACTIVE ((u32)1 << 0)
#define STATUS_PROTECTED_MODE ((u32)1 << 8)
#define STATUS_RESET ((u32)1 << 9)
#define STATUS_TIMER_ENABLED ((u32)1 << 31)
/*
* Commands for the COMMAND register of the IPA Control interface.
*/
#define COMMAND_APPLY ((u32)1)
#define COMMAND_SAMPLE ((u32)3)
#define COMMAND_PROTECTED_ACK ((u32)4)
#define COMMAND_RESET_ACK ((u32)5)
/*
* Number of timer events per second.
*/
#define TIMER_EVENTS_PER_SECOND ((u32)1000 / IPA_CONTROL_TIMER_DEFAULT_VALUE_MS)
/*
* Number of bits used to configure a performance counter in SELECT registers.
*/
#define IPA_CONTROL_SELECT_BITS_PER_CNT ((u64)8)
/*
* Maximum value of a performance counter.
*/
#define MAX_PRFCNT_VALUE (((u64)1 << 48) - 1)
/**
* struct kbase_ipa_control_listener_data - Data for the GPU clock frequency
* listener
*
* @listener: GPU clock frequency listener.
* @kbdev: Pointer to kbase device.
* @clk_chg_wq: Dedicated workqueue to process the work item corresponding to
* a clock rate notification.
* @clk_chg_work: Work item to process the clock rate change.
* @rate: The latest notified rate change, in units of Hz.
*/
struct kbase_ipa_control_listener_data {
struct kbase_clk_rate_listener listener;
struct kbase_device *kbdev;
struct workqueue_struct *clk_chg_wq;
struct work_struct clk_chg_work;
atomic_t rate;
};
static u32 timer_value(u32 gpu_rate)
{
return gpu_rate / TIMER_EVENTS_PER_SECOND;
}
static int wait_status(struct kbase_device *kbdev, u32 flags)
{
u32 val;
const u32 timeout_us = kbase_get_timeout_ms(kbdev, IPA_INACTIVE_TIMEOUT) * USEC_PER_MSEC;
/*
* Wait for the STATUS register to indicate that flags have been
* cleared, in case a transition is pending.
*/
const int err = kbase_reg_poll32_timeout(kbdev, IPA_CONTROL_ENUM(STATUS), val,
!(val & flags), 0, timeout_us, false);
if (err) {
dev_err(kbdev->dev, "IPA_CONTROL STATUS register stuck");
return -EBUSY;
}
return 0;
}
static int apply_select_config(struct kbase_device *kbdev, u64 *select)
{
int ret;
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_CSHW), select[KBASE_IPA_CORE_TYPE_CSHW]);
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_MEMSYS),
select[KBASE_IPA_CORE_TYPE_MEMSYS]);
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_TILER), select[KBASE_IPA_CORE_TYPE_TILER]);
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_SHADER),
select[KBASE_IPA_CORE_TYPE_SHADER]);
if (kbase_csf_dev_has_ne(kbdev))
kbase_reg_write64(kbdev, GOV_IPA_CONTROL_ENUM(SELECT_NEURAL),
select[KBASE_IPA_CORE_TYPE_NEURAL]);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
if (!ret) {
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_APPLY);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
} else {
dev_err(kbdev->dev, "Wait for the pending command failed");
}
return ret;
}
static u64 read_value_cnt(struct kbase_device *kbdev, u8 type, u8 select_idx)
{
switch (type) {
case KBASE_IPA_CORE_TYPE_CSHW:
return kbase_reg_read64(kbdev, IPA_VALUE_CSHW_OFFSET(select_idx));
case KBASE_IPA_CORE_TYPE_MEMSYS:
return kbase_reg_read64(kbdev, IPA_VALUE_MEMSYS_OFFSET(select_idx));
case KBASE_IPA_CORE_TYPE_TILER:
return kbase_reg_read64(kbdev, IPA_VALUE_TILER_OFFSET(select_idx));
case KBASE_IPA_CORE_TYPE_SHADER:
return kbase_reg_read64(kbdev, IPA_VALUE_SHADER_OFFSET(select_idx));
case KBASE_IPA_CORE_TYPE_NEURAL:
if (kbase_csf_dev_has_ne(kbdev))
return kbase_reg_read64(kbdev, IPA_VALUE_NEURAL_OFFSET(select_idx));
else
return 0;
default:
WARN(1, "Unknown core type: %u\n", type);
return 0;
}
}
static void build_select_config(struct kbase_ipa_control *ipa_ctrl, u64 *select_config)
{
size_t i;
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++) {
size_t j;
select_config[i] = 0ULL;
for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config =
&ipa_ctrl->blocks[i].select[j];
select_config[i] |=
((u64)prfcnt_config->idx << (IPA_CONTROL_SELECT_BITS_PER_CNT * j));
}
}
}
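/*
 * Illustration only, not part of the driver: a minimal userspace sketch of
 * the packing done by build_select_config(). It assumes 8 counter slots per
 * block, so that 8 fields of 8 bits fill one 64-bit SELECT register; the
 * real slot count is KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS, defined outside
 * this file. The demo_* names are hypothetical.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SELECT_BITS_PER_CNT 8u /* mirrors IPA_CONTROL_SELECT_BITS_PER_CNT */
#define DEMO_NUM_BLOCK_COUNTERS 8u /* assumed slot count per block */

/* Pack one counter index per slot; slot j occupies bits [8j, 8j + 7]. */
static inline uint64_t demo_pack_select(const uint8_t idx[DEMO_NUM_BLOCK_COUNTERS])
{
	uint64_t select = 0;
	size_t j;

	for (j = 0; j < DEMO_NUM_BLOCK_COUNTERS; j++)
		select |= (uint64_t)idx[j] << (DEMO_SELECT_BITS_PER_CNT * j);
	return select;
}

/* Recover the counter index stored in a given slot. */
static inline uint8_t demo_unpack_select(uint64_t select, size_t slot)
{
	assert(slot < DEMO_NUM_BLOCK_COUNTERS);
	return (uint8_t)(select >> (DEMO_SELECT_BITS_PER_CNT * slot));
}
```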
static int update_select_registers(struct kbase_device *kbdev)
{
u64 select_config[KBASE_IPA_CORE_TYPE_NUM];
lockdep_assert_held(&kbdev->csf.ipa_control.lock);
build_select_config(&kbdev->csf.ipa_control, select_config);
return apply_select_config(kbdev, select_config);
}
static inline void calc_prfcnt_delta(struct kbase_device *kbdev,
struct kbase_ipa_control_prfcnt *prfcnt, bool gpu_ready)
{
u64 delta_value, raw_value;
if (gpu_ready)
raw_value = read_value_cnt(kbdev, (u8)prfcnt->type, prfcnt->select_idx);
else
raw_value = prfcnt->latest_raw_value;
if (raw_value < prfcnt->latest_raw_value) {
delta_value = (MAX_PRFCNT_VALUE - prfcnt->latest_raw_value) + raw_value;
} else {
delta_value = raw_value - prfcnt->latest_raw_value;
}
delta_value *= prfcnt->scaling_factor;
if (kbdev->csf.ipa_control.cur_gpu_rate == 0) {
static bool warned;
if (!warned) {
dev_warn(kbdev->dev, "%s: GPU freq is unexpectedly 0", __func__);
warned = true;
}
} else if (prfcnt->gpu_norm)
delta_value = div_u64(delta_value, kbdev->csf.ipa_control.cur_gpu_rate);
prfcnt->latest_raw_value = raw_value;
/* Accumulate the difference */
prfcnt->accumulated_diff += delta_value;
}
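/*
 * Illustration only, not part of the driver: a minimal userspace sketch of
 * the arithmetic in calc_prfcnt_delta(). It mirrors the wrap-around handling
 * for a 48-bit counter, the scaling step, and the optional normalization by
 * GPU rate, which is skipped when the rate is 0. The demo_* names are
 * hypothetical.
 */

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_PRFCNT_VALUE (((uint64_t)1 << 48) - 1) /* mirrors MAX_PRFCNT_VALUE */

/* Difference between two raw samples, allowing for one counter wrap. */
static inline uint64_t demo_prfcnt_delta(uint64_t prev, uint64_t now)
{
	if (now < prev)
		return (DEMO_MAX_PRFCNT_VALUE - prev) + now;
	return now - prev;
}

/* Scale the delta and, if requested, normalize it by the GPU rate in Hz. */
static inline uint64_t demo_scaled_delta(uint64_t prev, uint64_t now, uint64_t scaling_factor,
					 bool gpu_norm, uint32_t gpu_rate_hz)
{
	uint64_t delta = demo_prfcnt_delta(prev, now) * scaling_factor;

	if (gpu_norm && gpu_rate_hz != 0)
		delta /= gpu_rate_hz;
	return delta;
}
```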
/**
* kbase_ipa_control_rate_change_notify - GPU frequency change callback
*
* @listener: Clock frequency change listener.
* @clk_index: Index of the clock for which the change has occurred.
* @clk_rate_hz: Clock frequency (Hz).
*
* This callback notifies kbase_ipa_control about GPU frequency changes.
* Only top-level clock changes are meaningful. GPU frequency updates
* affect all performance counters which require GPU normalization
* in every session.
*/
static void kbase_ipa_control_rate_change_notify(struct kbase_clk_rate_listener *listener,
u32 clk_index, u32 clk_rate_hz)
{
if ((clk_index == KBASE_CLOCK_DOMAIN_TOP) && (clk_rate_hz != 0)) {
struct kbase_ipa_control_listener_data *listener_data =
container_of(listener, struct kbase_ipa_control_listener_data, listener);
/* Save the rate and delegate the job to a work item */
atomic_set(&listener_data->rate, clk_rate_hz);
queue_work(listener_data->clk_chg_wq, &listener_data->clk_chg_work);
}
}
static void kbase_ipa_ctrl_rate_change_worker(struct work_struct *data)
{
struct kbase_ipa_control_listener_data *listener_data =
container_of(data, struct kbase_ipa_control_listener_data, clk_chg_work);
struct kbase_device *kbdev = listener_data->kbdev;
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
unsigned long flags;
u32 rate;
size_t i;
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (!kbdev->pm.backend.gpu_ready) {
dev_err(kbdev->dev, "%s: GPU frequency cannot change while GPU is off", __func__);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
return;
}
spin_lock(&ipa_ctrl->lock);
/* Pick up the latest notified rate */
rate = (u32)atomic_read(&listener_data->rate);
for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
struct kbase_ipa_control_session *session = &ipa_ctrl->sessions[i];
if (session->active) {
size_t j;
for (j = 0; j < session->num_prfcnts; j++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[j];
if (prfcnt->gpu_norm)
calc_prfcnt_delta(kbdev, prfcnt, true);
}
}
}
ipa_ctrl->cur_gpu_rate = rate;
/* Update the timer for automatic sampling if active sessions
* are present. Counters have already been manually sampled.
*/
if (ipa_ctrl->num_active_sessions > 0)
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER), timer_value(rate));
spin_unlock(&ipa_ctrl->lock);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
void kbase_ipa_control_init(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
struct kbase_ipa_control_listener_data *listener_data;
size_t i;
unsigned long flags;
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++) {
ipa_ctrl->blocks[i].num_available_counters = KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS;
}
spin_lock_init(&ipa_ctrl->lock);
listener_data = kmalloc(sizeof(struct kbase_ipa_control_listener_data), GFP_KERNEL);
if (listener_data) {
listener_data->clk_chg_wq =
alloc_workqueue("ipa_ctrl_wq", WQ_HIGHPRI | WQ_UNBOUND, 1);
if (listener_data->clk_chg_wq) {
INIT_WORK(&listener_data->clk_chg_work, kbase_ipa_ctrl_rate_change_worker);
listener_data->listener.notify = kbase_ipa_control_rate_change_notify;
listener_data->kbdev = kbdev;
ipa_ctrl->rtm_listener_data = listener_data;
/* Initialise to 0, which is outside the range of normally notified rates */
atomic_set(&listener_data->rate, 0);
} else {
dev_warn(kbdev->dev,
"%s: failed to allocate workqueue, clock rate update disabled",
__func__);
kfree(listener_data);
listener_data = NULL;
}
} else
dev_warn(kbdev->dev,
"%s: failed to allocate memory, IPA control clock rate update disabled",
__func__);
spin_lock_irqsave(&clk_rtm->lock, flags);
if (clk_rtm->clks[KBASE_CLOCK_DOMAIN_TOP])
ipa_ctrl->cur_gpu_rate = clk_rtm->clks[KBASE_CLOCK_DOMAIN_TOP]->clock_val;
if (listener_data)
kbase_clk_rate_trace_manager_subscribe_no_lock(clk_rtm, &listener_data->listener);
spin_unlock_irqrestore(&clk_rtm->lock, flags);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_init);
void kbase_ipa_control_term(struct kbase_device *kbdev)
{
unsigned long flags;
struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
struct kbase_ipa_control_listener_data *listener_data = ipa_ctrl->rtm_listener_data;
WARN_ON(ipa_ctrl->num_active_sessions);
if (listener_data) {
kbase_clk_rate_trace_manager_unsubscribe(clk_rtm, &listener_data->listener);
destroy_workqueue(listener_data->clk_chg_wq);
}
kfree(ipa_ctrl->rtm_listener_data);
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
if (kbase_io_is_gpu_powered(kbdev))
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER), 0);
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_term);
/**
* session_read_raw_values - Read the latest raw values for a session
* @kbdev: Pointer to kbase device.
* @session: Pointer to the session whose performance counters shall be read.
*
* Read and update the latest raw values of all the performance counters
* belonging to a given session.
*/
static void session_read_raw_values(struct kbase_device *kbdev,
struct kbase_ipa_control_session *session)
{
size_t i;
lockdep_assert_held(&kbdev->csf.ipa_control.lock);
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[i];
u64 raw_value = read_value_cnt(kbdev, (u8)prfcnt->type, prfcnt->select_idx);
prfcnt->latest_raw_value = raw_value;
}
}
/**
* session_gpu_start - Start one or all sessions
* @kbdev: Pointer to kbase device.
* @ipa_ctrl: Pointer to IPA_CONTROL descriptor.
* @session: Pointer to the session to initialize, or NULL to initialize
* all sessions.
*
* This function starts one or all sessions by capturing a manual sample,
* reading the latest raw value of performance counters and possibly enabling
* the timer for automatic sampling if necessary.
*
* If a single session is given, it is assumed to be active, regardless of
* the number of active sessions. The number of performance counters belonging
* to the session shall be set in advance.
*
* If no session is given, the function shall start all sessions.
* The function does nothing if there are no active sessions.
*
* Return: 0 on success, or error code on failure.
*/
static int session_gpu_start(struct kbase_device *kbdev, struct kbase_ipa_control *ipa_ctrl,
struct kbase_ipa_control_session *session)
{
bool first_start = (session != NULL) && (ipa_ctrl->num_active_sessions == 0);
int ret = 0;
lockdep_assert_held(&kbdev->csf.ipa_control.lock);
/*
* Exit immediately if the caller intends to start all sessions
* but there are no active sessions. It's important that no operation
* is done on the IPA_CONTROL interface in that case.
*/
if (!session && ipa_ctrl->num_active_sessions == 0)
return ret;
/*
* Take a manual sample unconditionally if the caller intends
* to start all sessions. Otherwise, only take a manual sample
* if this is the first session to be initialized, because in that
* case the accumulator registers are empty and no timer has been
* configured for automatic sampling.
*/
if (!session || first_start) {
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_SAMPLE);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
if (ret)
dev_err(kbdev->dev, "%s: failed to sample new counters", __func__);
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER),
timer_value(ipa_ctrl->cur_gpu_rate));
}
/*
* Read the current raw values to start the session.
* This gives the first query a baseline, so that it can
* report a correct value by calculating the difference
* from the beginning of the session. This holds
* regardless of the number of sessions the caller
* intends to start.
*/
if (!ret) {
if (session) {
/* On starting a session, the values read here are required to
* initialize the IPA power model's calculation.
*/
session_read_raw_values(kbdev, session);
} else {
size_t session_idx;
for (session_idx = 0; session_idx < KBASE_IPA_CONTROL_MAX_SESSIONS;
session_idx++) {
struct kbase_ipa_control_session *session_to_check =
&ipa_ctrl->sessions[session_idx];
if (session_to_check->active)
session_read_raw_values(kbdev, session_to_check);
}
}
}
return ret;
}
int kbase_ipa_control_register(struct kbase_device *kbdev,
const struct kbase_ipa_control_perf_counter *perf_counters,
size_t num_counters, void **client)
{
int ret = 0;
size_t i, session_idx, req_counters[KBASE_IPA_CORE_TYPE_NUM];
bool already_configured[KBASE_IPA_CONTROL_MAX_COUNTERS];
bool new_config = false;
struct kbase_ipa_control *ipa_ctrl;
struct kbase_ipa_control_session *session = NULL;
unsigned long flags;
if (WARN_ON(unlikely(kbdev == NULL)))
return -ENODEV;
if (WARN_ON(perf_counters == NULL) || WARN_ON(client == NULL) ||
WARN_ON(num_counters > KBASE_IPA_CONTROL_MAX_COUNTERS)) {
dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
return -EINVAL;
}
kbase_pm_context_active(kbdev);
ipa_ctrl = &kbdev->csf.ipa_control;
spin_lock_irqsave(&ipa_ctrl->lock, flags);
if (ipa_ctrl->num_active_sessions == KBASE_IPA_CONTROL_MAX_SESSIONS) {
dev_err(kbdev->dev, "%s: too many sessions", __func__);
ret = -EBUSY;
goto exit;
}
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++)
req_counters[i] = 0;
/*
* Count how many counters would need to be configured in order to
* satisfy the request. Requested counters which happen to be already
* configured can be skipped.
*/
for (i = 0; i < num_counters; i++) {
size_t j;
enum kbase_ipa_core_type type = perf_counters[i].type;
u8 idx = perf_counters[i].idx;
if ((type >= KBASE_IPA_CORE_TYPE_NUM) || (idx >= KBASE_IPA_CONTROL_CNT_MAX_IDX)) {
dev_err(kbdev->dev, "%s: invalid requested type %u and/or index %u",
__func__, type, idx);
ret = -EINVAL;
goto exit;
}
for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config =
&ipa_ctrl->blocks[type].select[j];
if (prfcnt_config->refcount > 0) {
if (prfcnt_config->idx == idx) {
already_configured[i] = true;
break;
}
}
}
if (j == KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS) {
already_configured[i] = false;
req_counters[type]++;
new_config = true;
}
}
for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++)
if (req_counters[i] > ipa_ctrl->blocks[i].num_available_counters) {
dev_err(kbdev->dev,
"%s: more counters (%zu) than available (%zu) have been requested for type %zu",
__func__, req_counters[i],
ipa_ctrl->blocks[i].num_available_counters, i);
ret = -EINVAL;
goto exit;
}
/*
* The request has been validated.
* Firstly, find an available session and then set up the initial state
* of the session and update the configuration of performance counters
* in the internal state of kbase_ipa_control.
*/
for (session_idx = 0; session_idx < KBASE_IPA_CONTROL_MAX_SESSIONS; session_idx++) {
if (!ipa_ctrl->sessions[session_idx].active) {
session = &ipa_ctrl->sessions[session_idx];
break;
}
}
if (!session) {
dev_err(kbdev->dev, "%s: wrong or corrupt session state", __func__);
ret = -EBUSY;
goto exit;
}
for (i = 0; i < num_counters; i++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config;
size_t j;
u8 type = perf_counters[i].type;
u8 idx = perf_counters[i].idx;
for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
prfcnt_config = &ipa_ctrl->blocks[type].select[j];
if (already_configured[i]) {
if ((prfcnt_config->refcount > 0) && (prfcnt_config->idx == idx)) {
break;
}
} else {
if (prfcnt_config->refcount == 0)
break;
}
}
if (WARN_ON((prfcnt_config->refcount > 0 && prfcnt_config->idx != idx) ||
(j == KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS))) {
dev_err(kbdev->dev,
"%s: invalid internal state: counter already configured or no counter available to configure",
__func__);
ret = -EBUSY;
goto exit;
}
if (prfcnt_config->refcount == 0) {
prfcnt_config->idx = idx;
ipa_ctrl->blocks[type].num_available_counters--;
}
session->prfcnts[i].accumulated_diff = 0;
session->prfcnts[i].type = type;
session->prfcnts[i].select_idx = j;
session->prfcnts[i].scaling_factor = perf_counters[i].scaling_factor;
session->prfcnts[i].gpu_norm = perf_counters[i].gpu_norm;
/* Reports to this client for GPU time spent in protected mode
* should begin from the point of registration.
*/
session->last_query_time = ktime_get_raw_ns();
/* Initially, no time has been spent in protected mode */
session->protm_time = 0;
prfcnt_config->refcount++;
}
/*
* Apply new configuration, if necessary.
* As a temporary solution, make sure that the GPU is on
* before applying the new configuration.
*/
if (new_config) {
ret = update_select_registers(kbdev);
if (ret)
dev_err(kbdev->dev, "%s: failed to apply new SELECT configuration",
__func__);
}
if (!ret) {
session->num_prfcnts = num_counters;
ret = session_gpu_start(kbdev, ipa_ctrl, session);
}
if (!ret) {
session->active = true;
ipa_ctrl->num_active_sessions++;
*client = session;
}
exit:
spin_unlock_irqrestore(&ipa_ctrl->lock, flags);
kbase_pm_context_idle(kbdev);
return ret;
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_register);
int kbase_ipa_control_unregister(struct kbase_device *kbdev, const void *client)
{
struct kbase_ipa_control *ipa_ctrl;
struct kbase_ipa_control_session *session;
int ret = 0;
size_t i;
unsigned long flags;
bool new_config = false, valid_session = false;
if (WARN_ON(unlikely(kbdev == NULL)))
return -ENODEV;
if (WARN_ON(client == NULL)) {
dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
return -EINVAL;
}
kbase_pm_context_active(kbdev);
ipa_ctrl = &kbdev->csf.ipa_control;
session = (struct kbase_ipa_control_session *)client;
spin_lock_irqsave(&ipa_ctrl->lock, flags);
for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
if (session == &ipa_ctrl->sessions[i]) {
valid_session = true;
break;
}
}
if (!valid_session) {
dev_err(kbdev->dev, "%s: invalid session handle", __func__);
ret = -EINVAL;
goto exit;
}
if (ipa_ctrl->num_active_sessions == 0) {
dev_err(kbdev->dev, "%s: no active sessions found", __func__);
ret = -EINVAL;
goto exit;
}
if (!session->active) {
dev_err(kbdev->dev, "%s: session is already inactive", __func__);
ret = -EINVAL;
goto exit;
}
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt_config *prfcnt_config;
u8 type = session->prfcnts[i].type;
u8 idx = session->prfcnts[i].select_idx;
prfcnt_config = &ipa_ctrl->blocks[type].select[idx];
if (!WARN_ON(prfcnt_config->refcount == 0)) {
prfcnt_config->refcount--;
if (prfcnt_config->refcount == 0) {
new_config = true;
ipa_ctrl->blocks[type].num_available_counters++;
}
}
}
if (new_config) {
ret = update_select_registers(kbdev);
if (ret)
dev_err(kbdev->dev, "%s: failed to apply SELECT configuration", __func__);
}
session->num_prfcnts = 0;
session->active = false;
ipa_ctrl->num_active_sessions--;
exit:
spin_unlock_irqrestore(&ipa_ctrl->lock, flags);
kbase_pm_context_idle(kbdev);
return ret;
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_unregister);
int kbase_ipa_control_query(struct kbase_device *kbdev, const void *client, u64 *values,
size_t num_values, u64 *protected_time)
{
struct kbase_ipa_control *ipa_ctrl;
struct kbase_ipa_control_session *session;
size_t i;
unsigned long flags;
bool gpu_ready;
if (WARN_ON(unlikely(kbdev == NULL)))
return -ENODEV;
if (WARN_ON(client == NULL) || WARN_ON(values == NULL)) {
dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
return -EINVAL;
}
ipa_ctrl = &kbdev->csf.ipa_control;
session = (struct kbase_ipa_control_session *)client;
if (!session->active) {
dev_err(kbdev->dev, "%s: attempt to query inactive session", __func__);
return -EINVAL;
}
if (WARN_ON(num_values < session->num_prfcnts)) {
dev_err(kbdev->dev, "%s: not enough space (%zu) to return all counter values (%zu)",
__func__, num_values, session->num_prfcnts);
return -EINVAL;
}
spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
gpu_ready = kbdev->pm.backend.gpu_ready;
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[i];
calc_prfcnt_delta(kbdev, prfcnt, gpu_ready);
/* Return all the accumulated difference */
values[i] = prfcnt->accumulated_diff;
prfcnt->accumulated_diff = 0;
}
if (protected_time) {
u64 time_now = ktime_get_raw_ns();
/* This is the amount of protected-mode time spent prior to
* the current protm period.
*/
*protected_time = session->protm_time;
if (kbdev->protected_mode) {
*protected_time +=
time_now - MAX(session->last_query_time, ipa_ctrl->protm_start);
}
session->last_query_time = time_now;
session->protm_time = 0;
}
spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
for (i = session->num_prfcnts; i < num_values; i++)
values[i] = 0;
return 0;
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_query);
void kbase_ipa_control_handle_gpu_power_off(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
size_t session_idx;
int ret;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* GPU should still be ready for use when this function gets called */
WARN_ON(!kbdev->pm.backend.gpu_ready);
/* Interrupts are already disabled and interrupt state is also saved */
spin_lock(&ipa_ctrl->lock);
/* First disable the automatic sampling through TIMER */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(TIMER), 0);
ret = wait_status(kbdev, STATUS_TIMER_ENABLED);
if (ret) {
dev_err(kbdev->dev, "Wait for disabling of IPA control timer failed: %d", ret);
}
/* Now issue the manual SAMPLE command */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_SAMPLE);
ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
if (ret) {
dev_err(kbdev->dev, "Wait for the completion of manual sample failed: %d", ret);
}
for (session_idx = 0; session_idx < KBASE_IPA_CONTROL_MAX_SESSIONS; session_idx++) {
struct kbase_ipa_control_session *session = &ipa_ctrl->sessions[session_idx];
if (session->active) {
size_t i;
for (i = 0; i < session->num_prfcnts; i++) {
struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[i];
calc_prfcnt_delta(kbdev, prfcnt, true);
}
}
}
spin_unlock(&ipa_ctrl->lock);
}
void kbase_ipa_control_handle_gpu_power_on(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
int ret;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* GPU should have become ready for use when this function gets called */
WARN_ON(!kbdev->pm.backend.gpu_ready);
/* Interrupts are already disabled and interrupt state is also saved */
spin_lock(&ipa_ctrl->lock);
ret = update_select_registers(kbdev);
if (ret) {
dev_err(kbdev->dev, "Failed to reconfigure the select registers: %d", ret);
}
/* Accumulator registers would not contain any sample after GPU power
* cycle if the timer has not been enabled first. Initialize all sessions.
*/
ret = session_gpu_start(kbdev, ipa_ctrl, NULL);
spin_unlock(&ipa_ctrl->lock);
}
void kbase_ipa_control_handle_gpu_reset_pre(struct kbase_device *kbdev)
{
/* A soft reset is treated as a power down */
kbase_ipa_control_handle_gpu_power_off(kbdev);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_reset_pre);
void kbase_ipa_control_handle_gpu_reset_post(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
int ret;
u32 status;
lockdep_assert_held(&kbdev->hwaccess_lock);
/* GPU should have become ready for use when this function gets called */
WARN_ON(!kbdev->pm.backend.gpu_ready);
/* Interrupts are already disabled and interrupt state is also saved */
spin_lock(&ipa_ctrl->lock);
/* Check the status reset bit is set before acknowledging it */
status = kbase_reg_read32(kbdev, IPA_CONTROL_ENUM(STATUS));
if (status & STATUS_RESET) {
/* Acknowledge the reset command */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_RESET_ACK);
ret = wait_status(kbdev, STATUS_RESET);
if (ret) {
dev_err(kbdev->dev, "Wait for the reset ack command failed: %d", ret);
}
}
spin_unlock(&ipa_ctrl->lock);
kbase_ipa_control_handle_gpu_power_on(kbdev);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_reset_post);
#ifdef KBASE_PM_RUNTIME
void kbase_ipa_control_handle_gpu_sleep_enter(struct kbase_device *kbdev)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
if (kbdev->pm.backend.mcu_state == KBASE_MCU_IN_SLEEP) {
/* GPU Sleep is treated as a power down */
kbase_ipa_control_handle_gpu_power_off(kbdev);
/* SELECT_CSHW register needs to be cleared to prevent any
* IPA control message from being sent to the top level GPU HWCNT.
*/
kbase_reg_write64(kbdev, IPA_CONTROL_ENUM(SELECT_CSHW), 0);
/* No need to issue the APPLY command here */
}
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_sleep_enter);
void kbase_ipa_control_handle_gpu_sleep_exit(struct kbase_device *kbdev)
{
lockdep_assert_held(&kbdev->hwaccess_lock);
if (kbdev->pm.backend.mcu_state == KBASE_MCU_IN_SLEEP) {
/* To keep things simple, currently exit from
* GPU Sleep is treated as a power on event where
* all 4 SELECT registers are reconfigured.
* On exit from sleep, reconfiguration is needed
* only for the SELECT_CSHW register.
*/
kbase_ipa_control_handle_gpu_power_on(kbdev);
}
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_sleep_exit);
#endif
#if MALI_UNIT_TEST
void kbase_ipa_control_rate_change_notify_test(struct kbase_device *kbdev, u32 clk_index,
u32 clk_rate_hz)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
struct kbase_ipa_control_listener_data *listener_data = ipa_ctrl->rtm_listener_data;
kbase_ipa_control_rate_change_notify(&listener_data->listener, clk_index, clk_rate_hz);
/* Ensure the callback has taken effect before returning back to the test caller */
flush_work(&listener_data->clk_chg_work);
}
KBASE_EXPORT_TEST_API(kbase_ipa_control_rate_change_notify_test);
#endif
void kbase_ipa_control_protm_entered(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
lockdep_assert_held(&kbdev->hwaccess_lock);
ipa_ctrl->protm_start = ktime_get_raw_ns();
}
void kbase_ipa_control_protm_exited(struct kbase_device *kbdev)
{
struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
size_t i;
u64 time_now = ktime_get_raw_ns();
u32 status;
lockdep_assert_held(&kbdev->hwaccess_lock);
for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
struct kbase_ipa_control_session *session = &ipa_ctrl->sessions[i];
if (session->active) {
u64 protm_time =
time_now - MAX(session->last_query_time, ipa_ctrl->protm_start);
session->protm_time += protm_time;
}
}
/* Acknowledge the protected_mode bit in the IPA_CONTROL STATUS
* register
*/
status = kbase_reg_read32(kbdev, IPA_CONTROL_ENUM(STATUS));
if (status & STATUS_PROTECTED_MODE) {
int ret;
/* Acknowledge the protm command */
kbase_reg_write32(kbdev, IPA_CONTROL_ENUM(COMMAND), COMMAND_PROTECTED_ACK);
ret = wait_status(kbdev, STATUS_PROTECTED_MODE);
if (ret) {
dev_err(kbdev->dev, "Wait for the protm ack command failed: %d", ret);
}
}
}
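/*
* Illustrative sketch only, not part of the driver: the per-session protected
* mode accounting done in kbase_ipa_control_protm_exited() can be reproduced
* in isolation as below. The helper name example_protm_time_ns() is
* hypothetical. It assumes all timestamps come from the same monotonic clock
* and that time_now is not earlier than the other two.
*/

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time (ns) a session should attribute to protected mode on exit.
 * Only the part of the protected-mode interval that falls after the
 * session's last query is counted, hence the max() of the two start points.
 */
static uint64_t example_protm_time_ns(uint64_t time_now, uint64_t last_query_time,
				      uint64_t protm_start)
{
	uint64_t start = (last_query_time > protm_start) ? last_query_time : protm_start;

	/* Assumption of this sketch: the clock never runs backwards. */
	assert(time_now >= start);
	return time_now - start;
}
```

/*
* If the session last queried before protected mode was entered, the whole
* protected-mode interval is counted. If it queried during protected mode,
* only the time since that query is counted.
*/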


@@ -0,0 +1,270 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2020-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CSF_IPA_CONTROL_H_
#define _KBASE_CSF_IPA_CONTROL_H_
#include <mali_kbase.h>
/*
* Maximum index accepted to configure an IPA Control performance counter.
*/
#define KBASE_IPA_CONTROL_CNT_MAX_IDX ((u8)64 * 3)
/**
* struct kbase_ipa_control_perf_counter - Performance counter description
*
* @scaling_factor: Scaling factor by which the counter's value shall be
* multiplied. A scaling factor of 1 corresponds to units
* of 1 second if values are normalised by GPU frequency.
* @gpu_norm: Indicating whether counter values shall be normalized by
* GPU frequency. If true, returned values represent
* an interval of time expressed in seconds (when the scaling
* factor is set to 1).
* @type: Type of counter block for performance counter.
* @idx: Index of the performance counter inside the block.
* It may be dependent on GPU architecture.
* It cannot be greater than KBASE_IPA_CONTROL_CNT_MAX_IDX.
*
* This structure is used by clients of the IPA Control component to describe
* a performance counter that they intend to read. The counter is identified
* by block and index. In addition to that, the client also specifies how
* values shall be represented. Raw values are a number of GPU cycles;
* if normalized, they are divided by GPU frequency and become an interval
* of time expressed in seconds, since the GPU frequency is given in Hz.
* The client may specify a scaling factor to multiply counter values before
* they are divided by frequency, in case the unit of time of 1 second is
* too low in resolution. For instance: a scaling factor of 1000 implies
* that the returned value is a time expressed in milliseconds; a scaling
* factor of 1000 * 1000 implies that the returned value is a time expressed
* in microseconds.
*/
struct kbase_ipa_control_perf_counter {
u64 scaling_factor;
bool gpu_norm;
enum kbase_ipa_core_type type;
u8 idx;
};
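/*
* Illustrative sketch only, not part of the driver: the arithmetic described
* above (multiply the raw cycle count by the scaling factor, then divide by
* the GPU frequency when normalization is requested) could look like the
* helper below. The name example_scale_counter() is hypothetical, and the
* sketch ignores 64-bit overflow of the intermediate product.
*/

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Scale a raw counter value (GPU cycles) and optionally normalize it by
 * the GPU frequency in Hz. With gpu_norm set and scaling_factor == 1000,
 * the result is a time in milliseconds.
 */
static uint64_t example_scale_counter(uint64_t raw_cycles, uint64_t scaling_factor,
				      bool gpu_norm, uint64_t gpu_freq_hz)
{
	/* Multiply first so that integer division keeps sub-second resolution. */
	uint64_t value = raw_cycles * scaling_factor;

	if (gpu_norm) {
		assert(gpu_freq_hz != 0);
		value /= gpu_freq_hz;
	}
	return value;
}
```

/*
* For instance, 500,000,000 cycles at 1 GHz is half a second: with a scaling
* factor of 1 the integer result truncates to 0, while a scaling factor of
* 1000 yields 500 (milliseconds).
*/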
/**
* kbase_ipa_control_init - Initialize the IPA Control component
*
* @kbdev: Pointer to Kbase device.
*
* This function must be called only when a kbase device is initialized.
*/
void kbase_ipa_control_init(struct kbase_device *kbdev);
/**
* kbase_ipa_control_term - Terminate the IPA Control component
*
* @kbdev: Pointer to Kbase device.
*/
void kbase_ipa_control_term(struct kbase_device *kbdev);
/**
* kbase_ipa_control_register - Register a client to the IPA Control component
*
* @kbdev: Pointer to Kbase device.
* @perf_counters: Array of performance counters the client intends to read.
* For each counter the client specifies block, index,
* scaling factor and whether it must be normalized by GPU
* frequency.
* @num_counters: Number of performance counters. It cannot exceed the total
* number of counters that exist on the IPA Control interface.
* @client: Handle to an opaque structure set by IPA Control if
* the registration is successful. This handle identifies
* a client's session and shall be provided in its future
* queries.
*
* A client needs to subscribe to the IPA Control component by declaring which
* performance counters it intends to read, and specifying a scaling factor
* and whether normalization is requested for each performance counter.
* The function shall configure the IPA Control interface accordingly and start
* a session for the client that made the request. A unique handle is returned
* if registration is successful in order to identify the client's session
* and be used for future queries.
*
* Return: 0 on success, negative -errno on error
*/
int kbase_ipa_control_register(struct kbase_device *kbdev,
const struct kbase_ipa_control_perf_counter *perf_counters,
size_t num_counters, void **client);
/**
* kbase_ipa_control_unregister - Unregister a client from IPA Control
*
* @kbdev: Pointer to kbase device.
* @client: Handle to an opaque structure that identifies the client session
* to terminate, as returned by kbase_ipa_control_register.
*
* Return: 0 on success, negative -errno on error
*/
int kbase_ipa_control_unregister(struct kbase_device *kbdev, const void *client);
/**
* kbase_ipa_control_query - Query performance counters
*
* @kbdev: Pointer to kbase device.
* @client: Handle to an opaque structure that identifies the client
* session, as returned by kbase_ipa_control_register.
* @values: Array of values queried from performance counters, whose
* length depends on the number of counters requested at
* the time of registration. Values are scaled and normalized
* and represent the difference since the last query.
* @num_values: Number of entries in the array of values that has been
* passed by the caller. It must be at least equal to the
* number of performance counters the client registered itself
* to read.
* @protected_time: Time spent in protected mode since last query,
* expressed in nanoseconds. This pointer may be NULL if the
* client doesn't want to know about this.
*
* A client that has already opened a session by registering itself to read
* some performance counters may use this function to query the values of
* those counters. The values returned are normalized by GPU frequency if
* requested and then multiplied by the scaling factor provided at the time
* of registration. Values always represent a difference since the last query.
*
* Performance counters are not updated while the GPU operates in protected
* mode. For this reason, returned values may be unreliable if the GPU has
* been in protected mode since the last query. The function returns success
* in that case, but it also gives a measure of how much time has been spent
* in protected mode.
*
* Return: 0 on success, negative -errno on error
*/
int kbase_ipa_control_query(struct kbase_device *kbdev, const void *client, u64 *values,
size_t num_values, u64 *protected_time);
/**
* kbase_ipa_control_handle_gpu_power_on - Handle the GPU power on event
*
* @kbdev: Pointer to kbase device.
*
* This function is called after GPU has been powered and is ready for use.
* After the GPU power on, IPA Control component needs to ensure that the
* counters start incrementing again.
*/
void kbase_ipa_control_handle_gpu_power_on(struct kbase_device *kbdev);
/**
* kbase_ipa_control_handle_gpu_power_off - Handle the GPU power off event
*
* @kbdev: Pointer to kbase device.
*
* This function is called just before the GPU is powered off when it is still
* ready for use.
* IPA Control component needs to be aware of the GPU power off so that it can
* handle the query from Clients appropriately and return meaningful values
* to them.
*/
void kbase_ipa_control_handle_gpu_power_off(struct kbase_device *kbdev);
/**
* kbase_ipa_control_handle_gpu_reset_pre - Handle the pre GPU reset event
*
* @kbdev: Pointer to kbase device.
*
* This function is called when the GPU is about to be reset.
*/
void kbase_ipa_control_handle_gpu_reset_pre(struct kbase_device *kbdev);
/**
* kbase_ipa_control_handle_gpu_reset_post - Handle the post GPU reset event
*
* @kbdev: Pointer to kbase device.
*
* This function is called after the GPU has been reset.
*/
void kbase_ipa_control_handle_gpu_reset_post(struct kbase_device *kbdev);
#ifdef KBASE_PM_RUNTIME
/**
* kbase_ipa_control_handle_gpu_sleep_enter - Handle the pre GPU Sleep event
*
* @kbdev: Pointer to kbase device.
*
* This function is called after MCU has been put to sleep state & L2 cache has
* been powered down. The top level part of GPU is still powered up when this
* function is called.
*/
void kbase_ipa_control_handle_gpu_sleep_enter(struct kbase_device *kbdev);
/**
* kbase_ipa_control_handle_gpu_sleep_exit - Handle the post GPU Sleep event
*
* @kbdev: Pointer to kbase device.
*
* This function is called when L2 needs to be powered up and MCU can exit the
* sleep state. The top level part of GPU is powered up when this function is
* called.
*
* This function must be called only if kbase_ipa_control_handle_gpu_sleep_enter()
* was called previously.
*/
void kbase_ipa_control_handle_gpu_sleep_exit(struct kbase_device *kbdev);
#endif
#if MALI_UNIT_TEST
/**
* kbase_ipa_control_rate_change_notify_test - Notify GPU rate change
* (only for testing)
*
* @kbdev: Pointer to kbase device.
* @clk_index: Index of the clock for which the change has occurred.
* @clk_rate_hz: Clock frequency (Hz).
*
* Notify the IPA Control component about a GPU rate change.
*/
void kbase_ipa_control_rate_change_notify_test(struct kbase_device *kbdev, u32 clk_index,
u32 clk_rate_hz);
#endif /* MALI_UNIT_TEST */
/**
* kbase_ipa_control_protm_entered - Tell IPA_CONTROL that protected mode
* has been entered.
*
* @kbdev: Pointer to kbase device.
*
* This function provides a means through which IPA_CONTROL can be informed
* that the GPU has entered protected mode. Since the GPU cannot access
* performance counters while in this mode, this information is useful as
* it implies (a) the values of these registers cannot change, so there's no
* point trying to read them, and (b) IPA_CONTROL has a means through which
* to record the duration of time the GPU is in protected mode, which can
* then be forwarded on to clients, who may wish, for example, to assume
* that the GPU was busy 100% of the time while in this mode.
*/
void kbase_ipa_control_protm_entered(struct kbase_device *kbdev);
/**
* kbase_ipa_control_protm_exited - Tell IPA_CONTROL that protected mode
* has been exited.
*
* @kbdev: Pointer to kbase device
*
* This function provides a means through which IPA_CONTROL can be informed
* that the GPU has exited from protected mode.
*/
void kbase_ipa_control_protm_exited(struct kbase_device *kbdev);
#endif /* _KBASE_CSF_IPA_CONTROL_H_ */

File diff suppressed because it is too large


@@ -0,0 +1,745 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2018-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CSF_H_
#define _KBASE_CSF_H_
#include "mali_kbase_csf_kcpu.h"
#include "mali_kbase_csf_scheduler.h"
#include "mali_kbase_csf_firmware.h"
#include "mali_kbase_csf_protected_memory.h"
#include "mali_kbase_hwaccess_time.h"
/* Indicate invalid CS h/w interface
*/
#define KBASEP_IF_NR_INVALID ((s8)-1)
/* Indicate invalid CSG number for a GPU command queue group
*/
#define KBASEP_CSG_NR_INVALID ((s8)-1)
/* Indicate invalid user doorbell number for a GPU command queue
*/
#define KBASEP_USER_DB_NR_INVALID ((s8)-1)
/* Indicates an invalid value for the scan out sequence number, used to
* signify there is no group that has protected mode execution pending.
*/
#define KBASEP_TICK_PROTM_PEND_SCAN_SEQ_NR_INVALID (U32_MAX)
#define FIRMWARE_IDLE_HYSTERESIS_TIME_NS (10 * 1000 * 1000) /* Default 10 milliseconds */
/* Idle hysteresis time can be scaled down when GPU sleep feature is used */
#define FIRMWARE_IDLE_HYSTERESIS_GPU_SLEEP_SCALER (5)
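/*
* Illustrative sketch only, not part of the driver: one way the scaler could
* be applied, ASSUMING that "scaled down" means an integer division of the
* default hysteresis time. The EXAMPLE_* macros and the helper
* example_idle_hysteresis_ns() are hypothetical and only mirror the two
* values defined above.
*/

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_IDLE_HYSTERESIS_TIME_NS (10 * 1000 * 1000) /* 10 ms */
#define EXAMPLE_IDLE_HYSTERESIS_GPU_SLEEP_SCALER (5)

/* Effective idle hysteresis time, shortened when GPU sleep may be used. */
static uint64_t example_idle_hysteresis_ns(bool gpu_sleep_allowed)
{
	uint64_t dur_ns = EXAMPLE_IDLE_HYSTERESIS_TIME_NS;

	assert(EXAMPLE_IDLE_HYSTERESIS_GPU_SLEEP_SCALER > 0);
	if (gpu_sleep_allowed)
		dur_ns /= EXAMPLE_IDLE_HYSTERESIS_GPU_SLEEP_SCALER;
	return dur_ns;
}
```

/*
* Under that assumption the 10 ms default becomes 2 ms when GPU sleep is
* allowed.
*/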
/**
* kbase_csf_ctx_init - Initialize the CSF interface for a GPU address space.
*
* @kctx: Pointer to the kbase context which is being initialized.
*
* Return: 0 if successful or a negative error code on failure.
*/
int kbase_csf_ctx_init(struct kbase_context *kctx);
/**
* kbase_csf_ctx_handle_fault - Terminate queue groups & notify fault upon
* GPU bus fault, MMU page fault or similar.
*
* @kctx: Pointer to faulty kbase context.
* @fault: Pointer to the fault.
* @fw_unresponsive: Whether or not the FW is deemed unresponsive
*
* This function terminates all GPU command queue groups in the context and
* notifies the event notification thread of the fault. If the FW is deemed
* unresponsive, e.g. when recovering from a GLB_FATAL, it will not wait
* for the groups to be terminated by the MCU, since in this case it will
* time-out anyway.
*/
void kbase_csf_ctx_handle_fault(struct kbase_context *kctx, struct kbase_fault *fault,
bool fw_unresponsive);
/**
* kbase_csf_ctx_report_page_fault_for_active_groups - Notify Userspace about GPU page fault
* for active groups of the faulty context.
*
* @kctx: Pointer to faulty kbase context.
* @fault: Pointer to the fault.
*
* This function notifies the event notification thread of the GPU page fault.
*/
void kbase_csf_ctx_report_page_fault_for_active_groups(struct kbase_context *kctx,
struct kbase_fault *fault);
/**
* kbase_csf_ctx_term - Terminate the CSF interface for a GPU address space.
*
* @kctx: Pointer to the kbase context which is being terminated.
*
* This function terminates any remaining CSGs and CSs which weren't destroyed
* before context termination.
*/
void kbase_csf_ctx_term(struct kbase_context *kctx);
/**
* kbase_csf_queue_register - Register a GPU command queue.
*
* @kctx: Pointer to the kbase context within which the
* queue is to be registered.
* @reg: Pointer to the structure which contains details of the
* queue to be registered within the provided
* context.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_queue_register(struct kbase_context *kctx, struct kbase_ioctl_cs_queue_register *reg);
/**
* kbase_csf_queue_register_ex - Register a GPU command queue with
* extended format.
*
* @kctx: Pointer to the kbase context within which the
* queue is to be registered.
* @reg: Pointer to the structure which contains details of the
* queue to be registered within the provided
* context, together with the extended parameter fields
* for supporting cs trace command.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_queue_register_ex(struct kbase_context *kctx,
struct kbase_ioctl_cs_queue_register_ex *reg);
/**
* kbase_csf_queue_terminate - Terminate a GPU command queue.
*
* @kctx: Pointer to the kbase context within which the
* queue is to be terminated.
* @term: Pointer to the structure which identifies which
* queue is to be terminated.
*/
void kbase_csf_queue_terminate(struct kbase_context *kctx,
struct kbase_ioctl_cs_queue_terminate *term);
/**
* kbase_csf_free_command_stream_user_pages() - Free queue resources
* from bind time.
*
* @kctx: Address of the kbase context within which the queue was created.
* @queue: Pointer to the queue to be unlinked.
*
* This function releases the hardware doorbell page assigned to the queue
* and releases the reference taken on the queue.
*
* When mali_kbase_supports_csg_cs_user_page_allocation() is false:
* This function will free the pair of CS_USER IO physical pages allocated
* for a GPU command queue, that were mapped into the process address space
* to enable direct submission of commands to the hardware.
*
* This function will be called only when the mapping is being removed and
* so the resources for queue will not get freed up until the mapping is
* removed even though userspace could have terminated the queue.
* Kernel will ensure that the termination of Kbase context would only be
* triggered after the mapping is removed.
*
* If an explicit or implicit unbind was missed by the userspace then the
* mapping will persist. On process exit kernel itself will remove the mapping.
*
* When mali_kbase_supports_csg_cs_user_page_allocation() is true:
* No specific actions are required for CS_USER IO pages. CSG termination
* will take care of it.
*/
void kbase_csf_free_command_stream_user_pages(struct kbase_context *kctx,
struct kbase_queue *queue);
/**
* kbase_csf_alloc_command_stream_user_pages() - Allocate queue resources
* at bind time.
*
* @kctx: Pointer to the kbase context within which the resources
* for the queue are being allocated.
* @queue: Pointer to the queue for which to allocate resources.
*
* This function reserves a hardware doorbell page for the queue and
* takes a reference on the queue.
*
* When mali_kbase_supports_csg_cs_user_page_allocation() is false:
* The function allocates a pair of User mode input/output pages for a
* GPU command queue and maps them in the shared interface segment of MCU
* firmware address space.
*
* When mali_kbase_supports_csg_cs_user_page_allocation() is true:
* A slot of size CS_USER_INPUT_BLOCK_SIZE is assigned to the queue in
* the CS_USER IO page owned by the CSG.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_alloc_command_stream_user_pages(struct kbase_context *kctx,
struct kbase_queue *queue);
/**
* kbase_csf_queue_bind - Bind a GPU command queue to a queue group.
*
* @kctx: The kbase context.
* @bind: Pointer to the union which specifies a queue group and a
* queue to be bound to that group.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_queue_bind(struct kbase_context *kctx, union kbase_ioctl_cs_queue_bind *bind);
/**
* kbase_csf_queue_unbind - Unbind a GPU command queue from a queue group
* to which it has been bound and free
* resources allocated for this queue if there
* are any.
*
* @queue: Pointer to queue to be unbound.
* @process_exit: Flag to indicate if process exit is happening.
*/
void kbase_csf_queue_unbind(struct kbase_queue *queue, bool process_exit);
/**
* kbase_csf_queue_unbind_stopped - Unbind a GPU command queue in the case
* where it was never started.
* @queue: Pointer to queue to be unbound.
*
* Variant of kbase_csf_queue_unbind() for use on error paths for cleaning up
* queues that failed to fully bind.
*/
void kbase_csf_queue_unbind_stopped(struct kbase_queue *queue);
/**
* kbase_csf_queue_kick - Schedule a GPU command queue on the firmware
*
* @kctx: The kbase context.
* @kick: Pointer to the struct which specifies the queue
* that needs to be scheduled.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_queue_kick(struct kbase_context *kctx, struct kbase_ioctl_cs_queue_kick *kick);
/**
* kbase_csf_find_queue_group - Find the queue group corresponding
* to the indicated handle.
*
* @kctx: The kbase context under which the queue group exists.
* @group_handle: Handle for the group which uniquely identifies it within
* the context with which it was created.
*
* This function is used to find the queue group when passed a handle.
*
* Return: Pointer to a queue group on success, NULL on failure
*/
struct kbase_queue_group *kbase_csf_find_queue_group(struct kbase_context *kctx, u8 group_handle);
/**
* kbase_csf_queue_group_handle_is_valid - Find if the given queue group handle
* is valid.
*
* @kctx: The kbase context under which the queue group exists.
* @group_handle: Handle for the group which uniquely identifies it within
* the context with which it was created.
*
* This function is used to determine if the queue group handle is valid.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_queue_group_handle_is_valid(struct kbase_context *kctx, u8 group_handle);
/**
* kbase_csf_queue_group_clear_faults - Re-enable CS Fault reporting.
*
* @kctx: Pointer to the kbase context within which the
* CS Fault reporting for the queues has to be re-enabled.
* @clear_faults: Pointer to the structure which contains details of the
* queues for which the CS Fault reporting has to be re-enabled.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_queue_group_clear_faults(struct kbase_context *kctx,
struct kbase_ioctl_queue_group_clear_faults *clear_faults);
/**
* kbase_csf_queue_group_create - Create a GPU command queue group.
*
* @kctx: Pointer to the kbase context within which the
* queue group is to be created.
* @create: Pointer to the structure which contains details of the
* queue group which is to be created within the
* provided kbase context.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_queue_group_create(struct kbase_context *kctx,
union kbase_ioctl_cs_queue_group_create *create);
/**
* kbase_csf_queue_group_terminate - Terminate a GPU command queue group.
*
* @kctx: Pointer to the kbase context within which the
* queue group is to be terminated.
* @group_handle: Handle which identifies the queue group which is to be
* terminated.
*/
void kbase_csf_queue_group_terminate(struct kbase_context *kctx, u8 group_handle);
/**
* kbase_csf_term_descheduled_queue_group - Terminate a GPU command queue
* group that is not operational
* inside the scheduler.
*
* @group: Pointer to the structure which identifies the queue
* group to be terminated. The function assumes that the caller
* is sure that the given group is not operational inside the
* scheduler. If in doubt, use its alternative:
* @ref kbase_csf_queue_group_terminate().
*/
void kbase_csf_term_descheduled_queue_group(struct kbase_queue_group *group);
#if IS_ENABLED(CONFIG_MALI_VECTOR_DUMP) || MALI_UNIT_TEST
/**
* kbase_csf_queue_group_suspend - Suspend a GPU command queue group
*
* @kctx: The kbase context for which the queue group is to be
* suspended.
* @sus_buf: Pointer to the structure which contains details of the
* user buffer and its kernel pinned pages.
* @group_handle: Handle for the group which uniquely identifies it within
* the context within which it was created.
*
* This function is used to suspend a queue group and copy the suspend buffer.
*
* Return: 0 on success or negative value if failed to suspend
* queue group and copy suspend buffer contents.
*/
int kbase_csf_queue_group_suspend(struct kbase_context *kctx,
struct kbase_suspend_copy_buffer *sus_buf, u8 group_handle);
#endif
/**
* kbase_csf_add_group_fatal_error - Report a fatal group error to userspace
*
* @group: GPU command queue group.
* @err_payload: Error payload to report.
*/
void kbase_csf_add_group_fatal_error(struct kbase_queue_group *const group,
struct base_gpu_queue_group_error const *const err_payload);
/**
* kbase_csf_interrupt - Handle interrupts issued by CSF firmware.
*
* @kbdev: The kbase device to handle an IRQ for
* @val: The value of JOB IRQ status register which triggered the interrupt
*/
void kbase_csf_interrupt(struct kbase_device *kbdev, u32 val);
/**
* kbase_csf_handle_csg_sync_update - Handle SYNC_UPDATE notification for the group.
*
* @kbdev: The kbase device to handle the SYNC_UPDATE interrupt.
* @group_id: CSG index.
* @group: Pointer to the GPU command queue group.
* @req: CSG_REQ register value corresponding to @group.
* @ack: CSG_ACK register value corresponding to @group.
*/
void kbase_csf_handle_csg_sync_update(struct kbase_device *const kbdev, u32 group_id,
struct kbase_queue_group *group, u32 req, u32 ack);
/**
* kbase_csf_doorbell_mapping_init - Initialize the fields that facilitate
* the update of userspace mapping of HW
* doorbell page.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*
* The function creates a file and allocates a dummy page to facilitate the
* update of userspace mapping to point to the dummy page instead of the real
* HW doorbell page after the suspend of queue group.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_doorbell_mapping_init(struct kbase_device *kbdev);
/**
* kbase_csf_doorbell_mapping_term - Free the dummy page & close the file used
* to update the userspace mapping of HW doorbell page
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*/
void kbase_csf_doorbell_mapping_term(struct kbase_device *kbdev);
/**
* kbase_csf_setup_dummy_user_reg_page - Setup the dummy page that is accessed
* instead of the User register page after
* the GPU power down.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*
* The function allocates a dummy page which is used to replace the User
* register page in the userspace mapping after the power down of GPU.
* On the power up of GPU, the mapping is updated to point to the real
* User register page. The mapping is used to allow access to LATEST_FLUSH
* register from userspace.
*
* Return: 0 on success, or negative on failure.
*/
int kbase_csf_setup_dummy_user_reg_page(struct kbase_device *kbdev);
/**
* kbase_csf_free_dummy_user_reg_page - Free the dummy page that was used
* to replace the User register page
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*/
void kbase_csf_free_dummy_user_reg_page(struct kbase_device *kbdev);
/**
* kbase_csf_pending_gpuq_kick_queues_init - Initialize the data used for handling
* GPU queue kicks.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*/
void kbase_csf_pending_gpuq_kick_queues_init(struct kbase_device *kbdev);
/**
* kbase_csf_pending_gpuq_kick_queues_term - De-initialize the data used for handling
* GPU queue kicks.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*/
void kbase_csf_pending_gpuq_kick_queues_term(struct kbase_device *kbdev);
/**
* kbase_csf_ring_csg_doorbell - ring the doorbell for a CSG interface.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
* @slot: Index of CSG interface for ringing the door-bell.
*
* The function kicks a notification on the CSG interface to firmware.
*/
void kbase_csf_ring_csg_doorbell(struct kbase_device *kbdev, int slot);
/**
* kbase_csf_ring_csg_slots_doorbell - ring the doorbell for a set of CSG
* interfaces.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
* @slot_bitmap: bitmap for the given slots, slot-0 on bit-0, etc.
*
* The function kicks a notification on a set of CSG interfaces to firmware.
*/
void kbase_csf_ring_csg_slots_doorbell(struct kbase_device *kbdev, u32 slot_bitmap);
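/*
* Illustrative sketch only, not part of the driver: a caller might build the
* @slot_bitmap argument from a list of CSG slot indices as below, following
* the "slot-0 on bit-0" convention documented above. The helper name
* example_build_slot_bitmap() is hypothetical.
*/

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set bit N of the result for every slot index N in the input array. */
static uint32_t example_build_slot_bitmap(const unsigned int *slots, size_t num_slots)
{
	uint32_t bitmap = 0;
	size_t i;

	for (i = 0; i < num_slots; i++) {
		/* A u32 bitmap can only describe slots 0..31. */
		assert(slots[i] < 32);
		bitmap |= (uint32_t)1 << slots[i];
	}
	return bitmap;
}
```

/*
* Slots 0, 2 and 5 give the bitmap 0x25, and an empty list gives 0.
*/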
/**
* kbase_csf_ring_cs_kernel_doorbell - ring the kernel doorbell for a CSI
* assigned to a GPU queue
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
* @csi_index: ID of the CSI assigned to the GPU queue.
* @csg_nr: Index of the CSG slot assigned to the queue
* group to which the GPU queue is bound.
* @ring_csg_doorbell: Flag to indicate if the CSG doorbell needs to be rung
* after updating the CSG_DB_REQ. So if this flag is false
* the doorbell interrupt will not be sent to FW.
* The flag is supposed to be false only when the input page
* for bound GPU queues is programmed at the time of
* starting/resuming the group on a CSG slot.
*
* The function sends a doorbell interrupt notification to the firmware for
* a CSI assigned to a GPU queue.
*/
void kbase_csf_ring_cs_kernel_doorbell(struct kbase_device *kbdev, int csi_index, int csg_nr,
bool ring_csg_doorbell);
/**
* kbase_csf_ring_cs_user_doorbell - ring the user doorbell allocated for a
* queue.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
* @queue: Pointer to the queue for ringing the door-bell.
*
* The function kicks a notification to the firmware on the doorbell assigned
* to the queue.
*/
void kbase_csf_ring_cs_user_doorbell(struct kbase_device *kbdev, struct kbase_queue *queue);
/**
* kbase_csf_active_queue_groups_reset - Reset the state of all active GPU
* command queue groups associated with the context.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
* @kctx: The kbase context.
*
* This function will iterate through all the active/scheduled GPU command
* queue groups associated with the context, deschedule and mark them as
* terminated (which will then lead to unbinding of all the queues bound to
* them) and also no more work would be allowed to execute for them.
*
* This is similar to the action taken in response to an unexpected OoM event.
*/
void kbase_csf_active_queue_groups_reset(struct kbase_device *kbdev, struct kbase_context *kctx);
/**
* kbase_csf_priority_check - Check the priority requested
*
* @kbdev: Device pointer
* @req_priority: Requested priority
*
* This will determine whether the requested priority can be satisfied.
*
* Return: The same or lower priority than requested.
*/
u8 kbase_csf_priority_check(struct kbase_device *kbdev, u8 req_priority);
extern const u8 kbasep_csf_queue_group_priority_to_relative[BASE_QUEUE_GROUP_PRIORITY_COUNT];
extern const u8 kbasep_csf_relative_to_queue_group_priority[KBASE_QUEUE_GROUP_PRIORITY_COUNT];
/**
* kbase_csf_priority_relative_to_queue_group_priority - Convert relative to base priority
*
* @priority: kbase relative priority
*
* This will convert the monotonically increasing relative priority to the
* fixed base priority list.
*
* Return: base_queue_group_priority priority.
*/
static inline u8 kbase_csf_priority_relative_to_queue_group_priority(u8 priority)
{
if (priority >= KBASE_QUEUE_GROUP_PRIORITY_COUNT)
priority = KBASE_QUEUE_GROUP_PRIORITY_LOW;
return kbasep_csf_relative_to_queue_group_priority[priority];
}
/**
* kbase_csf_priority_queue_group_priority_to_relative - Convert base priority to relative
*
* @priority: base_queue_group_priority priority
*
* This will convert the fixed base priority list to monotonically increasing relative priority.
*
* Return: kbase relative priority.
*/
static inline u8 kbase_csf_priority_queue_group_priority_to_relative(u8 priority)
{
/* Apply low priority in case of invalid priority */
if (priority >= BASE_QUEUE_GROUP_PRIORITY_COUNT)
priority = BASE_QUEUE_GROUP_PRIORITY_LOW;
return kbasep_csf_queue_group_priority_to_relative[priority];
}
/**
* kbase_csf_ktrace_gpu_cycle_cnt - Wrapper to retrieve the GPU cycle counter
* value for Ktrace purpose.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*
* This function is just a wrapper to retrieve the GPU cycle counter value, to
* avoid any overhead on Release builds where Ktrace is disabled by default.
*
* Return: Snapshot of the GPU cycle count register.
*/
static inline u64 kbase_csf_ktrace_gpu_cycle_cnt(struct kbase_device *kbdev)
{
#if KBASE_KTRACE_ENABLE
return kbase_backend_get_cycle_cnt(kbdev);
#else
CSTD_UNUSED(kbdev);
return 0;
#endif
}
/**
* kbase_csf_process_queue_kick() - Process a pending kicked GPU command queue.
*
* @queue: Pointer to the queue to process.
*
* This function starts the pending queue, for which the work
* was previously submitted via ioctl call from application thread.
* If the queue is already scheduled and resident, it will be started
* right away, otherwise once the group is made resident.
*/
void kbase_csf_process_queue_kick(struct kbase_queue *queue);
/**
* kbase_csf_process_protm_event_request - Handle protected mode switch request
*
* @group: The group to handle protected mode request
*
* Request to switch to protected mode.
*/
void kbase_csf_process_protm_event_request(struct kbase_queue_group *group);
/**
* kbase_csf_glb_fatal_worker - Worker function for handling GLB FATAL error
*
* @data: Pointer to a work_struct embedded in kbase device.
*
* Handle the GLB fatal error
*/
void kbase_csf_glb_fatal_worker(struct work_struct *const data);
/**
* kbase_csf_queue_oom_state_str() - Helper function to get string
* for kbase queue OoM tracking state.
* @state: kbase OoM track state
*
* Return: string representation of kbase OoM track state
*/
static inline const char *kbase_csf_queue_oom_state_str(enum kbase_csf_queue_oom_state state)
{
switch (state) {
case KBASE_CSF_QUEUE_OOM_NONE:
return "KBASE_CSF_QUEUE_OOM_NONE";
case KBASE_CSF_QUEUE_OOM_PENDING:
return "KBASE_CSF_QUEUE_OOM_PENDING";
case KBASE_CSF_QUEUE_OOM_COMPLETE:
return "KBASE_CSF_QUEUE_OOM_COMPLETE";
case KBASE_CSF_QUEUE_OOM_ERROR_ABORT:
return "KBASE_CSF_QUEUE_OOM_ERROR_ABORT";
default:
return "[UnknownState]";
}
}
/**
* kbase_csf_cs_get_pending_oom - Get the data for the pending OoM event.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
* @queue: Pointer to the queue to process.
* @slot_id: Slot index where the CSG is residing.
*
* The OoM data and the request state is saved in the queue's OoM tracking
* structure.
*
* Return: 0 on success,
* -EINVAL if slot_id is invalid, or tiler OoM state is incorrect.
* -EBUSY if tiler OoM is already in KBASE_CSF_QUEUE_OOM_PENDING state.
*/
int kbase_csf_cs_get_pending_oom(struct kbase_device *kbdev, struct kbase_queue *queue,
int const slot_id);
/**
* kbase_csf_program_cs_oom_prepared_chunk - Program the prepared OoM chunk to CSF.
*
* @queue: Pointer to the queue to process.
* @slot_id: Slot index where the CSG is residing.
* @cs_oom_req: Value for CS_REQ reg to clear the pending Tiler OoM request.
*
*/
void kbase_csf_program_cs_oom_prepared_chunk(struct kbase_queue *queue, u32 slot_id,
u32 cs_oom_req);
/**
* kbase_csf_cs_prepare_pending_oom_tiler_heap_chunk - Prepare the chunk for
* the pending Tiler OoM event.
*
* @queue: Pointer to the queue to process.
*
* The pointer to the allocated chunk is saved in the queue's OoM tracking data and
* the OoM tracking state is updated.
*
* Return: 0 on success,
* -EINVAL if tiler OoM tracking state is incorrect.
* negative error code on allocation failure.
*/
int kbase_csf_cs_prepare_pending_oom_tiler_heap_chunk(struct kbase_queue *queue);
/**
* kbase_csf_free_oom_tiler_heap_chunk - Free the allocated tiler OoM chunk
*
* @queue: Pointer to the queue to process.
*
* This function should be used to free the allocated chunk if the chunk can't be
* programmed to FW. OoM tracking data and state are updated.
*
* Return: 0 on success,
* -EINVAL if tiler OoM tracking state is incorrect.
* negative error code on freeing failure.
*/
int kbase_csf_free_oom_tiler_heap_chunk(struct kbase_queue *queue);
/**
* kbase_csf_handle_pending_oom_interrupt() - Handler for a tiler heap OoM request IRQ.
*
* @queue: Pointer to queue for which OoM event was received.
* @group_id: CSG index.
*
* Get pending OoM request and enqueue the OoM event work.
*
* Return: 0 on success,
* -EBUSY when trying to enqueue an already-queued OoM work.
*/
int kbase_csf_handle_pending_oom_interrupt(struct kbase_queue *const queue, u32 group_id);
/**
* kbase_csf_report_cs_fault_info() - Assemble the CS fault event information.
*
* @queue: Pointer to queue for which a fault event was received.
* @slot_id: On-slot CSG index, where the queue fault was raised.
* @atomic_ctx: Calling from an interrupt handler, or from a kthread.
*
* Assembles the CS fault information and prints it out in a meaningful way in the log. The
* function is expected to be only called when the caller is notified with a valid CS fault
* event and the queue/bound-csg resides in the given slot.
*/
void kbase_csf_report_cs_fault_info(struct kbase_queue *const queue, u32 slot_id, bool atomic_ctx);
/**
* kbase_csf_report_cs_fatal_info() - Assemble the CS fatal information.
*
* @queue: Pointer to queue for which fatal event was received.
* @slot_id: On-slot CSG index, where the queue fatal error was raised.
* @atomic_ctx: Calling from an interrupt handler, or from a kthread.
*
* Assembles the CS fatal information and prints it out in a meaningful way in the log. The
* function is expected to be only called when the caller is notified with a valid CS fatal
* error event and the queue/bound-csg resides in the given slot.
*
* Return: the extracted CS_FATAL_EXCEPTION_TYPE.
*/
u32 kbase_csf_report_cs_fatal_info(struct kbase_queue *const queue, u32 slot_id, bool atomic_ctx);
/**
* kbase_csf_dev_has_ne - Report whether the device has Neural Engine support.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*
* Return: true if Neural Engine is supported, otherwise false.
*/
static inline bool kbase_csf_dev_has_ne(struct kbase_device *kbdev)
{
return kbdev->gpu_props.gpu_features.neural_engine;
}
/**
* kbase_csf_dev_has_rtu - Report whether the device has Ray Traversal support.
*
* @kbdev: Instance of a GPU platform device that implements a CSF interface.
*
* Return: true if Ray Traversal is supported, otherwise false.
*/
static inline bool kbase_csf_dev_has_rtu(struct kbase_device *kbdev)
{
return kbdev->gpu_props.gpu_features.ray_traversal;
}
#endif /* _KBASE_CSF_H_ */
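
The two priority conversion helpers in this header share one pattern: clamp an out-of-range index to the LOW level, then look the value up in a constant table. A minimal standalone sketch of that pattern is given here. All `demo_*` names, the enum ordering and the table contents are hypothetical stand-ins chosen for illustration, not the driver's definitions.

```c
#include <stdint.h>

/* Hypothetical "base" (UAPI-style) priority levels, in a fixed order. */
enum demo_base_priority {
	DEMO_BASE_HIGH,
	DEMO_BASE_MEDIUM,
	DEMO_BASE_LOW,
	DEMO_BASE_REALTIME,
	DEMO_BASE_COUNT
};

/* Hypothetical "relative" levels: 0 is the most urgent, values only grow. */
enum demo_rel_priority {
	DEMO_REL_REALTIME,
	DEMO_REL_HIGH,
	DEMO_REL_MEDIUM,
	DEMO_REL_LOW,
	DEMO_REL_COUNT
};

static const uint8_t demo_base_to_rel[DEMO_BASE_COUNT] = {
	[DEMO_BASE_HIGH] = DEMO_REL_HIGH,
	[DEMO_BASE_MEDIUM] = DEMO_REL_MEDIUM,
	[DEMO_BASE_LOW] = DEMO_REL_LOW,
	[DEMO_BASE_REALTIME] = DEMO_REL_REALTIME,
};

static const uint8_t demo_rel_to_base[DEMO_REL_COUNT] = {
	[DEMO_REL_REALTIME] = DEMO_BASE_REALTIME,
	[DEMO_REL_HIGH] = DEMO_BASE_HIGH,
	[DEMO_REL_MEDIUM] = DEMO_BASE_MEDIUM,
	[DEMO_REL_LOW] = DEMO_BASE_LOW,
};

/* Invalid input is treated as LOW before indexing, so the lookup
 * can never read past the end of the table.
 */
static uint8_t demo_base_to_relative(uint8_t priority)
{
	if (priority >= DEMO_BASE_COUNT)
		priority = DEMO_BASE_LOW;
	return demo_base_to_rel[priority];
}

static uint8_t demo_relative_to_base(uint8_t priority)
{
	if (priority >= DEMO_REL_COUNT)
		priority = DEMO_REL_LOW;
	return demo_rel_to_base[priority];
}
```

Clamping before the lookup keeps the helpers total: any `u8` input yields a valid level, and a valid level survives a round trip through both tables.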


@@ -0,0 +1,138 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2023-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include "mali_kbase_csf_cpu_queue.h"
#include "mali_kbase_csf_util.h"
#include <mali_kbase.h>
#include <asm/atomic.h>
void kbase_csf_cpu_queue_init(struct kbase_context *kctx)
{
if (WARN_ON(!kctx))
return;
kctx->csf.cpu_queue.buffer = NULL;
kctx->csf.cpu_queue.buffer_size = 0;
atomic_set(&kctx->csf.cpu_queue.dump_req_status, BASE_CSF_CPU_QUEUE_DUMP_COMPLETE);
}
bool kbase_csf_cpu_queue_read_dump_req(struct kbase_context *kctx,
struct base_csf_notification *req)
{
if (atomic_cmpxchg(&kctx->csf.cpu_queue.dump_req_status, BASE_CSF_CPU_QUEUE_DUMP_ISSUED,
BASE_CSF_CPU_QUEUE_DUMP_PENDING) != BASE_CSF_CPU_QUEUE_DUMP_ISSUED) {
return false;
}
req->type = BASE_CSF_NOTIFICATION_CPU_QUEUE_DUMP;
return true;
}
bool kbase_csf_cpu_queue_dump_needed(struct kbase_context *kctx)
{
return (atomic_read(&kctx->csf.cpu_queue.dump_req_status) ==
BASE_CSF_CPU_QUEUE_DUMP_ISSUED);
}
int kbase_csf_cpu_queue_dump_buffer(struct kbase_context *kctx, u64 buffer, size_t buf_size)
{
size_t alloc_size = buf_size;
char *dump_buffer;
if (!buffer || !buf_size)
return 0;
if (alloc_size > KBASE_MEM_ALLOC_MAX_SIZE)
return -EINVAL;
alloc_size = (alloc_size + PAGE_SIZE) & ~(PAGE_SIZE - 1);
dump_buffer = kzalloc(alloc_size, GFP_KERNEL);
if (!dump_buffer)
return -ENOMEM;
WARN_ON(kctx->csf.cpu_queue.buffer != NULL);
if (copy_from_user(dump_buffer, u64_to_user_ptr(buffer), buf_size)) {
kfree(dump_buffer);
return -EFAULT;
}
mutex_lock(&kctx->csf.lock);
kfree(kctx->csf.cpu_queue.buffer);
if (atomic_read(&kctx->csf.cpu_queue.dump_req_status) == BASE_CSF_CPU_QUEUE_DUMP_PENDING) {
kctx->csf.cpu_queue.buffer = dump_buffer;
kctx->csf.cpu_queue.buffer_size = buf_size;
complete_all(&kctx->csf.cpu_queue.dump_cmp);
} else
kfree(dump_buffer);
mutex_unlock(&kctx->csf.lock);
return 0;
}
int kbasep_csf_cpu_queue_dump_print(struct kbase_context *kctx, struct kbasep_printer *kbpr)
{
bool timed_out = false;
mutex_lock(&kctx->csf.lock);
if (atomic_read(&kctx->csf.cpu_queue.dump_req_status) != BASE_CSF_CPU_QUEUE_DUMP_COMPLETE) {
kbasep_print(kbpr, "Dump request already started! (try again)\n");
mutex_unlock(&kctx->csf.lock);
return -EBUSY;
}
atomic_set(&kctx->csf.cpu_queue.dump_req_status, BASE_CSF_CPU_QUEUE_DUMP_ISSUED);
init_completion(&kctx->csf.cpu_queue.dump_cmp);
kbase_event_wakeup(kctx);
mutex_unlock(&kctx->csf.lock);
kbasep_print(kbpr, "CPU Queues table (version:v" __stringify(
MALI_CSF_CPU_QUEUE_DUMP_VERSION) "):\n");
if (WARN_ON(!wait_for_completion_timeout(&kctx->csf.cpu_queue.dump_cmp,
msecs_to_jiffies(3000)))) {
kbasep_print(kbpr, "Failed to wait for completion of dump request\n");
timed_out = true;
}
mutex_lock(&kctx->csf.lock);
if (!timed_out && kctx->csf.cpu_queue.buffer) {
WARN_ON(atomic_read(&kctx->csf.cpu_queue.dump_req_status) !=
BASE_CSF_CPU_QUEUE_DUMP_PENDING);
/* The CPU queue dump is returned as a single formatted string */
kbasep_puts(kbpr, kctx->csf.cpu_queue.buffer);
kbasep_puts(kbpr, "\n");
kfree(kctx->csf.cpu_queue.buffer);
kctx->csf.cpu_queue.buffer = NULL;
kctx->csf.cpu_queue.buffer_size = 0;
} else
kbasep_print(kbpr, "Dump error! (timed_out = %d)\n", timed_out);
atomic_set(&kctx->csf.cpu_queue.dump_req_status, BASE_CSF_CPU_QUEUE_DUMP_COMPLETE);
mutex_unlock(&kctx->csf.lock);
return 0;
}
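
The size computation in `kbase_csf_cpu_queue_dump_buffer()` adds a full `PAGE_SIZE` before masking, rather than `PAGE_SIZE - 1`. The sketch below compares the two roundings in plain C. It assumes a 4096-byte page, and the `demo_*` names are hypothetical. One plausible reading, not stated in the source, is that the extra byte guarantees a trailing zero in the `kzalloc()`ed buffer, so that it can later be printed as a string.

```c
#include <stddef.h>

/* Assumed page size for illustration only. */
#define DEMO_PAGE_SIZE ((size_t)4096)

/* Same expression shape as the driver: the result is a page multiple
 * that is always strictly larger than size.
 */
static size_t demo_round_with_spare(size_t size)
{
	return (size + DEMO_PAGE_SIZE) & ~(DEMO_PAGE_SIZE - 1);
}

/* Conventional round-up: an exact page multiple is returned unchanged,
 * which would leave no spare byte after the copied data.
 */
static size_t demo_round_up(size_t size)
{
	return (size + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}
```

For a 4096-byte input the conventional form returns 4096, while the driver-style form returns 8192. For every other size in the same page the two forms agree.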


@@ -0,0 +1,90 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CSF_CPU_QUEUE_H_
#define _KBASE_CSF_CPU_QUEUE_H_
#include <linux/types.h>
/* Forward declaration */
struct base_csf_notification;
struct kbase_context;
struct kbasep_printer;
#define MALI_CSF_CPU_QUEUE_DUMP_VERSION 0
/* CPU queue dump status */
/* Dumping is done or no dumping is in progress. */
#define BASE_CSF_CPU_QUEUE_DUMP_COMPLETE 0
/* Dumping request is pending. */
#define BASE_CSF_CPU_QUEUE_DUMP_PENDING 1
/* Dumping request is issued to Userspace */
#define BASE_CSF_CPU_QUEUE_DUMP_ISSUED 2
/**
* kbase_csf_cpu_queue_init() - Initialise per-context CPU queue handling
*
* @kctx: The kbase_context
*/
void kbase_csf_cpu_queue_init(struct kbase_context *kctx);
/**
* kbase_csf_cpu_queue_read_dump_req() - Read cpu queue dump request event
*
* @kctx: The kbase_context to which the dumped CPU queue belongs.
* @req: Notification with cpu queue dump request.
*
* Return: true if a CPU queue dump is needed, or false otherwise.
*/
bool kbase_csf_cpu_queue_read_dump_req(struct kbase_context *kctx,
struct base_csf_notification *req);
/**
* kbase_csf_cpu_queue_dump_needed() - Check whether a CPU queue dump is required
*
* @kctx: The kbase_context to which the dumped CPU queue belongs.
*
* Return: true if a CPU queue dump is needed, or false otherwise.
*/
bool kbase_csf_cpu_queue_dump_needed(struct kbase_context *kctx);
/**
* kbase_csf_cpu_queue_dump_buffer() - dump buffer containing cpu queue information
*
* @kctx: The kbase_context to which the dumped CPU queue belongs.
* @buffer: Buffer containing the cpu queue information.
* @buf_size: Buffer size.
*
* Return: 0 on success, or a negative error code.
*/
int kbase_csf_cpu_queue_dump_buffer(struct kbase_context *kctx, u64 buffer, size_t buf_size);
/**
* kbasep_csf_cpu_queue_dump_print() - Dump cpu queue information to file
*
* @kctx: The kbase_context to which the dumped CPU queue belongs.
* @kbpr: Pointer to printer instance.
*
* Return: 0 on success, or a negative error code.
*/
int kbasep_csf_cpu_queue_dump_print(struct kbase_context *kctx, struct kbasep_printer *kbpr);
#endif /* _KBASE_CSF_CPU_QUEUE_H_ */
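
The three dump status values defined in this header form a small cycle: COMPLETE (idle) to ISSUED (request posted to userspace) to PENDING (userspace has picked the request up) and back to COMPLETE. The sketch below models that cycle in userspace C11 with `<stdatomic.h>`. It is an illustration under assumptions: the `demo_*` names are hypothetical, and the driver itself uses the kernel's `atomic_t` API, taking `kctx->csf.lock` around the COMPLETE to ISSUED step instead of using a compare-and-swap there.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Same numeric values as the BASE_CSF_CPU_QUEUE_DUMP_* macros. */
#define DEMO_DUMP_COMPLETE 0
#define DEMO_DUMP_PENDING 1
#define DEMO_DUMP_ISSUED 2

static atomic_int demo_status = DEMO_DUMP_COMPLETE;

/* Requester side: a new dump may start only while the state is idle. */
static bool demo_issue(void)
{
	int expected = DEMO_DUMP_COMPLETE;

	return atomic_compare_exchange_strong(&demo_status, &expected, DEMO_DUMP_ISSUED);
}

/* Reader side: exactly one caller can claim an issued request,
 * because the compare-and-swap succeeds only once per ISSUED state.
 */
static bool demo_claim(void)
{
	int expected = DEMO_DUMP_ISSUED;

	return atomic_compare_exchange_strong(&demo_status, &expected, DEMO_DUMP_PENDING);
}

/* Requester side: return to idle after printing or after a timeout. */
static void demo_finish(void)
{
	atomic_store(&demo_status, DEMO_DUMP_COMPLETE);
}
```

The claim step corresponds to the `atomic_cmpxchg()` in `kbase_csf_cpu_queue_read_dump_req()`: a second reader sees PENDING instead of ISSUED and gets `false`.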


@@ -0,0 +1,89 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2020-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include "mali_kbase_csf_cpu_queue_debugfs.h"
#if IS_ENABLED(CONFIG_DEBUG_FS)
#include "mali_kbase_csf_cpu_queue.h"
#include "mali_kbase_csf_util.h"
#include <mali_kbase.h>
#include <linux/seq_file.h>
/**
* kbasep_csf_cpu_queue_debugfs_show() - Print per-context CPU queue information
*
* @file: The seq_file for printing to
* @data: The debugfs dentry private data, a pointer to kbase_context
*
* Return: Negative error code or 0 on success.
*/
static int kbasep_csf_cpu_queue_debugfs_show(struct seq_file *file, void *data)
{
struct kbasep_printer *kbpr;
struct kbase_context *const kctx = file->private;
int ret = -EINVAL;
CSTD_UNUSED(data);
kbpr = kbasep_printer_file_init(file);
if (kbpr != NULL) {
ret = kbasep_csf_cpu_queue_dump_print(kctx, kbpr);
kbasep_printer_term(kbpr);
}
return ret;
}
static int kbasep_csf_cpu_queue_debugfs_open(struct inode *in, struct file *file)
{
return single_open(file, kbasep_csf_cpu_queue_debugfs_show, in->i_private);
}
static const struct file_operations kbasep_csf_cpu_queue_debugfs_fops = {
.open = kbasep_csf_cpu_queue_debugfs_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
void kbase_csf_cpu_queue_debugfs_init(struct kbase_context *kctx)
{
struct dentry *file;
if (WARN_ON(!kctx || IS_ERR_OR_NULL(kctx->kctx_dentry)))
return;
file = debugfs_create_file("cpu_queue", 0444, kctx->kctx_dentry, kctx,
&kbasep_csf_cpu_queue_debugfs_fops);
if (IS_ERR_OR_NULL(file)) {
dev_warn(kctx->kbdev->dev, "Unable to create cpu queue debugfs entry");
}
}
#else
/*
* Stub functions for when debugfs is disabled
*/
void kbase_csf_cpu_queue_debugfs_init(struct kbase_context *kctx)
{
}
#endif /* CONFIG_DEBUG_FS */


@@ -0,0 +1,35 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
*
* (C) COPYRIGHT 2020-2023 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#ifndef _KBASE_CSF_CPU_QUEUE_DEBUGFS_H_
#define _KBASE_CSF_CPU_QUEUE_DEBUGFS_H_
/* Forward declaration */
struct kbase_context;
/**
* kbase_csf_cpu_queue_debugfs_init() - Create a debugfs entry for the per-context CPU queue(s)
*
* @kctx: The kbase_context for which to create the debugfs entry
*/
void kbase_csf_cpu_queue_debugfs_init(struct kbase_context *kctx);
#endif /* _KBASE_CSF_CPU_QUEUE_DEBUGFS_H_ */


@@ -0,0 +1,709 @@
// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
/*
*
* (C) COPYRIGHT 2023-2024 ARM Limited. All rights reserved.
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, you can access it online at
* http://www.gnu.org/licenses/gpl-2.0.html.
*
*/
#include "mali_kbase_csf_csg.h"
#include "mali_kbase_csf_scheduler.h"
#include "mali_kbase_csf_util.h"
#include <mali_kbase.h>
#include <linux/delay.h>
#include <backend/gpu/mali_kbase_pm_internal.h>
/* Wait time to be used cumulatively for all the CSG slots.
* Since scheduler lock is held when STATUS_UPDATE request is sent, there won't be
* any other Host request pending on the FW side and usually FW would be responsive
* to the Doorbell IRQs as it won't do any polling for a long time and also it won't
* have to wait for any HW state transition to complete for publishing the status.
* So it is reasonable to expect that handling of STATUS_UPDATE request would be
* relatively very quick.
*/
#define STATUS_UPDATE_WAIT_TIMEOUT_NS 500
/* Number of nearby commands around the "cmd_ptr" of GPU queues.
*
* [cmd_ptr - MAX_NR_NEARBY_INSTR, cmd_ptr + MAX_NR_NEARBY_INSTR].
*/
#define MAX_NR_NEARBY_INSTR 32
/* The bitmask of CSG slots for which the STATUS_UPDATE request completed.
* The access to it is serialized with scheduler lock, so at a time it would
* get used either for "active_groups" or per context "groups".
*/
static DECLARE_BITMAP(csg_slots_status_updated, BASEP_QUEUE_GROUP_MAX);
/* String header for dumping cs user I/O status information */
#define KBASEP_CSF_CSG_DUMP_CS_HEADER_USER_IO \
"Bind Idx, Ringbuf addr, Size, Prio, Insert offset, Extract offset, Active, Doorbell\n"
/* String representation of WAITING */
#define WAITING "Waiting"
/* String representation of NOT_WAITING */
#define NOT_WAITING "Not waiting"
/**
* csg_slot_status_update_finish() - Complete STATUS_UPDATE request for a group slot.
*
* @kbdev: Pointer to kbase device.
* @csg_nr: The group slot number.
*
* Return: True if completed, false otherwise.
*/
static bool csg_slot_status_update_finish(struct kbase_device *kbdev, u32 csg_nr)
{
u32 csg_req;
u32 csg_ack;
csg_req = kbase_csf_fw_io_group_input_read(&kbdev->csf.fw_io, csg_nr, CSG_REQ);
csg_ack = kbase_csf_fw_io_group_read(&kbdev->csf.fw_io, csg_nr, CSG_ACK);
return !((csg_req ^ csg_ack) & CSG_REQ_STATUS_UPDATE_MASK);
}
/**
* csg_slots_status_update_finish() - Complete STATUS_UPDATE requests for all group slots.
*
* @kbdev: Pointer to kbase device.
* @slots_mask: The group slots mask.
*
* Return: True if completed, false otherwise.
*/
static bool csg_slots_status_update_finish(struct kbase_device *kbdev,
const unsigned long *slots_mask)
{
const u32 max_csg_slots = kbdev->csf.global_iface.group_num;
bool changed = false;
u32 csg_nr;
lockdep_assert_held(&kbdev->csf.scheduler.lock);
for_each_set_bit(csg_nr, slots_mask, max_csg_slots) {
if (csg_slot_status_update_finish(kbdev, csg_nr)) {
set_bit(csg_nr, csg_slots_status_updated);
changed = true;
}
}
return changed;
}
/**
* wait_csg_slots_status_update_finish() - Wait for completion of STATUS_UPDATE requests for all
* group slots.
*
* @kbdev: Pointer to kbase device.
* @slots_mask: The group slots mask.
*/
static void wait_csg_slots_status_update_finish(struct kbase_device *kbdev,
unsigned long *slots_mask)
{
const u32 max_csg_slots = kbdev->csf.global_iface.group_num;
long remaining = kbase_csf_timeout_in_jiffies(STATUS_UPDATE_WAIT_TIMEOUT_NS);
lockdep_assert_held(&kbdev->csf.scheduler.lock);
bitmap_zero(csg_slots_status_updated, max_csg_slots);
while (!bitmap_empty(slots_mask, max_csg_slots) && remaining) {
remaining = kbase_csf_fw_io_wait_event_timeout(
&kbdev->csf.fw_io, kbdev->csf.event_wait,
csg_slots_status_update_finish(kbdev, slots_mask), remaining);
if (likely(remaining > 0)) {
bitmap_andnot(slots_mask, slots_mask, csg_slots_status_updated,
max_csg_slots);
} else if (!remaining) {
dev_warn(kbdev->dev, "STATUS_UPDATE request timed out for slots 0x%lx",
slots_mask[0]);
}
}
}
/**
* blocked_reason_to_string() - Convert blocking reason id to a string
*
* @reason_id: blocked_reason
*
* Return: Suitable string
*/
static const char *blocked_reason_to_string(u32 reason_id)
{
/* possible blocking reasons of a cs */
static const char *const cs_blocked_reason[] = {
[CS_STATUS_BLOCKED_REASON_REASON_UNBLOCKED] = "UNBLOCKED",
[CS_STATUS_BLOCKED_REASON_REASON_WAIT] = "WAIT",
[CS_STATUS_BLOCKED_REASON_REASON_PROGRESS_WAIT] = "PROGRESS_WAIT",
[CS_STATUS_BLOCKED_REASON_REASON_SYNC_WAIT] = "SYNC_WAIT",
[CS_STATUS_BLOCKED_REASON_REASON_DEFERRED] = "DEFERRED",
[CS_STATUS_BLOCKED_REASON_REASON_RESOURCE] = "RESOURCE",
[CS_STATUS_BLOCKED_REASON_REASON_FLUSH] = "FLUSH"
};
if (WARN_ON(reason_id >= ARRAY_SIZE(cs_blocked_reason)))
return "UNKNOWN_BLOCKED_REASON_ID";
return cs_blocked_reason[reason_id];
}
/**
* sb_source_supported() - Check SB_SOURCE GLB version support
*
* @glb_version: The GLB version
*
* Return: true if SB_SOURCE is supported by this GLB version, false otherwise.
*/
static bool sb_source_supported(u32 glb_version)
{
bool supported = false;
if (((GLB_VERSION_MAJOR_GET(glb_version) == 3) &&
(GLB_VERSION_MINOR_GET(glb_version) >= 5)) ||
((GLB_VERSION_MAJOR_GET(glb_version) == 2) &&
(GLB_VERSION_MINOR_GET(glb_version) >= 6)) ||
((GLB_VERSION_MAJOR_GET(glb_version) == 1) &&
(GLB_VERSION_MINOR_GET(glb_version) >= 3)))
supported = true;
return supported;
}
/**
* progress_counters_supported() - Check PROGRESS_COUNTER GLB version support
*
* @glb_version: The GLB version
*
* Return: true if progress counters are supported by this GLB version, false otherwise.
*/
static bool progress_counters_supported(u32 glb_version)
{
return !(GLB_VERSION_MAJOR_GET(glb_version) >= 4);
}
/**
* kbasep_csf_csg_active_dump_cs_status_wait() - Dump active queue sync status information.
*
* @kctx: Pointer to kbase context.
* @kbpr: Pointer to printer instance.
* @glb_version: The GLB version.
* @wait_status: The CS_STATUS_WAIT value.
* @wait_sync_value: The queue's cached sync value.
* @wait_sync_live_value: The queue's sync object current value.
* @wait_sync_pointer: The queue's sync object pointer.
* @sb_status: The CS_STATUS_SCOREBOARDS value.
* @blocked_reason: The CS_STATUS_BLOCKED_REASON value.
*/
static void kbasep_csf_csg_active_dump_cs_status_wait(struct kbase_context *kctx,
struct kbasep_printer *kbpr, u32 glb_version,
u32 wait_status, u32 wait_sync_value,
u64 wait_sync_live_value,
u64 wait_sync_pointer, u32 sb_status,
u32 blocked_reason)
{
kbasep_print(kbpr, "SB_MASK: %d\n", CS_STATUS_WAIT_SB_MASK_GET(wait_status));
if (sb_source_supported(glb_version))
kbasep_print(kbpr, "SB_SOURCE: %d\n", CS_STATUS_WAIT_SB_SOURCE_GET(wait_status));
if (progress_counters_supported(glb_version)) {
kbasep_print(kbpr, "PROGRESS_WAIT: %s\n",
CS_STATUS_WAIT_PROGRESS_WAIT_GET(wait_status) ? WAITING : NOT_WAITING);
}
kbasep_print(kbpr, "PROTM_PEND: %s\n",
CS_STATUS_WAIT_PROTM_PEND_GET(wait_status) ? WAITING : NOT_WAITING);
kbasep_print(kbpr, "SYNC_WAIT: %s\n",
CS_STATUS_WAIT_SYNC_WAIT_GET(wait_status) ? WAITING : NOT_WAITING);
kbasep_print(kbpr, "WAIT_CONDITION: %s\n",
CS_STATUS_WAIT_SYNC_WAIT_CONDITION_GET(wait_status) ? "greater than" :
"less or equal");
kbasep_print(kbpr, "SYNC_POINTER: 0x%llx\n", wait_sync_pointer);
kbasep_print(kbpr, "SYNC_VALUE: %u\n", wait_sync_value);
kbasep_print(kbpr, "SYNC_LIVE_VALUE: 0x%016llx\n", wait_sync_live_value);
kbasep_print(kbpr, "SB_STATUS: %u\n", CS_STATUS_SCOREBOARDS_NONZERO_GET(sb_status));
kbasep_print(kbpr, "BLOCKED_REASON: %s\n",
blocked_reason_to_string(CS_STATUS_BLOCKED_REASON_REASON_GET(blocked_reason)));
}
/**
* kbasep_csf_csg_active_dump_cs_trace() - Dump active queue CS trace information.
*
* @kctx: Pointer to kbase context.
* @kbpr: Pointer to printer instance.
* @group_id: CSG index.
* @stream_id: CS index.
*/
static void kbasep_csf_csg_active_dump_cs_trace(struct kbase_context *kctx,
struct kbasep_printer *kbpr, u32 group_id,
u32 stream_id)
{
u32 val;
u64 addr;
val = kbase_csf_fw_io_stream_input_read(&kctx->kbdev->csf.fw_io, group_id, stream_id,
CS_INSTR_BUFFER_BASE_LO);
addr = ((u64)kbase_csf_fw_io_stream_input_read(&kctx->kbdev->csf.fw_io, group_id, stream_id,
CS_INSTR_BUFFER_BASE_HI)
<< 32) |
val;
val = kbase_csf_fw_io_stream_input_read(&kctx->kbdev->csf.fw_io, group_id, stream_id,
CS_INSTR_BUFFER_SIZE);
kbasep_print(kbpr, "CS_TRACE_BUF_ADDR: 0x%16llx, SIZE: %u\n", addr, val);
/* Write offset variable address (pointer) */
val = kbase_csf_fw_io_stream_input_read(&kctx->kbdev->csf.fw_io, group_id, stream_id,
CS_INSTR_BUFFER_OFFSET_POINTER_LO);
addr = ((u64)kbase_csf_fw_io_stream_input_read(&kctx->kbdev->csf.fw_io, group_id, stream_id,
CS_INSTR_BUFFER_OFFSET_POINTER_HI)
<< 32) |
val;
kbasep_print(kbpr, "CS_TRACE_BUF_OFFSET_PTR: 0x%16llx\n", addr);
/* EVENT_SIZE and EVENT_STATEs */
val = kbase_csf_fw_io_stream_input_read(&kctx->kbdev->csf.fw_io, group_id, stream_id,
CS_INSTR_CONFIG);
kbasep_print(kbpr, "TRACE_EVENT_SIZE: 0x%x, TRACE_EVENT_STATES 0x%x\n",
CS_INSTR_CONFIG_EVENT_SIZE_GET(val), CS_INSTR_CONFIG_EVENT_STATE_GET(val));
}
/**
* kbasep_csf_read_cmdbuff_value() - Read a command from a queue offset.
*
* @queue: Address of a GPU command queue to examine.
* @cmdbuff_offset: GPU address offset in queue's memory buffer.
*
* Return: Encoded CSF command (64-bit)
*/
static u64 kbasep_csf_read_cmdbuff_value(struct kbase_queue *queue, u32 cmdbuff_offset)
{
u64 page_off = cmdbuff_offset >> PAGE_SHIFT;
u64 offset_within_page = cmdbuff_offset & ~PAGE_MASK;
struct page *page = as_page(queue->queue_reg->gpu_alloc->pages[page_off]);
u64 *cmdbuff = vmap(&page, 1, VM_MAP, pgprot_noncached(PAGE_KERNEL));
u64 value;
if (!cmdbuff) {
struct kbase_context *kctx = queue->kctx;
dev_info(kctx->kbdev->dev, "%s failed to map the buffer page to read a command!",
__func__);
/* Fall back to 0 so that the dump operation can continue */
value = 0;
} else {
value = cmdbuff[offset_within_page / sizeof(u64)];
vunmap(cmdbuff);
}
return value;
}
/**
* kbasep_csf_csg_active_dump_cs_status_cmd_ptr() - Dump CMD_PTR information and nearby commands.
*
* @kbpr: Pointer to printer instance.
* @queue: Address of a GPU command queue to examine.
* @cmd_ptr: CMD_PTR address.
*/
static void kbasep_csf_csg_active_dump_cs_status_cmd_ptr(struct kbasep_printer *kbpr,
struct kbase_queue *queue, u64 cmd_ptr)
{
u64 cmd_ptr_offset;
u64 cursor, end_cursor, instr;
u32 nr_nearby_instr_size;
struct kbase_va_region *reg;
kbase_gpu_vm_lock(queue->kctx);
reg = kbase_region_tracker_find_region_enclosing_address(queue->kctx, cmd_ptr);
if (reg && !(reg->flags & KBASE_REG_FREE) && (reg->flags & KBASE_REG_CPU_RD) &&
(reg->gpu_alloc->type == KBASE_MEM_TYPE_NATIVE)) {
kbasep_print(kbpr, "CMD_PTR region nr_pages: %zu\n", reg->nr_pages);
nr_nearby_instr_size = MAX_NR_NEARBY_INSTR * sizeof(u64);
cmd_ptr_offset = cmd_ptr - queue->base_addr;
cursor = (cmd_ptr_offset > nr_nearby_instr_size) ?
cmd_ptr_offset - nr_nearby_instr_size :
0;
end_cursor = cmd_ptr_offset + nr_nearby_instr_size;
if (end_cursor > queue->size)
end_cursor = queue->size;
kbasep_print(kbpr,
"queue:GPU-%u-%u-%u at:0x%.16llx cmd_ptr:0x%.16llx "
"dump_begin:0x%.16llx dump_end:0x%.16llx\n",
queue->kctx->id, queue->group->handle, queue->csi_index,
(queue->base_addr + cursor), cmd_ptr, (queue->base_addr + cursor),
(queue->base_addr + end_cursor));
while ((cursor < end_cursor)) {
instr = kbasep_csf_read_cmdbuff_value(queue, (u32)cursor);
if (instr != 0)
kbasep_print(kbpr,
"queue:GPU-%u-%u-%u at:0x%.16llx cmd:0x%.16llx\n",
queue->kctx->id, queue->group->handle,
queue->csi_index, (queue->base_addr + cursor), instr);
cursor += sizeof(u64);
}
}
kbase_gpu_vm_unlock(queue->kctx);
}
/**
* kbasep_csf_csg_active_dump_queue() - Dump GPU command queue debug information.
*
* @kbpr: Pointer to printer instance.
* @queue: Address of a GPU command queue to examine
*/
static void kbasep_csf_csg_active_dump_queue(struct kbasep_printer *kbpr, struct kbase_queue *queue)
{
u64 *addr;
u32 *addr32;
u64 cs_extract;
u64 cs_insert;
u32 cs_active;
u64 wait_sync_pointer;
u32 wait_status, wait_sync_value;
u32 sb_status;
u32 blocked_reason;
struct kbase_vmap_struct *mapping;
u64 *evt;
u64 wait_sync_live_value;
u32 glb_version;
u64 cmd_ptr;
if (!queue)
return;
glb_version = queue->kctx->kbdev->csf.global_iface.version;
if (WARN_ON(queue->csi_index == KBASEP_IF_NR_INVALID || !queue->group))
return;
addr = queue->user_io_addr;
cs_insert = addr[CS_INSERT_LO / sizeof(*addr)];
addr = queue->user_io_addr + PAGE_SIZE / sizeof(*addr);
cs_extract = addr[CS_EXTRACT_LO / sizeof(*addr)];
addr32 = (u32 *)(queue->user_io_addr + PAGE_SIZE / sizeof(*addr));
cs_active = addr32[CS_ACTIVE / sizeof(*addr32)];
kbasep_puts(kbpr, KBASEP_CSF_CSG_DUMP_CS_HEADER_USER_IO);
kbasep_print(kbpr, "%8d, %16llx, %8x, %4u, %16llx, %16llx, %6u, %8d\n", queue->csi_index,
queue->base_addr, queue->size, queue->priority, cs_insert, cs_extract,
cs_active, queue->doorbell_nr);
/* For an off-slot group, print the saved status if the queue is blocked waiting on a
* sync object. For an on-slot queue, read the live status from the firmware interface
* and, if cs_trace is enabled, dump the interface's cs_trace configuration.
*/
if (kbase_csf_scheduler_group_get_slot(queue->group) < 0) {
kbasep_print(kbpr, "SAVED_CMD_PTR: 0x%llx\n", queue->saved_cmd_ptr);
if (CS_STATUS_WAIT_SYNC_WAIT_GET(queue->status_wait)) {
wait_status = queue->status_wait;
wait_sync_value = queue->sync_value;
wait_sync_pointer = queue->sync_ptr;
sb_status = queue->sb_status;
blocked_reason = queue->blocked_reason;
evt = (u64 *)kbase_phy_alloc_mapping_get(queue->kctx, wait_sync_pointer,
&mapping);
if (evt) {
wait_sync_live_value = evt[0];
kbase_phy_alloc_mapping_put(queue->kctx, mapping);
} else {
wait_sync_live_value = U64_MAX;
}
kbasep_csf_csg_active_dump_cs_status_wait(
queue->kctx, kbpr, glb_version, wait_status, wait_sync_value,
wait_sync_live_value, wait_sync_pointer, sb_status, blocked_reason);
}
kbasep_csf_csg_active_dump_cs_status_cmd_ptr(kbpr, queue, queue->saved_cmd_ptr);
} else {
struct kbase_device *kbdev = queue->group->kctx->kbdev;
u32 group_id = queue->group->csg_nr;
u32 stream_id = queue->csi_index;
u32 req_res;
cmd_ptr = kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id, stream_id,
CS_STATUS_CMD_PTR_LO);
cmd_ptr |= (u64)kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id, stream_id,
CS_STATUS_CMD_PTR_HI)
<< 32;
req_res = kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id, stream_id,
CS_STATUS_REQ_RESOURCE);
kbasep_print(kbpr, "CMD_PTR: 0x%llx\n", cmd_ptr);
kbasep_print(kbpr, "REQ_RESOURCE [COMPUTE]: %d\n",
CS_STATUS_REQ_RESOURCE_COMPUTE_RESOURCES_GET(req_res));
kbasep_print(kbpr, "REQ_RESOURCE [FRAGMENT]: %d\n",
CS_STATUS_REQ_RESOURCE_FRAGMENT_RESOURCES_GET(req_res));
kbasep_print(kbpr, "REQ_RESOURCE [TILER]: %d\n",
CS_STATUS_REQ_RESOURCE_TILER_RESOURCES_GET(req_res));
if (kbdev->gpu_props.gpu_id.product_model >= GPU_ID_MODEL_MAKE(14, 0))
kbasep_print(kbpr, "REQ_RESOURCE [NEURAL]: %d\n",
CS_STATUS_REQ_RESOURCE_NEURAL_RESOURCES_GET(req_res));
kbasep_print(kbpr, "REQ_RESOURCE [IDVS]: %d\n",
CS_STATUS_REQ_RESOURCE_IDVS_RESOURCES_GET(req_res));
wait_status = kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id, stream_id,
CS_STATUS_WAIT);
wait_sync_value = kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id,
stream_id, CS_STATUS_WAIT_SYNC_VALUE);
wait_sync_pointer = kbase_csf_fw_io_stream_read(
&kbdev->csf.fw_io, group_id, stream_id, CS_STATUS_WAIT_SYNC_POINTER_LO);
wait_sync_pointer |=
(u64)kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id, stream_id,
CS_STATUS_WAIT_SYNC_POINTER_HI)
<< 32;
sb_status = kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id, stream_id,
CS_STATUS_SCOREBOARDS);
blocked_reason = kbase_csf_fw_io_stream_read(&kbdev->csf.fw_io, group_id, stream_id,
CS_STATUS_BLOCKED_REASON);
evt = (u64 *)kbase_phy_alloc_mapping_get(queue->kctx, wait_sync_pointer, &mapping);
if (evt) {
wait_sync_live_value = evt[0];
kbase_phy_alloc_mapping_put(queue->kctx, mapping);
} else {
wait_sync_live_value = U64_MAX;
}
kbasep_csf_csg_active_dump_cs_status_wait(queue->kctx, kbpr, glb_version,
wait_status, wait_sync_value,
wait_sync_live_value, wait_sync_pointer,
sb_status, blocked_reason);
/* Dump the cs_trace configuration if tracing is enabled for this queue */
if (kbase_csf_scheduler_queue_has_trace(queue))
kbasep_csf_csg_active_dump_cs_trace(queue->kctx, kbpr, group_id, stream_id);
else
kbasep_print(kbpr, "NO CS_TRACE\n");
kbasep_csf_csg_active_dump_cs_status_cmd_ptr(kbpr, queue, cmd_ptr);
}
}
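The on-slot branch above builds 64-bit values (`cmd_ptr`, `wait_sync_pointer`) from a pair of 32-bit `_LO`/`_HI` register reads. A minimal sketch of that combination step follows; `combine_lo_hi()` is a hypothetical name and the register reads are replaced by plain arguments.

```c
#include <stdint.h>

/*
 * Combine two 32-bit halves into one 64-bit value. The high half must be
 * widened to 64 bits before shifting; shifting a 32-bit value by 32 is
 * undefined behaviour in C.
 */
static uint64_t combine_lo_hi(uint32_t lo, uint32_t hi)
{
	return (uint64_t)lo | ((uint64_t)hi << 32);
}
```

Because the two halves are read separately, the combined value may be inconsistent if the underlying register changes between the reads; for a debug dump this is usually acceptable.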
/**
* kbasep_csf_csg_active_dump_group() - Dump an active group.
*
* @kbpr: Pointer to printer instance.
* @group: GPU group.
*/
static void kbasep_csf_csg_active_dump_group(struct kbasep_printer *kbpr,
struct kbase_queue_group *const group)
{
if (kbase_csf_scheduler_group_get_slot(group) >= 0) {
struct kbase_device *const kbdev = group->kctx->kbdev;
u32 ep_c, ep_r;
char exclusive;
char idle = 'N';
u8 slot_priority = kbdev->csf.scheduler.csg_slots[group->csg_nr].priority;
ep_c = kbase_csf_fw_io_group_read(&kbdev->csf.fw_io, group->csg_nr,
CSG_STATUS_EP_CURRENT);
ep_r = kbase_csf_fw_io_group_read(&kbdev->csf.fw_io, group->csg_nr,
CSG_STATUS_EP_REQ);
if (CSG_STATUS_EP_REQ_EXCLUSIVE_COMPUTE_GET(ep_r))
exclusive = 'C';
else if (CSG_STATUS_EP_REQ_EXCLUSIVE_FRAGMENT_GET(ep_r))
exclusive = 'F';
else if ((kbdev->gpu_props.gpu_id.arch_id >= GPU_ID_ARCH_MAKE(14, 0, 0)) &&
CSG_STATUS_EP_REQ_EXCLUSIVE_NEURAL_GET(ep_r))
exclusive = 'N';
else
exclusive = '0';
if (kbase_csf_fw_io_group_read(&kbdev->csf.fw_io, group->csg_nr, CSG_STATUS_STATE) &
CSG_STATUS_STATE_IDLE_MASK)
idle = 'Y';
if (!test_bit(group->csg_nr, csg_slots_status_updated)) {
kbasep_print(kbpr, "*** Warn: Timed out for STATUS_UPDATE on slot %d\n",
group->csg_nr);
kbasep_print(kbpr, "*** The following group-record is likely stale\n");
}
if (kbdev->gpu_props.gpu_id.product_model >= GPU_ID_MODEL_MAKE(14, 0)) {
kbasep_print(kbpr, "GroupID, CSG NR, CSG Prio, Run State, Priority,"
" C_EP(Alloc/Req), F_EP(Alloc/Req), T_EP(Alloc/Req),"
" N_EP(Alloc/Req), Exclusive, Idle\n");
kbasep_print(
kbpr,
"%7d, %6d, %8d, %9d, %8d, %11d/%3d, %11d/%3d, %11d/%3d, %11d/%3d,"
" %4d, %2d, %9c, %4c\n",
group->handle, group->csg_nr, slot_priority, group->run_state,
group->priority, CSG_STATUS_EP_CURRENT_COMPUTE_EP_GET(ep_c),
CSG_STATUS_EP_REQ_COMPUTE_EP_GET(ep_r),
CSG_STATUS_EP_CURRENT_FRAGMENT_EP_GET(ep_c),
CSG_STATUS_EP_REQ_FRAGMENT_EP_GET(ep_r),
CSG_STATUS_EP_CURRENT_TILER_EP_GET(ep_c),
CSG_STATUS_EP_REQ_TILER_EP_GET(ep_r),
CSG_STATUS_EP_CURRENT_NEURAL_EP_GET(ep_c),
CSG_STATUS_EP_REQ_NEURAL_EP_GET(ep_r), group->comp_pri_threshold,
group->comp_pri_ratio, exclusive, idle);
} else {
kbasep_print(
kbpr,
"GroupID, CSG NR, CSG Prio, Run State, Priority, C_EP(Alloc/Req),"
" F_EP(Alloc/Req), T_EP(Alloc/Req), Exclusive, Idle\n");
kbasep_print(
kbpr,
"%7d, %6d, %8d, %9d, %8d, %11d/%3d, %11d/%3d, %11d/%3d, %9c, %4c\n",
group->handle, group->csg_nr, slot_priority, group->run_state,
group->priority, CSG_STATUS_EP_CURRENT_COMPUTE_EP_GET(ep_c),
CSG_STATUS_EP_REQ_COMPUTE_EP_GET(ep_r),
CSG_STATUS_EP_CURRENT_FRAGMENT_EP_GET(ep_c),
CSG_STATUS_EP_REQ_FRAGMENT_EP_GET(ep_r),
CSG_STATUS_EP_CURRENT_TILER_EP_GET(ep_c),
CSG_STATUS_EP_REQ_TILER_EP_GET(ep_r), exclusive, idle);
}
} else {
kbasep_print(kbpr, "GroupID, CSG NR, Run State, Priority\n");
kbasep_print(kbpr, "%7d, %6d, %9d, %8d\n", group->handle, group->csg_nr,
group->run_state, group->priority);
}
if (group->run_state != KBASE_CSF_GROUP_TERMINATED) {
unsigned int i;
kbasep_print(kbpr, "Bound queues:\n");
for (i = 0; i < BASEP_GPU_QUEUE_PER_QUEUE_GROUP_MAX; i++)
kbasep_csf_csg_active_dump_queue(kbpr, group->bound_queues[i]);
}
}
void kbase_csf_csg_update_status(struct kbase_device *kbdev)
{
u32 max_csg_slots = kbdev->csf.global_iface.group_num;
DECLARE_BITMAP(used_csgs, BASEP_QUEUE_GROUP_MAX) = { 0 };
u32 csg_nr;
unsigned long flags, fw_io_flags;
lockdep_assert_held(&kbdev->csf.scheduler.lock);
/* Do not ring the global doorbell (for a CSG STATUS_UPDATE request) or
* the user doorbell (for an Extract offset update) while the MCU is
* asleep, as doing so would undesirably wake it. Neither ring is needed
* in that case, because FW implicitly updates the status of all on-slot
* groups when the MCU sleep request is sent to it.
*/
if (kbdev->csf.scheduler.state == SCHED_SLEEPING) {
/* Wait for the MCU sleep request to complete. */
kbase_pm_wait_for_desired_state(kbdev);
bitmap_copy(csg_slots_status_updated, kbdev->csf.scheduler.csg_inuse_bitmap,
max_csg_slots);
return;
}
for (csg_nr = 0; csg_nr < max_csg_slots; csg_nr++) {
struct kbase_queue_group *const group =
kbdev->csf.scheduler.csg_slots[csg_nr].resident_group;
if (!group)
continue;
/* Ring the User doorbell for FW to update the Extract offset */
kbase_csf_ring_doorbell(kbdev, group->doorbell_nr);
set_bit(csg_nr, used_csgs);
}
/* Return early if there are no on-slot groups */
if (bitmap_empty(used_csgs, max_csg_slots))
return;
kbase_csf_scheduler_spin_lock(kbdev, &flags);
/* Return early if FW is unresponsive. */
if (kbase_csf_fw_io_open(&kbdev->csf.fw_io, &fw_io_flags)) {
kbase_csf_scheduler_spin_unlock(kbdev, flags);
return;
}
for_each_set_bit(csg_nr, used_csgs, max_csg_slots)
kbase_csf_fw_io_group_write_mask(&kbdev->csf.fw_io, csg_nr, CSG_REQ,
~kbase_csf_fw_io_group_read(&kbdev->csf.fw_io,
csg_nr, CSG_ACK),
CSG_REQ_STATUS_UPDATE_MASK);
BUILD_BUG_ON(BASEP_QUEUE_GROUP_MAX > (sizeof(used_csgs[0]) * BITS_PER_BYTE));
kbase_csf_ring_csg_slots_doorbell(kbdev, used_csgs[0]);
kbase_csf_fw_io_close(&kbdev->csf.fw_io, fw_io_flags);
kbase_csf_scheduler_spin_unlock(kbdev, flags);
wait_csg_slots_status_update_finish(kbdev, used_csgs);
/* Wait for the user doorbell ring to take effect */
msleep(100);
}
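The STATUS_UPDATE request above is raised by writing the inverse of the `CSG_ACK` value into the masked bit of `CSG_REQ`. The sketch below models that toggle-style handshake in user space. It assumes that a masked write replaces only the bits selected by the mask and that a request counts as pending while the REQ and ACK bits differ. The bit position in `STATUS_UPDATE_MASK` and all function names are hypothetical.

```c
#include <stdint.h>

#define STATUS_UPDATE_MASK (UINT32_C(1) << 5) /* hypothetical bit position */

/* Replace only the bits of 'old' selected by 'mask' with those of 'value'. */
static uint32_t write_mask(uint32_t old, uint32_t value, uint32_t mask)
{
	return (old & ~mask) | (value & mask);
}

/* Raise a request: make the masked REQ bit differ from the ACK bit. */
static uint32_t raise_request(uint32_t req, uint32_t ack, uint32_t mask)
{
	return write_mask(req, ~ack, mask);
}

/* A request is pending while the masked REQ and ACK bits differ. */
static int request_pending(uint32_t req, uint32_t ack, uint32_t mask)
{
	return ((req ^ ack) & mask) != 0;
}
```

Under these assumptions the acknowledging side completes a request by making its ACK bit equal to the REQ bit, so neither side ever needs to clear a bit written by the other.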
int kbasep_csf_csg_dump_print(struct kbase_context *const kctx, struct kbasep_printer *kbpr)
{
u32 gr;
struct kbase_device *kbdev;
if (WARN_ON(!kctx))
return -EINVAL;
kbdev = kctx->kbdev;
kbasep_print(kbpr,
"CSF groups status (version: v" __stringify(MALI_CSF_CSG_DUMP_VERSION) "):\n");
mutex_lock(&kctx->csf.lock);
kbase_csf_scheduler_lock(kbdev);
kbase_csf_csg_update_status(kbdev);
kbasep_print(kbpr, "Ctx %d_%d\n", kctx->tgid, kctx->id);
for (gr = 0; gr < MAX_QUEUE_GROUP_NUM; gr++) {
struct kbase_queue_group *const group = kctx->csf.queue_groups[gr];
if (!group)
continue;
kbasep_csf_csg_active_dump_group(kbpr, group);
}
kbase_csf_scheduler_unlock(kbdev);
mutex_unlock(&kctx->csf.lock);
return 0;
}
int kbasep_csf_csg_active_dump_print(struct kbase_device *kbdev, struct kbasep_printer *kbpr)
{
u32 csg_nr;
u32 num_groups;
if (WARN_ON(!kbdev))
return -EINVAL;
num_groups = kbdev->csf.global_iface.group_num;
kbasep_print(kbpr, "CSF active groups status (version: v" __stringify(
MALI_CSF_CSG_DUMP_VERSION) "):\n");
kbase_csf_scheduler_lock(kbdev);
kbase_csf_csg_update_status(kbdev);
for (csg_nr = 0; csg_nr < num_groups; csg_nr++) {
struct kbase_queue_group *const group =
kbdev->csf.scheduler.csg_slots[csg_nr].resident_group;
if (!group)
continue;
kbasep_print(kbpr, "Ctx %d_%d\n", group->kctx->tgid, group->kctx->id);
kbasep_csf_csg_active_dump_group(kbpr, group);
}
kbase_csf_scheduler_unlock(kbdev);
return 0;
}

Some files were not shown because too many files have changed in this diff