MALI: rockchip: upgrade bifrost DDK to g6p0-01eac0, from g2p0-01eac0

Include a new directory include/uapi/gpu/arm/bifrost/, which includes some header files of bifrost device driver. In the original part of g6, the path is include/uapi/gpu/arm/midgard/. I changed the "midgard" to "bifrost", and modified the paths of the header files in .c files. I resolved some conflicts between modifications form ARM and RK, manually. In addition, introduce source files of protected_memory_allocator that might be needed by bifrost_device_driver into build system. Further more, to avoid errors when building in GKI mode, add "WITH Linux-syscall-note" to SPDX tag of uapi headers. Change-Id: I09d500a0fdbc5da352c81dc4fcfbffb5b7f907f5 Signed-off-by: Zhen Chen <chenzhen@rock-chips.com>
2026-06-05 18:41:58 +09:00 · 2021-03-17 14:12:25 +08:00
parent da27a9f52c
commit 404110b7de
382 changed files with 23564 additions and 8341 deletions
--- a/Documentation/ABI/testing/sysfs-device-mali
+++ b/Documentation/ABI/testing/sysfs-device-mali
@@ -0,0 +1,284 @@
+/*
+ *
+ * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the
+ * GNU General Public License version 2 as published by the Free Software
+ * Foundation) and any use by you of this program is subject to the terms
+ * of such GNU licence.
+ *
+ * A copy of the licence is included with the program) and can also be obtained
+ * from Free Software Foundation) Inc.) 51 Franklin Street) Fifth Floor)
+ * Boston) MA  02110-1301) USA.
+ *
+ */
+
+What:		/sys/class/misc/mali%u/device/core_mask
+Description:
+		This attribute is used to restrict number of shader core
+		available, is useful for debugging purposes. Reading
+		this attribute provides us mask of cores available.
+		Writing to it will set the current core mask.
+
+What:		/sys/class/misc/mali%u/device/debug_command
+Description:
+		This attribute is used to issue debug commands that supported
+		by the driver. On reading it provides the list of debug commands
+		that are supported, and writing back one of those commands will
+		enable that debug option.
+
+What:		/sys/class/misc/mali%u/device/dvfs_period
+Description:
+		This is used to set the DVFS sampling period to be used by the
+		driver, On reading it provides the current DVFS sampling period,
+		on writing a value we set the DVFS sampling period.
+
+What:		/sys/class/misc/mali%u/device/dummy_job_wa_info
+Description:
+		This attribute is available only with platform device that
+                supports a Job Manager based GPU that requires a GPU workaround
+		to execute the dummy fragment job on all shader cores to
+		workaround a hang issue.
+
+		Its a readonly attribute and on reading gives details on the
+		options used with the dummy workaround.
+
+What:		/sys/class/misc/mali%u/device/gpuinfo
+Description:
+		This attribute provides description of the present Mali GPU.
+		Its a read only attribute provides details like GPU family, the
+		number of cores, the hardware version and the raw product id.
+
+What:		/sys/class/misc/mali%u/device/idle_hysteresis_time
+Description:
+		This attribute is available only with mali platform
+		device-driver that supports a CSF GPU. This attribute is
+		used to set the duration value in milliseconds for the
+		configuring hysteresis field for determining GPU idle detection.
+
+What:		/sys/class/misc/mali%u/device/js_ctx_scheduling_mode
+Description:
+		This attribute is available only with platform device that
+		supports a Job Manager based GPU. This attribute is used to set
+		context scheduling priority for a job slot.
+
+		On Reading it provides the currently set job slot context
+		priority.
+
+		Writing 0 to this attribute sets it to the mode were
+		higher priority atoms will be scheduled first, regardless of
+		the context they belong to. Newly-runnable higher priority atoms
+		can preempt lower priority atoms currently running on the GPU,
+		even if they belong to a different context.
+
+		Writing 1 to this attribute set it to the mode were the
+		highest-priority atom will be chosen from each context in turn
+		using a round-robin algorithm, so priority only has an effect
+		within the context an atom belongs to. Newly-runnable higher
+		priority atoms can preempt the lower priority atoms currently
+		running on the GPU, but only if they belong to the same context.
+
+What:		/sys/class/misc/mali%u/device/js_scheduling_period
+Description:
+		This attribute is available only with platform device that
+                supports a Job Manager based GPU. Used to set the job scheduler
+		tick period in nano-seconds. The Job Scheduler determines the
+		jobs that are run on the GPU, and for how long, Job Scheduler
+		makes decisions at a regular time interval determined by value
+		in js_scheduling_period.
+
+What:		/sys/class/misc/mali%u/device/js_softstop_always
+Description:
+		This attribute is available only with platform device that
+                supports a Job Manager based GPU. Soft-stops are disabled when
+		only a single context is present, this attribute is used to
+		enable soft-stop when only a single context is present can be
+		used for debug and unit-testing purposes.
+
+What:		/sys/class/misc/mali%u/device/js_timeouts
+Description:
+		This attribute is available only with platform device that
+                supports a Job Manager based GPU. It used to set the soft stop
+		and hard stop times for the job scheduler.
+
+		Writing value 0 causes no change, or -1 to restore the
+		default timeout.
+
+		The format used to set js_timeouts is
+		"<soft_stop_ms> <soft_stop_ms_cl> <hard_stop_ms_ss>
+		<hard_stop_ms_cl> <hard_stop_ms_dumping> <reset_ms_ss>
+		<reset_ms_cl> <reset_ms_dumping>"
+
+
+What:		/sys/class/misc/mali%u/device/lp_mem_pool_max_size
+Description:
+		This attribute is used to set the maximum number of large pages
+		memory pools that the driver can contain. Large pages are of
+		size 2MB. On read it displays all the max size of all memory
+		pools and can be used to modify each individual pools as well.
+
+What:		/sys/class/misc/mali%u/device/lp_mem_pool_size
+Description:
+		This attribute is used to set the number of large memory pages
+		which should be	populated, changing this value may cause
+		existing pages to be removed from the pool, or new pages to be
+		created and then added to the pool. On read it will provide
+		pool size for all available pools and we can modify individual
+		pool.
+
+What:		/sys/class/misc/mali%u/device/mem_pool_max_size
+Description:
+		This attribute is used to set the maximum number of small pages
+		for memory pools that the driver can contain. Here small pages
+		are of size 4KB. On read it will display the max size for all
+		available pools and allows us to set max size of
+		individual pools.
+
+What:		/sys/class/misc/mali%u/device/mem_pool_size
+Description:
+		This attribute is used to set the number of small memory pages
+		which should be populated, changing this value may cause
+		existing pages to be removed from the pool, or new pages to
+		be created and then added to the pool. On read it will provide
+		pool size for all available pools and we can modify individual
+		pool.
+
+What:		/sys/class/misc/mali%u/device/device/mempool/ctx_default_max_size
+Description:
+		This attribute is used to set maximum memory pool size for
+		all the memory pool so that the maximum amount of free memory
+		that each pool can hold is identical.
+
+What:		/sys/class/misc/mali%u/device/device/mempool/lp_max_size
+Description:
+		This attribute is used to set the maximum number of large pages
+		for all memory pools that the driver can contain.
+		Large pages are of size 2MB.
+
+What:		/sys/class/misc/mali%u/device/device/mempool/max_size
+Description:
+		This attribute is used to set the maximum number of small pages
+		for all the memory pools that the driver can contain.
+		Here small pages are of size 4KB.
+
+What:		/sys/class/misc/mali%u/device/pm_poweroff
+Description:
+		This attribute contains the current values, represented as the
+		following space-separated integers:
+		• PM_GPU_POWEROFF_TICK_NS.
+		• PM_POWEROFF_TICK_SHADER.
+		• PM_POWEROFF_TICK_GPU.
+
+		Example:
+		echo 100000 4 4 > /sys/class/misc/mali0/device/pm_poweroff
+
+		Sets the following new values: 100,000ns tick, four ticks
+		for shader power down, and four ticks for GPU power down.
+
+What:		/sys/class/misc/mali%u/device/power_policy
+Description:
+		This attribute is used to find the current power policy been
+		used, reading will list the power policies available and
+		enclosed in square bracket is the current one been selected.
+
+		Example:
+		cat /sys/class/misc/mali0/device/power_policy
+		[demand] coarse_demand always_on
+
+		To switch to a different policy at runtime write the valid entry
+		name back to the attribute.
+
+		Example:
+		echo "coarse_demand" > /sys/class/misc/mali0/device/power_policy
+
+What:		/sys/class/misc/mali%u/device/progress_timeout
+Description:
+		This attribute is available only with mali platform
+		device-driver that supports a CSF GPU. This attribute
+		is used to set the progress timeout value and read the current
+		progress timeout value.
+
+		Progress timeout value is the maximum number of GPU cycles
+		without forward progress to allow to elapse before terminating a
+		GPU command queue group.
+
+What:		/sys/class/misc/mali%u/device/reset_timeout
+Description:
+		This attribute is used to set the number of milliseconds to
+		wait for the soft stop to complete for the GPU jobs before
+		proceeding with the GPU reset.
+
+What:		/sys/class/misc/mali%u/device/soft_job_timeout
+Description:
+		This attribute is available only with platform device that
+                supports a Job Manager based GPU. It used to set the timeout
+		value for waiting for any soft event to complete.
+
+What:		/sys/class/misc/mali%u/device/scheduling/serialize_jobs
+Description:
+		This attribute is available only with platform device that
+                supports a Job Manager based GPU.
+
+		Various options available under this are:
+		• none - for disabling serialization.
+		• intra-slot - Serialize atoms within a slot, only one
+				atom per job slot.
+		• inter-slot - Serialize atoms between slots, only one
+				job slot running at any time.
+		• full - it a combination of both inter and intra slot,
+				so only one atom and one job slot running
+				at any time.
+		• full-reset - full serialization and Reset the GPU after
+				each atom completion
+
+		These options are useful for debugging and investigating
+		failures and gpu hangs to narrow down atoms that could cause
+		troubles.
+
+What:		/sys/class/misc/mali%u/device/firmware_config/Compute iterator count/*
+Description:
+		This attribute is available only with mali platform
+		device-driver that supports a CSF GPU. Its a read-only attribute
+		which indicates the maximum number of Compute iterators
+		supported by the GPU.
+
+What:		/sys/class/misc/mali%u/device/firmware_config/CSHWIF count/*
+Description:
+		This attribute is available only with mali platform
+		device-driver that supports a CSF GPU. Its a read-only
+		attribute which indicates the maximum number of	CSHWIFs
+		supported by the GPU.
+
+What:		/sys/class/misc/mali%u/device/firmware_config/Fragment iterator count/*
+Description:
+		This attribute is available only with mali platform
+		device-driver that supports a CSF GPU. Its a read-only
+		attribute which indicates the maximum number of
+		Fragment iterators supported by the GPU.
+
+What:		/sys/class/misc/mali%u/device/firmware_config/Scoreboard set count/*
+Description:
+		This attribute is available only with mali platform
+		device-driver that supports a CSF GPU. Its a read-only
+		attribute which indicates the maximum number of
+		Scoreboard set supported by the GPU.
+
+What:		/sys/class/misc/mali%u/device/firmware_config/Tiler iterator count/*
+Description:
+		This attribute is available only with mali platform
+		device-driver that supports a CSF GPU. Its a read-only
+		attribute which indicates the maximum number of	Tiler iterators
+		supported by the GPU.
+
+What:		/sys/class/misc/mali%u/device/firmware_config/Log verbosity/*
+Description:
+		This attribute is available only with mali platform
+                device-driver that supports a CSF GPU.
+
+		Used to enable firmware logs, logging levels valid values
+		are indicated using 'min and 'max' attribute values
+		values that are read-only.
+
+		Log level can be set using the 'cur' read, write attribute,
+		we can use a valid log level value from min and max range values
+		and set a valid desired log level for firmware logs.
--- a/Documentation/devicetree/bindings/arm/mali-bifrost.txt
+++ b/Documentation/devicetree/bindings/arm/mali-bifrost.txt
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2013-2020 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2013-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

 * ARM Mali Midgard / Bifrost devices
@@ -46,12 +45,12 @@ Documentation/devicetree/bindings/regulator/regulator.txt for details.
                       This is optional.
 - operating-points-v2 : Refer to Documentation/devicetree/bindings/power/mali-opp.txt
 for details.
- quirks_jm : Used to write to the JM_CONFIG register or equivalent.
+- quirks_gpu : Used to write to the JM_CONFIG or CSF_CONFIG register.
 	  Should be used with care. Options passed here are used to override
 	  certain default behavior. Note: This will override 'idvs-group-size'
 	  field in devicetree and module param 'corestack_driver_control',
-	  therefore if 'quirks_jm' is used then 'idvs-group-size' and
-	  'corestack_driver_control' value should be incorporated into 'quirks_jm'.
+	  therefore if 'quirks_gpu' is used then 'idvs-group-size' and
+	  'corestack_driver_control' value should be incorporated into 'quirks_gpu'.
 - quirks_sc : Used to write to the SHADER_CONFIG register.
 	  Should be used with care. Options passed here are used to override
 	  certain default behavior.
--- a/Documentation/devicetree/bindings/arm/memory_group_manager.txt
+++ b/Documentation/devicetree/bindings/arm/memory_group_manager.txt
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

 * Arm memory group manager for Mali GPU device drivers
--- a/Documentation/devicetree/bindings/arm/priority_control_manager.txt
+++ b/Documentation/devicetree/bindings/arm/priority_control_manager.txt
@@ -0,0 +1,48 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+#
+# This program is free software and is provided to you under the terms of the
+# GNU General Public License version 2 as published by the Free Software
+# Foundation, and any use by you of this program is subject to the terms
+# of such GNU license.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+#
+
+* Arm priority control manager for Mali GPU device drivers
+
+Required properties:
+
+- compatible: Must be "arm,priority-control-manager"
+
+An example node:
+
+        gpu_priority_control_manager: priority-control-manager {
+                compatible = "arm,priority-control-manager";
+        };
+
+It must be referenced by the GPU as well, see priority-control-manager:
+
+	gpu: gpu@0x6e000000 {
+		compatible = "arm,mali-midgard";
+		reg = <0x0 0x6e000000 0x0 0x200000>;
+		interrupts = <0 168 4>, <0 168 4>, <0 168 4>;
+		interrupt-names = "JOB", "MMU", "GPU";
+		clocks = <&scpi_dvfs 2>;
+		clock-names = "clk_mali";
+		system-coherency = <31>;
+		priority-control-manager = <&gpu_priority_control_manager>;
+		operating-points = <
+			/* KHz uV */
+			50000 820000
+		>;
+	};
--- a/Documentation/devicetree/bindings/arm/protected_memory_allocator.txt
+++ b/Documentation/devicetree/bindings/arm/protected_memory_allocator.txt
@@ -0,0 +1,68 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+#
+# This program is free software and is provided to you under the terms of the
+# GNU General Public License version 2 as published by the Free Software
+# Foundation, and any use by you of this program is subject to the terms
+# of such GNU license.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+#
+
+* Arm protected memory allocator for Mali GPU device drivers
+
+Required properties:
+
+- compatible: Must be "arm,protected-memory-allocator"
+
+The protected memory allocator manages allocation of physical pages of a
+reserved memory region of protected memory, therefore its device node shall
+reference a reserved memory region.
+
+In addition to that, the protected memory allocator shall be referenced
+by the GPU.
+
+A complete example configuration for the device tree:
+
+	reserved-memory {
+		#address-cells = <2>;
+		#size-cells = <2>;
+		ranges;
+
+		mali_protected: mali_protected@c0000000 {
+			compatible = "mali-reserved";
+			reg = <0x0 0xc0000000 0x0 0x1000000>;
+		};
+	};
+
+	gpu_protected_memory_allocator: protected-memory-allocator {
+		compatible = "arm,protected-memory-allocator";
+		memory-region = <&mali_protected>;
+	};
+
+	gpu_fpga: gpu@0x6e000000 {
+		compatible = "arm,mali-midgard";
+		reg = <0x0 0x6e000000 0x0 0x200000>;
+		interrupts = <0 168 4>, <0 168 4>, <0 168 4>;
+		interrupt-names = "JOB", "MMU", "GPU";
+		clocks = <&scpi_dvfs 2>;
+		clock-names = "clk_mali";
+		protected-memory-allocator = <&gpu_protected_memory_allocator>;
+		operating-points = <
+			/* KHz uV */
+			50000 820000
+		>;
+	};
+
+The protected memory allocator is gpu_protected_memory_allocator.
+It references the mali_protected reserved memory region and, in turn,
+it is referenced by the GPU as protected-memory-allocator.
--- a/Documentation/devicetree/bindings/power/mali-opp.txt
+++ b/Documentation/devicetree/bindings/power/mali-opp.txt
@@ -1,14 +1,20 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2017, 2019 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2017, 2019-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
-# A copy of the licence is included with the program, and can also be obtained
-# from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
-# Boston, MA  02110-1301, USA.
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
 #
 #

@@ -48,7 +54,7 @@ Optional properties:

 - opp-core-count: Number of cores to use for this OPP. If this is present then
  the driver will build a core mask using the available core mask provided by
-  the GPU hardware.
+  the GPU hardware. An opp-core-count value of 0 is not permitted.

  If neither this nor opp-core-mask are present then all shader cores will be
  used for this OPP.
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -183,6 +183,9 @@ config SOC_BUS

 source "drivers/base/regmap/Kconfig"

+source "drivers/base/arm/memory_group_manager/Kconfig"
+source "drivers/base/arm/protected_memory_allocator/Kconfig"
+
 config DMA_SHARED_BUFFER
 	bool
 	default n
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -18,6 +18,10 @@ obj-$(CONFIG_MODULES)	+= module.o
 endif
 obj-$(CONFIG_SYS_HYPERVISOR) += hypervisor.o
 obj-$(CONFIG_REGMAP)	+= regmap/
+
+obj-$(CONFIG_MALI_MEMORY_GROUP_MANAGER)	+= arm/memory_group_manager/
+obj-$(CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR)	+= arm/protected_memory_allocator/
+
 obj-$(CONFIG_SOC_BUS) += soc.o
 obj-$(CONFIG_PINCTRL) += pinctrl.o
 obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
--- a/drivers/base/arm/memory_group_manager/Kbuild
+++ b/drivers/base/arm/memory_group_manager/Kbuild
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

-obj-$(CONFIG_MALI_MEMORY_GROUP_MANAGER) := memory_group_manager.o
+obj-$(CONFIG_MALI_MEMORY_GROUP_MANAGER) := memory_group_manager.o
--- a/drivers/base/arm/memory_group_manager/Kconfig
+++ b/drivers/base/arm/memory_group_manager/Kconfig
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #


--- a/drivers/base/arm/memory_group_manager/Makefile
+++ b/drivers/base/arm/memory_group_manager/Makefile
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

 # linux build system bootstrap for out-of-tree module
@@ -24,12 +23,15 @@
 # default to building for the host
 ARCH ?= $(shell uname -m)

-ifeq ($(KDIR),)
-$(error Must specify KDIR to point to the kernel to target))
-endif
+# Handle Android Common Kernel source naming
+KERNEL_SRC ?= /lib/modules/$(shell uname -r)/build
+KDIR ?= $(KERNEL_SRC)

 all:
-	$(MAKE) ARCH=$(ARCH) -C $(KDIR) M=$(CURDIR) EXTRA_CFLAGS="-I$(CURDIR)/../../../include" modules CONFIG_MALI_MEMORY_GROUP_MANAGER=m
+	$(MAKE) ARCH=$(ARCH) -C $(KDIR) M=$(CURDIR) EXTRA_CFLAGS="-I$(CURDIR)/../../../../include" modules CONFIG_MALI_MEMORY_GROUP_MANAGER=m

 clean:
 	$(MAKE) ARCH=$(ARCH) -C $(KDIR) M=$(CURDIR) clean
+
+modules_install:
+	$(MAKE) ARCH=$(ARCH) -C $(KDIR) M=$(CURDIR) modules_install
--- a/drivers/base/arm/memory_group_manager/build.bp
+++ b/drivers/base/arm/memory_group_manager/build.bp
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,16 +17,14 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-#ifndef _KBASE_GPU_H_
-#define _KBASE_GPU_H_
-
-#include "mali_kbase_gpu_regmap.h"
-#include "mali_kbase_gpu_fault.h"
-#include "mali_kbase_gpu_coherency.h"
-#include "mali_kbase_gpu_id.h"
-
-#endif /* _KBASE_GPU_H_ */
+bob_kernel_module {
+    name: "memory_group_manager",
+    srcs: [
+        "Kbuild",
+        "memory_group_manager.c",
+    ],
+    kbuild_options: ["CONFIG_MALI_MEMORY_GROUP_MANAGER=m"],
+    defaults: ["kernel_defaults"],
+}
--- a/drivers/base/arm/memory_group_manager/memory_group_manager.c
+++ b/drivers/base/arm/memory_group_manager/memory_group_manager.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <linux/fs.h>
@@ -481,7 +480,6 @@ static struct platform_driver memory_group_manager_driver = {
 	.remove = memory_group_manager_remove,
 	.driver = {
 		.name = "physical-memory-group-manager",
-		.owner = THIS_MODULE,
 		.of_match_table = of_match_ptr(memory_group_manager_dt_ids),
 		/*
 		 * Prevent the mgm_dev from being unbound and freed, as other's
--- a/drivers/base/arm/protected_memory_allocator/Kbuild
+++ b/drivers/base/arm/protected_memory_allocator/Kbuild
@@ -0,0 +1,21 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
+#
+# This program is free software and is provided to you under the terms of the
+# GNU General Public License version 2 as published by the Free Software
+# Foundation, and any use by you of this program is subject to the terms
+# of such GNU license.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+#
+
+obj-$(CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR) := protected_memory_allocator.o
--- a/drivers/base/arm/protected_memory_allocator/Kconfig
+++ b/drivers/base/arm/protected_memory_allocator/Kconfig
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
+#
+# This program is free software and is provided to you under the terms of the
+# GNU General Public License version 2 as published by the Free Software
+# Foundation, and any use by you of this program is subject to the terms
+# of such GNU license.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+#
+
+
+config MALI_PROTECTED_MEMORY_ALLOCATOR
+	tristate "MALI_PROTECTED_MEMORY_ALLOCATOR"
+	help
+	  This option enables an example implementation of a protected memory allocator
+	  for allocation and release of pages of secure memory intended to be used
+	  by the firmware of Mali GPU device drivers.
--- a/drivers/base/arm/protected_memory_allocator/Makefile
+++ b/drivers/base/arm/protected_memory_allocator/Makefile
@@ -0,0 +1,37 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
+#
+# This program is free software and is provided to you under the terms of the
+# GNU General Public License version 2 as published by the Free Software
+# Foundation, and any use by you of this program is subject to the terms
+# of such GNU license.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+#
+
+# linux build system bootstrap for out-of-tree module
+
+# default to building for the host
+ARCH ?= $(shell uname -m)
+
+# Handle Android Common Kernel source naming
+KERNEL_SRC ?= /lib/modules/$(shell uname -r)/build
+KDIR ?= $(KERNEL_SRC)
+
+all:
+	$(MAKE) ARCH=$(ARCH) -C $(KDIR) M=$(CURDIR) EXTRA_CFLAGS="-I$(CURDIR)/../../../../include" modules CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR=m
+
+clean:
+	$(MAKE) ARCH=$(ARCH) -C $(KDIR) M=$(CURDIR) clean
+
+modules_install:
+	$(MAKE) ARCH=$(ARCH) -C $(KDIR) M=$(CURDIR) modules_install
--- a/drivers/base/arm/protected_memory_allocator/build.bp
+++ b/drivers/base/arm/protected_memory_allocator/build.bp
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the
+ * GNU General Public License version 2 as published by the Free Software
+ * Foundation, and any use by you of this program is subject to the terms
+ * of such GNU license.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ */
+
+bob_kernel_module {
+    name: "protected_memory_allocator",
+    srcs: [
+        "Kbuild",
+        "protected_memory_allocator.c",
+    ],
+    kbuild_options: ["CONFIG_MALI_PROTECTED_MEMORY_ALLOCATOR=m"],
+    defaults: ["kernel_defaults"],
+    enabled: false,
+    build_csf_only_module: {
+        enabled: true,
+    },
+}
--- a/drivers/base/arm/protected_memory_allocator/protected_memory_allocator.c
+++ b/drivers/base/arm/protected_memory_allocator/protected_memory_allocator.c
@@ -0,0 +1,551 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the
+ * GNU General Public License version 2 as published by the Free Software
+ * Foundation, and any use by you of this program is subject to the terms
+ * of such GNU license.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ */
+
+#include <linux/version.h>
+#include <linux/of.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/platform_device.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/mm.h>
+#include <linux/io.h>
+#include <linux/protected_memory_allocator.h>
+
+/* Size of a bitfield element in bytes */
+#define BITFIELD_ELEM_SIZE sizeof(u64)
+
+/* We can track whether or not 64 pages are currently allocated in a u64 */
+#define PAGES_PER_BITFIELD_ELEM (BITFIELD_ELEM_SIZE * BITS_PER_BYTE)
+
+/* Order 6 (ie, 64) corresponds to the number of pages held in a bitfield */
+#define ORDER_OF_PAGES_PER_BITFIELD_ELEM 6
+
+/**
+ * struct simple_pma_device -	Simple implementation of a protected memory
+ *				allocator device
+ *
+ * @pma_dev:			Protected memory allocator device pointer
+ * @dev:  			Device pointer
+ * @alloc_pages_bitfield_arr:	Status of all the physical memory pages within the
+ *				protected memory region, one bit per page
+ * @rmem_base:			Base address of the reserved memory region
+ * @rmem_size:			Size of the reserved memory region, in pages
+ * @num_free_pages:		Number of free pages in the memory region
+ * @rmem_lock:			Lock to serialize the allocation and freeing of
+ *				physical pages from the protected memory region
+ */
+struct simple_pma_device {
+	struct protected_memory_allocator_device pma_dev;
+	struct device *dev;
+	u64 *allocated_pages_bitfield_arr;
+	phys_addr_t rmem_base;
+	size_t rmem_size;
+	size_t num_free_pages;
+	spinlock_t rmem_lock;
+};
+
+/**
+ * Number of elements in array 'allocated_pages_bitfield_arr'. If the number of
+ * pages required does not divide exactly by PAGES_PER_BITFIELD_ELEM, adds an
+ * extra page for the remainder.
+ */
+#define ALLOC_PAGES_BITFIELD_ARR_SIZE(num_pages) \
+	((PAGES_PER_BITFIELD_ELEM * (0 != (num_pages % PAGES_PER_BITFIELD_ELEM)) + \
+	num_pages) / PAGES_PER_BITFIELD_ELEM)
+
+/**
+ * Allocate a power-of-two number of pages, N, where
+ * 0 <= N <= ORDER_OF_PAGES_PER_BITFIELD_ELEM - 1.  ie, Up to 32 pages. The routine
+ * fills-in a pma structure and sets the appropriate bits in the allocated-pages
+ * bitfield array but assumes the caller has already determined that these are
+ * already clear.
+ *
+ * This routine always works within only a single allocated-pages bitfield element.
+ * It can be thought of as the 'small-granularity' allocator.
+ */
+static void small_granularity_alloc(struct simple_pma_device *const epma_dev,
+				    size_t alloc_bitfield_idx, size_t start_bit,
+				    size_t order,
+				    struct protected_memory_allocation *pma)
+{
+	size_t i;
+	size_t page_idx;
+	u64 *bitfield;
+	size_t alloc_pages_bitfield_size;
+
+	if (WARN_ON(!epma_dev) ||
+	    WARN_ON(!pma))
+		return;
+
+	WARN(epma_dev->rmem_size == 0, "%s: rmem_size is 0", __func__);
+	alloc_pages_bitfield_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
+
+	WARN(alloc_bitfield_idx >= alloc_pages_bitfield_size,
+	     "%s: idx>bf_size: %zu %zu", __FUNCTION__,
+	     alloc_bitfield_idx, alloc_pages_bitfield_size);
+
+	WARN((start_bit + (1 << order)) > PAGES_PER_BITFIELD_ELEM,
+	     "%s: start=%zu order=%zu ppbe=%zu",
+	     __FUNCTION__, start_bit, order, PAGES_PER_BITFIELD_ELEM);
+
+	bitfield = &epma_dev->allocated_pages_bitfield_arr[alloc_bitfield_idx];
+
+	for (i = 0; i < (1 << order); i++) {
+		/* Check the pages represented by this bit are actually free */
+		WARN (*bitfield & (1ULL << (start_bit + i)),
+		      "in %s: page not free: %zu %zu %.16llx %zu\n",
+		      __FUNCTION__, i, order, *bitfield, alloc_pages_bitfield_size);
+
+		/* Mark the pages as now allocated */
+		*bitfield |= (1ULL << (start_bit + i));
+	}
+
+	/* Compute the page index */
+	page_idx = (alloc_bitfield_idx * PAGES_PER_BITFIELD_ELEM) + start_bit;
+
+	/* Fill-in the allocation struct for the caller */
+	pma->pa = epma_dev->rmem_base + (page_idx << PAGE_SHIFT);
+	pma->order = order;
+}
+
+/**
+ * Allocate a power-of-two number of pages, N, where
+ * N >= ORDER_OF_PAGES_PER_BITFIELD_ELEM. ie, 64 pages or more. The routine fills-in
+ * a pma structure and sets the appropriate bits in the allocated-pages bitfield array
+ * but assumes the caller has already determined that these are already clear.
+ *
+ * Unlike small_granularity_alloc, this routine can work with multiple 64-page groups,
+ * ie multiple elements from the allocated-pages bitfield array. However, it always
+ * works with complete sets of these 64-page groups. It can therefore be thought of
+ * as the 'large-granularity' allocator.
+ */
+static void large_granularity_alloc(struct simple_pma_device *const epma_dev,
+				    size_t start_alloc_bitfield_idx,
+				    size_t order,
+				    struct protected_memory_allocation *pma)
+{
+	size_t i;
+	size_t num_pages_to_alloc = (size_t)1 << order;
+	size_t num_bitfield_elements_needed = num_pages_to_alloc / PAGES_PER_BITFIELD_ELEM;
+	size_t start_page_idx = start_alloc_bitfield_idx * PAGES_PER_BITFIELD_ELEM;
+
+	if (WARN_ON(!epma_dev) ||
+	    WARN_ON(!pma))
+		return;
+
+	/*
+	 * Are there anough bitfield array elements (groups of 64 pages)
+	 * between the start element and the end of the bitfield array
+	 * to fulfill the request?
+	 */
+	WARN((start_alloc_bitfield_idx + order) >= ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size),
+	     "%s: start=%zu order=%zu ms=%zu",
+	     __FUNCTION__, start_alloc_bitfield_idx, order, epma_dev->rmem_size);
+
+	for (i = 0; i < num_bitfield_elements_needed; i++) {
+		u64 *bitfield = &epma_dev->allocated_pages_bitfield_arr[start_alloc_bitfield_idx + i];
+
+		/* We expect all pages that relate to this bitfield element to be free */
+		WARN((*bitfield != 0),
+		     "in %s: pages not free: i=%zu o=%zu bf=%.16llx\n",
+		     __FUNCTION__, i, order, *bitfield);
+
+		/* Mark all the pages for this element as not free */
+		*bitfield = ~0ULL;
+	}
+
+	/* Fill-in the allocation struct for the caller */
+	pma->pa = epma_dev->rmem_base + (start_page_idx  << PAGE_SHIFT);
+	pma->order = order;
+}
+
+static struct protected_memory_allocation *simple_pma_alloc_page(
+	struct protected_memory_allocator_device *pma_dev, unsigned int order)
+{
+	struct simple_pma_device *const epma_dev =
+		container_of(pma_dev, struct simple_pma_device, pma_dev);
+	struct protected_memory_allocation *pma;
+	size_t num_pages_to_alloc;
+
+	u64 *bitfields = epma_dev->allocated_pages_bitfield_arr;
+	size_t i;
+	size_t bit;
+	size_t count;
+
+	dev_dbg(epma_dev->dev, "%s(pma_dev=%px, order=%u\n",
+		__func__, (void *)pma_dev, order);
+
+	/* This is an example function that follows an extremely simple logic
+	 * and is very likely to fail to allocate memory if put under stress.
+	 *
+	 * The simple_pma_device maintains an array of u64s, with one bit used
+	 * to track the status of each page.
+	 *
+	 * In order to create a memory allocation, the allocator looks for an
+	 * adjacent group of cleared bits. This does leave the algorithm open
+	 * to fragmentation issues, but is deemed sufficient for now.
+	 * If successful, the allocator shall mark all the pages as allocated
+	 * and increment the offset accordingly.
+	 *
+	 * Allocations of 64 pages or more (order 6) can be allocated only with
+	 * 64-page alignment, in order to keep the algorithm as simple as
+	 * possible. ie, starting from bit 0 of any 64-bit page-allocation
+	 * bitfield. For this, the large-granularity allocator is utilised.
+	 *
+	 * Allocations of lower-order can only be allocated entirely within the
+	 * same group of 64 pages, with the small-ganularity allocator  (ie
+	 * always from the same 64-bit page-allocation bitfield) - again, to
+	 * keep things as simple as possible, but flexible to meet
+	 * current needs.
+	 */
+
+	num_pages_to_alloc = (size_t)1 << order;
+
+	pma = devm_kzalloc(epma_dev->dev, sizeof(*pma), GFP_KERNEL);
+	if (!pma) {
+		dev_err(epma_dev->dev, "Failed to alloc pma struct");
+		return NULL;
+	}
+
+	spin_lock(&epma_dev->rmem_lock);
+
+	if (epma_dev->num_free_pages < num_pages_to_alloc) {
+		dev_err(epma_dev->dev, "not enough free pages\n");
+		devm_kfree(epma_dev->dev, pma);
+		spin_unlock(&epma_dev->rmem_lock);
+		return NULL;
+	}
+
+	/*
+	 * For order 0-5 (ie, 1 to 32 pages) we always allocate within the same set of 64 pages
+	 * Currently, most allocations will be very small (1 page), so the more likely path
+	 * here is order < ORDER_OF_PAGES_PER_BITFIELD_ELEM.
+	 */
+	if (likely(order < ORDER_OF_PAGES_PER_BITFIELD_ELEM)) {
+		size_t alloc_pages_bitmap_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
+
+		for (i = 0; i < alloc_pages_bitmap_size; i++) {
+			count = 0;
+
+			for (bit = 0; bit < PAGES_PER_BITFIELD_ELEM; bit++) {
+				if  (0 == (bitfields[i] & (1ULL << bit))) {
+					if ((count + 1) >= num_pages_to_alloc) {
+						/*
+						 * We've found enough free, consecutive pages with which to
+						 * make an allocation
+						 */
+						small_granularity_alloc(
+							epma_dev, i,
+							bit - count, order,
+							pma);
+
+						epma_dev->num_free_pages -=
+							num_pages_to_alloc;
+
+						spin_unlock(
+							&epma_dev->rmem_lock);
+						return pma;
+					}
+
+					/* So far so good, but we need more set bits yet */
+					count++;
+				} else {
+					/*
+					 * We found an allocated page, so nothing we've seen so far can be used.
+					 * Keep looking.
+					 */
+					count = 0;
+				}
+			}
+		}
+	} else {
+		/**
+		 * For allocations of order ORDER_OF_PAGES_PER_BITFIELD_ELEM and above (>= 64 pages), we know
+		 * we'll only get allocations for whole groups of 64 pages, which hugely simplifies the task.
+		 */
+		size_t alloc_pages_bitmap_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
+
+		/* How many 64-bit bitfield elements will be needed for the allocation? */
+		size_t num_bitfield_elements_needed = num_pages_to_alloc / PAGES_PER_BITFIELD_ELEM;
+
+		count = 0;
+
+		for (i = 0; i < alloc_pages_bitmap_size; i++) {
+			/* Are all the pages free for the i'th u64 bitfield element? */
+			if (bitfields[i] == 0) {
+				count += PAGES_PER_BITFIELD_ELEM;
+
+				if (count >= (1 << order)) {
+					size_t start_idx = (i + 1) - num_bitfield_elements_needed;
+
+					large_granularity_alloc(epma_dev,
+								start_idx,
+								order, pma);
+
+					epma_dev->num_free_pages -= 1 << order;
+					spin_unlock(&epma_dev->rmem_lock);
+					return pma;
+				}
+			}
+			else
+			{
+				count = 0;
+			}
+		}
+	}
+
+	spin_unlock(&epma_dev->rmem_lock);
+	devm_kfree(epma_dev->dev, pma);
+
+	dev_err(epma_dev->dev, "not enough contiguous pages (need %zu), total free pages left %zu\n",
+		num_pages_to_alloc, epma_dev->num_free_pages);
+	return NULL;
+}
+
+static phys_addr_t simple_pma_get_phys_addr(
+	struct protected_memory_allocator_device *pma_dev,
+	struct protected_memory_allocation *pma)
+{
+	struct simple_pma_device *const epma_dev =
+		container_of(pma_dev, struct simple_pma_device, pma_dev);
+
+	dev_dbg(epma_dev->dev, "%s(pma_dev=%px, pma=%px, pa=%llx\n",
+		__func__, (void *)pma_dev, (void *)pma,
+		(unsigned long long)pma->pa);
+
+	return pma->pa;
+}
+
+static void simple_pma_free_page(
+	struct protected_memory_allocator_device *pma_dev,
+	struct protected_memory_allocation *pma)
+{
+	struct simple_pma_device *const epma_dev =
+		container_of(pma_dev, struct simple_pma_device, pma_dev);
+	size_t num_pages_in_allocation;
+	size_t offset;
+	size_t i;
+	size_t bitfield_idx;
+	size_t bitfield_start_bit;
+	size_t page_num;
+	u64 *bitfield;
+	size_t alloc_pages_bitmap_size;
+	size_t num_bitfield_elems_used_by_alloc;
+
+	WARN_ON(pma == NULL);
+
+	dev_dbg(epma_dev->dev, "%s(pma_dev=%px, pma=%px, pa=%llx\n",
+		__func__, (void *)pma_dev, (void *)pma,
+		(unsigned long long)pma->pa);
+
+	WARN_ON(pma->pa < epma_dev->rmem_base);
+
+	/* This is an example function that follows an extremely simple logic
+	 * and is vulnerable to abuse.
+	 */
+	offset = (pma->pa - epma_dev->rmem_base);
+	num_pages_in_allocation = (size_t)1 << pma->order;
+
+	/* The number of bitfield elements used by the allocation */
+	num_bitfield_elems_used_by_alloc = num_pages_in_allocation / PAGES_PER_BITFIELD_ELEM;
+
+	/* The page number of the first page of the allocation, relative to rmem_base */
+	page_num = offset >> PAGE_SHIFT;
+
+	/* Which u64 bitfield refers to this page? */
+	bitfield_idx = page_num / PAGES_PER_BITFIELD_ELEM;
+
+	alloc_pages_bitmap_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
+
+	/* Is the allocation within expected bounds? */
+	WARN_ON((bitfield_idx + num_bitfield_elems_used_by_alloc) >= alloc_pages_bitmap_size);
+
+	spin_lock(&epma_dev->rmem_lock);
+
+	if (pma->order < ORDER_OF_PAGES_PER_BITFIELD_ELEM) {
+		bitfield = &epma_dev->allocated_pages_bitfield_arr[bitfield_idx];
+
+		/* Which bit within that u64 bitfield is the lsb covering this allocation?  */
+		bitfield_start_bit = page_num % PAGES_PER_BITFIELD_ELEM;
+
+		/* Clear the bits for the pages we're now freeing */
+		*bitfield &= ~(((1ULL << num_pages_in_allocation) - 1) << bitfield_start_bit);
+	}
+	else {
+		WARN(page_num % PAGES_PER_BITFIELD_ELEM,
+		     "%s: Expecting allocs of order >= %d to be %zu-page aligned\n",
+		     __FUNCTION__, ORDER_OF_PAGES_PER_BITFIELD_ELEM, PAGES_PER_BITFIELD_ELEM);
+
+		for (i = 0; i < num_bitfield_elems_used_by_alloc; i++) {
+			bitfield = &epma_dev->allocated_pages_bitfield_arr[bitfield_idx + i];
+
+			/* We expect all bits to be set (all pages allocated) */
+			WARN((*bitfield != ~0),
+			     "%s: alloc being freed is not fully allocated: of=%zu np=%zu bf=%.16llx\n",
+			     __FUNCTION__, offset, num_pages_in_allocation, *bitfield);
+
+			/*
+			 * Now clear all the bits in the bitfield element to mark all the pages
+			 * it refers to as free.
+			 */
+			*bitfield = 0ULL;
+		}
+	}
+
+	epma_dev->num_free_pages += num_pages_in_allocation;
+	spin_unlock(&epma_dev->rmem_lock);
+	devm_kfree(epma_dev->dev, pma);
+}
+
+static int protected_memory_allocator_probe(struct platform_device *pdev)
+{
+	struct simple_pma_device *epma_dev;
+	struct device_node *np;
+	phys_addr_t rmem_base;
+	size_t rmem_size;
+	size_t alloc_bitmap_pages_arr_size;
+#if (KERNEL_VERSION(4, 15, 0) <= LINUX_VERSION_CODE)
+	struct reserved_mem *rmem;
+#endif
+
+	np = pdev->dev.of_node;
+
+	if (!np) {
+		dev_err(&pdev->dev, "device node pointer not set\n");
+		return -ENODEV;
+	}
+
+	np = of_parse_phandle(np, "memory-region", 0);
+	if (!np) {
+		dev_err(&pdev->dev, "memory-region node not set\n");
+		return -ENODEV;
+	}
+
+#if (KERNEL_VERSION(4, 15, 0) <= LINUX_VERSION_CODE)
+	rmem = of_reserved_mem_lookup(np);
+	if (rmem) {
+		rmem_base = rmem->base;
+		rmem_size = rmem->size >> PAGE_SHIFT;
+	} else
+#endif
+	{
+		of_node_put(np);
+		dev_err(&pdev->dev, "could not read reserved memory-region\n");
+		return -ENODEV;
+	}
+
+	of_node_put(np);
+	epma_dev = devm_kzalloc(&pdev->dev, sizeof(*epma_dev), GFP_KERNEL);
+	if (!epma_dev)
+		return -ENOMEM;
+
+	epma_dev->pma_dev.ops.pma_alloc_page = simple_pma_alloc_page;
+	epma_dev->pma_dev.ops.pma_get_phys_addr = simple_pma_get_phys_addr;
+	epma_dev->pma_dev.ops.pma_free_page = simple_pma_free_page;
+	epma_dev->pma_dev.owner = THIS_MODULE;
+	epma_dev->dev = &pdev->dev;
+	epma_dev->rmem_base = rmem_base;
+	epma_dev->rmem_size = rmem_size;
+	epma_dev->num_free_pages = rmem_size;
+	spin_lock_init(&epma_dev->rmem_lock);
+
+	alloc_bitmap_pages_arr_size = ALLOC_PAGES_BITFIELD_ARR_SIZE(epma_dev->rmem_size);
+
+	epma_dev->allocated_pages_bitfield_arr = devm_kzalloc(&pdev->dev,
+		alloc_bitmap_pages_arr_size * BITFIELD_ELEM_SIZE, GFP_KERNEL);
+
+	if (!epma_dev->allocated_pages_bitfield_arr) {
+		dev_err(&pdev->dev, "failed to allocate resources\n");
+		devm_kfree(&pdev->dev, epma_dev);
+		return -ENOMEM;
+	}
+
+	if (epma_dev->rmem_size % PAGES_PER_BITFIELD_ELEM) {
+		size_t extra_pages =
+			alloc_bitmap_pages_arr_size * PAGES_PER_BITFIELD_ELEM -
+			epma_dev->rmem_size;
+		size_t last_bitfield_index = alloc_bitmap_pages_arr_size - 1;
+
+		/* Mark the extra pages (that lie outside the reserved range) as
+		 * always in use.
+		 */
+		epma_dev->allocated_pages_bitfield_arr[last_bitfield_index] =
+			((1ULL << extra_pages) - 1) <<
+			(PAGES_PER_BITFIELD_ELEM - extra_pages);
+	}
+
+	platform_set_drvdata(pdev, &epma_dev->pma_dev);
+	dev_info(&pdev->dev,
+		"Protected memory allocator probed successfully\n");
+	dev_info(&pdev->dev, "Protected memory region: base=%llx num pages=%zu\n",
+		(unsigned long long)rmem_base, rmem_size);
+
+	return 0;
+}
+
+static int protected_memory_allocator_remove(struct platform_device *pdev)
+{
+	struct protected_memory_allocator_device *pma_dev =
+		platform_get_drvdata(pdev);
+	struct simple_pma_device *epma_dev;
+	struct device *dev;
+
+	if (!pma_dev)
+		return -EINVAL;
+
+	epma_dev = container_of(pma_dev, struct simple_pma_device, pma_dev);
+	dev = epma_dev->dev;
+
+	if (epma_dev->num_free_pages < epma_dev->rmem_size) {
+		dev_warn(&pdev->dev, "Leaking %zu pages of protected memory\n",
+			epma_dev->rmem_size - epma_dev->num_free_pages);
+	}
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(dev, epma_dev->allocated_pages_bitfield_arr);
+	devm_kfree(dev, epma_dev);
+
+	dev_info(&pdev->dev,
+		"Protected memory allocator removed successfully\n");
+
+	return 0;
+}
+
+static const struct of_device_id protected_memory_allocator_dt_ids[] = {
+	{ .compatible = "arm,protected-memory-allocator" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, protected_memory_allocator_dt_ids);
+
+static struct platform_driver protected_memory_allocator_driver = {
+	.probe = protected_memory_allocator_probe,
+	.remove = protected_memory_allocator_remove,
+	.driver = {
+		.name = "simple_protected_memory_allocator",
+		.of_match_table = of_match_ptr(protected_memory_allocator_dt_ids),
+	}
+};
+
+module_platform_driver(protected_memory_allocator_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("ARM Ltd.");
+MODULE_VERSION("1.0");
--- a/drivers/base/memory_group_manager/build.bp
+++ b/drivers/base/memory_group_manager/build.bp
@@ -1,22 +0,0 @@
-/*
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
- *
- * A copy of the licence is included with the program, and can also be obtained
- * from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
- * Boston, MA  02110-1301, USA.
- */
-
-bob_kernel_module {
-    name: "memory_group_manager",
-    srcs: [
-        "Kbuild",
-        "memory_group_manager.c",
-    ],
-    kbuild_options: ["CONFIG_MALI_MEMORY_GROUP_MANAGER=m"],
-    defaults: ["kernel_defaults"],
-}
--- a/drivers/gpu/arm/Kbuild
+++ b/drivers/gpu/arm/Kbuild
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2012 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2012, 2020 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

 obj-$(CONFIG_MALI_MIDGARD) += midgard/
--- a/drivers/gpu/arm/Kconfig
+++ b/drivers/gpu/arm/Kconfig
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
 # (C) COPYRIGHT 2012 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #
 #
 source "drivers/gpu/arm/mali400/mali/Kconfig"
--- a/drivers/gpu/arm/bifrost/Kbuild
+++ b/drivers/gpu/arm/bifrost/Kbuild
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2012-2020 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2012-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,16 +16,14 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

 # Driver version string which is returned to userspace via an ioctl
-MALI_RELEASE_NAME ?= "g2p0-01eac0"
+MALI_RELEASE_NAME ?= '"g6p0-01eac0"'

 # Paths required for build

-# make $(src) as absolute path if it isn't already, by prefixing $(srctree)
+# make $(src) as absolute path if it is not already, by prefixing $(srctree)
 src:=$(if $(patsubst /%,,$(src)),$(srctree)/$(src),$(src))
 KBASE_PATH = $(src)
 KBASE_PLATFORM_PATH = $(KBASE_PATH)/platform_dummy
@@ -32,11 +31,8 @@ UMP_PATH = $(src)/../../../base

 # Set up defaults if not defined by build system
 MALI_CUSTOMER_RELEASE ?= 1
-MALI_USE_CSF ?= 0
 MALI_UNIT_TEST ?= 0
-MALI_KERNEL_TEST_API ?= 0
 MALI_COVERAGE ?= 0
-MALI_JIT_PRESSURE_LIMIT_BASE ?= 1
 CONFIG_MALI_PLATFORM_NAME ?= "devicetree"
 # Experimental features (corresponding -D definition should be appended to
 # DEFINES below, e.g. for MALI_EXPERIMENTAL_FEATURE,
@@ -46,6 +42,20 @@ CONFIG_MALI_PLATFORM_NAME ?= "devicetree"
 # MALI_EXPERIMENTAL_FEATURE ?= 0
 MALI_INCREMENTAL_RENDERING ?= 0

+ifeq ($(CONFIG_MALI_CSF_SUPPORT),y)
+MALI_JIT_PRESSURE_LIMIT_BASE = 0
+MALI_USE_CSF = 1
+else
+MALI_JIT_PRESSURE_LIMIT_BASE ?= 1
+MALI_USE_CSF ?= 0
+endif
+
+ifneq ($(CONFIG_MALI_KUTF), n)
+MALI_KERNEL_TEST_API ?= 1
+else
+MALI_KERNEL_TEST_API ?= 0
+endif
+
 # Set up our defines, which will be passed to gcc
 DEFINES = \
 	-DMALI_CUSTOMER_RELEASE=$(MALI_CUSTOMER_RELEASE) \
@@ -53,7 +63,7 @@ DEFINES = \
 	-DMALI_KERNEL_TEST_API=$(MALI_KERNEL_TEST_API) \
 	-DMALI_UNIT_TEST=$(MALI_UNIT_TEST) \
 	-DMALI_COVERAGE=$(MALI_COVERAGE) \
-	-DMALI_RELEASE_NAME=\"$(MALI_RELEASE_NAME)\" \
+	-DMALI_RELEASE_NAME=$(MALI_RELEASE_NAME) \
 	-DMALI_JIT_PRESSURE_LIMIT_BASE=$(MALI_JIT_PRESSURE_LIMIT_BASE) \
 	-DMALI_INCREMENTAL_RENDERING=$(MALI_INCREMENTAL_RENDERING)

@@ -90,7 +100,6 @@ SRC := \
 	mali_kbase_config.c \
 	mali_kbase_vinstr.c \
 	mali_kbase_hwcnt.c \
-	mali_kbase_hwcnt_backend_jm.c \
 	mali_kbase_hwcnt_gpu.c \
 	mali_kbase_hwcnt_legacy.c \
 	mali_kbase_hwcnt_types.c \
@@ -104,7 +113,6 @@ SRC := \
 	mali_kbase_mem_profile_debugfs.c \
 	mmu/mali_kbase_mmu.c \
 	mmu/mali_kbase_mmu_hw_direct.c \
-	mmu/mali_kbase_mmu_mode_lpae.c \
 	mmu/mali_kbase_mmu_mode_aarch64.c \
 	mali_kbase_disjoint_events.c \
 	mali_kbase_debug_mem_view.c \
@@ -115,6 +123,7 @@ SRC := \
 	mali_kbase_strings.c \
 	mali_kbase_as_fault_debugfs.c \
 	mali_kbase_regs_history_debugfs.c \
+	mali_kbase_dvfs_debugfs.c \
 	mali_power_gpu_frequency_trace.c \
 	mali_kbase_trace_gpu_mem.c \
 	thirdparty/mali_kbase_mmap.c \
@@ -126,6 +135,8 @@ SRC := \

 ifeq ($(MALI_USE_CSF),1)
 	SRC += \
+		mali_kbase_hwcnt_backend_csf.c \
+		mali_kbase_hwcnt_backend_csf_if_fw.c \
 		debug/backend/mali_kbase_debug_ktrace_csf.c \
 		device/backend/mali_kbase_device_csf.c \
 		device/backend/mali_kbase_device_hw_csf.c \
@@ -135,6 +146,7 @@ ifeq ($(MALI_USE_CSF),1)
 		context/backend/mali_kbase_context_csf.c
 else
 	SRC += \
+		mali_kbase_hwcnt_backend_jm.c \
 		mali_kbase_dummy_job_wa.c \
 		mali_kbase_debug_job_fault.c \
 		mali_kbase_event.c \
@@ -156,9 +168,6 @@ ifeq ($(CONFIG_MALI_CINSTR_GWT),y)
 	SRC += mali_kbase_gwt.c
 endif

-ifeq ($(MALI_UNIT_TEST),1)
-	SRC += tl/mali_kbase_timeline_test.c
-endif

 ifeq ($(MALI_CUSTOMER_RELEASE),0)
 	SRC += mali_kbase_regs_dump_debugfs.c
--- a/drivers/gpu/arm/bifrost/Kconfig
+++ b/drivers/gpu/arm/bifrost/Kconfig
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2012-2020 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2012-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,8 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #


@@ -31,6 +30,13 @@ menuconfig MALI_BIFROST
 	  To compile this driver as a module, choose M here:
 	  this will generate a single module, called mali_kbase.

+config MALI_CSF_SUPPORT
+	bool "Mali CSF based GPU support"
+	depends on MALI_BIFROST=m
+	default n
+	help
+	  Enables support for CSF based GPUs.
+
 config MALI_BIFROST_GATOR_SUPPORT
 	bool "Enable Streamline tracing support"
 	depends on MALI_BIFROST
@@ -277,10 +283,20 @@ config MALI_JOB_DUMP
 	  minimal overhead when not in use. Enable only if you know what
 	  you are doing.

-config MALI_BIFROST_PRFCNT_SET_SECONDARY
-	bool "Use secondary set of performance counters"
+choice
+	prompt "Performance counters set"
+	default MALI_PRFCNT_SET_PRIMARY
+	depends on MALI_BIFROST && MALI_BIFROST_EXPERT
+
+config MALI_PRFCNT_SET_PRIMARY
+	bool "Primary"
+	depends on MALI_BIFROST && MALI_BIFROST_EXPERT
+	help
+	  Select this option to use primary set of performance counters.
+
+config MALI_BIFROST_PRFCNT_SET_SECONDARY
+	bool "Secondary"
 	depends on MALI_BIFROST && MALI_BIFROST_EXPERT
-	default n
 	help
 	  Select this option to use secondary set of performance counters. Kernel
 	  features that depend on an access to the primary set of counters may
@@ -288,21 +304,43 @@ config MALI_BIFROST_PRFCNT_SET_SECONDARY
 	  from working optimally and may cause instrumentation tools to return
 	  bogus results.

-	  If unsure, say N.
+	  If unsure, use MALI_PRFCNT_SET_PRIMARY.

-config MALI_PRFCNT_SET_SECONDARY_VIA_DEBUG_FS
-	bool "Use secondary set of performance counters"
-	depends on MALI_BIFROST && MALI_BIFROST_EXPERT && !MALI_BIFROST_PRFCNT_SET_SECONDARY && DEBUG_FS
+config MALI_PRFCNT_SET_TERTIARY
+	bool "Tertiary"
+	depends on MALI_BIFROST && MALI_BIFROST_EXPERT
+	help
+	  Select this option to use tertiary set of performance counters. Kernel
+	  features that depend on an access to the primary set of counters may
+	  become unavailable. Enabling this option will prevent power management
+	  from working optimally and may cause instrumentation tools to return
+	  bogus results.
+
+	  If unsure, use MALI_PRFCNT_SET_PRIMARY.
+
+endchoice
+
+config MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
+	bool "Allow runtime selection of performance counters set via debugfs"
+	depends on MALI_BIFROST && MALI_BIFROST_EXPERT && DEBUG_FS
 	default n
 	help
 	  Select this option to make the secondary set of performance counters
 	  available at runtime via debugfs. Kernel features that depend on an
 	  access to the primary set of counters may become unavailable.

+	  If no runtime debugfs option is set, the build time counter set
+	  choice will be used.
+
 	  This feature is unsupported and unstable, and may break at any time.
 	  Enabling this option will prevent power management from working
 	  optimally and may cause instrumentation tools to return bogus results.

+	  No validation is done on the debugfs input. Invalid input could cause
+	  performance counter errors. Valid inputs are the values accepted by
+	  the SET_SELECT bits of the PRFCNT_CONFIG register as defined in the
+	  architecture specification.
+
 	  If unsure, say N.

 source "drivers/gpu/arm/midgard/platform/Kconfig"
--- a/drivers/gpu/arm/bifrost/Makefile
+++ b/drivers/gpu/arm/bifrost/Makefile
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2010-2019 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2010-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,24 +16,49 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

+# Handle Android Common Kernel source naming
+KERNEL_SRC ?= /lib/modules/$(shell uname -r)/build
+KDIR ?= $(KERNEL_SRC)

-KDIR ?= /lib/modules/$(shell uname -r)/build
+# out-of-tree
+ifeq ($(KBUILD_EXTMOD),)
+export CONFIG_MALI_MIDGARD?=m
+
+ifneq ($(CONFIG_MALI_MIDGARD),n)
+export CONFIG_MALI_CSF_SUPPORT?=n
+export CONFIG_MALI_KUTF?=m
+export CONFIG_MALI_REAL_HW?=y
+
+# Handle default y/m in Kconfig
+export CONFIG_MALI_BIFROST_GATOR_SUPPORT?=y
+export CONFIG_MALI_BIFROST_DEVFREQ?=n
+ifneq ($(CONFIG_PM_DEVFREQ),n)
+export CONFIG_MALI_BIFROST_DEVFREQ?=y
+endif
+
+DEFINES += -DCONFIG_MALI_MIDGARD=$(CONFIG_MALI_MIDGARD) \
+	-DCONFIG_MALI_CSF_SUPPORT=$(CONFIG_MALI_CSF_SUPPORT) \
+	-DCONFIG_MALI_KUTF=$(CONFIG_MALI_KUTF) \
+	-DCONFIG_MALI_REAL_HW=$(CONFIG_MALI_REAL_HW) \
+	-DCONFIG_MALI_GATOR_SUPPORT=$(CONFIG_MALI_BIFROST_GATOR_SUPPORT) \
+	-DCONFIG_MALI_DEVFREQ=$(CONFIG_MALI_BIFROST_DEVFREQ)
+
+export DEFINES
+
+endif
+endif

-BUSLOG_PATH_RELATIVE = $(CURDIR)/../../../..
 KBASE_PATH_RELATIVE = $(CURDIR)

-ifeq ($(CONFIG_MALI_BUSLOG),y)
-#Add bus logger symbols
-EXTRA_SYMBOLS += $(BUSLOG_PATH_RELATIVE)/drivers/base/bus_logger/Module.symvers
-endif

 # we get the symbols from modules using KBUILD_EXTRA_SYMBOLS to prevent warnings about unknown functions
 all:
 	$(MAKE) -C $(KDIR) M=$(CURDIR) EXTRA_CFLAGS="-I$(CURDIR)/../../../../include -I$(CURDIR)/../../../../tests/include $(SCONS_CFLAGS)" $(SCONS_CONFIGS) KBUILD_EXTRA_SYMBOLS="$(EXTRA_SYMBOLS)" modules

+modules_install:
+	$(MAKE) -C $(KDIR) M=$(CURDIR) modules_install
+
 clean:
 	$(MAKE) -C $(KDIR) M=$(CURDIR) clean
--- a/drivers/gpu/arm/bifrost/Makefile.kbase
+++ b/drivers/gpu/arm/bifrost/Makefile.kbase
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2010, 2013, 2018 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2010, 2013, 2018-2020 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,9 +16,6 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

 EXTRA_CFLAGS += -I$(ROOT) -I$(KBASE_PATH) -I$(KBASE_PATH)/platform_$(PLATFORM)
-
--- a/drivers/gpu/arm/bifrost/Mconfig
+++ b/drivers/gpu/arm/bifrost/Mconfig
@@ -1,18 +1,23 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2012-2020 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2012-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
-# A copy of the licence is included with the program, and can also be obtained
-# from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
-# Boston, MA  02110-1301, USA.
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
 #
 #

-
 menuconfig MALI_BIFROST
 	bool "Mali Midgard series support"
 	default y
@@ -22,6 +27,13 @@ menuconfig MALI_BIFROST
 	  To compile this driver as a module, choose M here:
 	  this will generate a single module, called mali_kbase.

+config MALI_CSF_SUPPORT
+	bool "Mali CSF based GPU support"
+	depends on MALI_BIFROST
+	default n
+	help
+	  Enables support for CSF based GPUs.
+
 config MALI_BIFROST_GATOR_SUPPORT
 	bool "Enable Streamline tracing support"
 	depends on MALI_BIFROST && !BACKEND_USER
@@ -272,6 +284,9 @@ config MALI_GEM5_BUILD
 # Instrumentation options.

 # config MALI_JOB_DUMP exists in the Kernel Kconfig but is configured using CINSTR_JOB_DUMP in Mconfig.
+# config MALI_PRFCNT_SET_PRIMARY exists in the Kernel Kconfig but is configured using CINSTR_PRIMARY_HWC in Mconfig.
 # config MALI_BIFROST_PRFCNT_SET_SECONDARY exists in the Kernel Kconfig but is configured using CINSTR_SECONDARY_HWC in Mconfig.
+# config MALI_PRFCNT_SET_TERTIARY exists in the Kernel Kconfig but is configured using CINSTR_TERTIARY_HWC in Mconfig.
+# config MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS exists in the Kernel Kconfig but is configured using CINSTR_HWC_SET_SELECT_VIA_DEBUG_FS in Mconfig.

 source "kernel/drivers/gpu/arm/midgard/tests/Mconfig"
--- a/drivers/gpu/arm/bifrost/arbiter/Kbuild
+++ b/drivers/gpu/arm/bifrost/arbiter/Kbuild
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
 # (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,10 +16,8 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

-mali_kbase-y += \
+bifrost_kbase-y += \
 	arbiter/mali_kbase_arbif.o \
 	arbiter/mali_kbase_arbiter_pm.o
--- a/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbif.c
+++ b/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbif.c
@@ -1,13 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
-
 /*
 *
- * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -18,13 +17,10 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /**
- * @file mali_kbase_arbif.c
- * Mali arbiter interface APIs to share GPU between Virtual Machines
+ * DOC: Mali arbiter interface APIs to share GPU between Virtual Machines
 */

 #include <mali_kbase.h>
@@ -34,29 +30,148 @@
 #include <linux/of_platform.h>
 #include "mali_kbase_arbiter_interface.h"

+/* Arbiter interface version against which was implemented this module */
+#define MALI_REQUIRED_KBASE_ARBITER_INTERFACE_VERSION 5
+#if MALI_REQUIRED_KBASE_ARBITER_INTERFACE_VERSION != \
+			MALI_KBASE_ARBITER_INTERFACE_VERSION
+#error "Unsupported Mali Arbiter interface version."
+#endif
+
+static void on_max_config(struct device *dev, uint32_t max_l2_slices,
+			  uint32_t max_core_mask)
+{
+	struct kbase_device *kbdev;
+
+	if (!dev) {
+		pr_err("%s(): dev is NULL", __func__);
+		return;
+	}
+
+	kbdev = dev_get_drvdata(dev);
+	if (!kbdev) {
+		dev_err(dev, "%s(): kbdev is NULL", __func__);
+		return;
+	}
+
+	if (!max_l2_slices || !max_core_mask) {
+		dev_dbg(dev,
+			"%s(): max_config ignored as one of the fields is zero",
+			__func__);
+		return;
+	}
+
+	/* set the max config info in the kbase device */
+	kbase_arbiter_set_max_config(kbdev, max_l2_slices, max_core_mask);
+}
+
+/**
+ * on_update_freq() - Updates GPU clock frequency
+ * @dev: arbiter interface device handle
+ * @freq: GPU clock frequency value reported from arbiter
+ *
+ * call back function to update GPU clock frequency with
+ * new value from arbiter
+ */
+static void on_update_freq(struct device *dev, uint32_t freq)
+{
+	struct kbase_device *kbdev;
+
+	if (!dev) {
+		pr_err("%s(): dev is NULL", __func__);
+		return;
+	}
+
+	kbdev = dev_get_drvdata(dev);
+	if (!kbdev) {
+		dev_err(dev, "%s(): kbdev is NULL", __func__);
+		return;
+	}
+
+	kbase_arbiter_pm_update_gpu_freq(&kbdev->arb.arb_freq, freq);
+}
+
+/**
+ * on_gpu_stop() - sends KBASE_VM_GPU_STOP_EVT event on VM stop
+ * @dev: arbiter interface device handle
+ *
+ * call back function to signal a GPU STOP event from arbiter interface
+ */
 static void on_gpu_stop(struct device *dev)
 {
-	struct kbase_device *kbdev = dev_get_drvdata(dev);
+	struct kbase_device *kbdev;
+
+	if (!dev) {
+		pr_err("%s(): dev is NULL", __func__);
+		return;
+	}
+
+	kbdev = dev_get_drvdata(dev);
+	if (!kbdev) {
+		dev_err(dev, "%s(): kbdev is NULL", __func__);
+		return;
+	}

 	KBASE_TLSTREAM_TL_ARBITER_STOP_REQUESTED(kbdev, kbdev);
 	kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_STOP_EVT);
 }

+/**
+ * on_gpu_granted() - sends KBASE_VM_GPU_GRANTED_EVT event on GPU granted
+ * @dev: arbiter interface device handle
+ *
+ * call back function to signal a GPU GRANT event from arbiter interface
+ */
 static void on_gpu_granted(struct device *dev)
 {
-	struct kbase_device *kbdev = dev_get_drvdata(dev);
+	struct kbase_device *kbdev;
+
+	if (!dev) {
+		pr_err("%s(): dev is NULL", __func__);
+		return;
+	}
+
+	kbdev = dev_get_drvdata(dev);
+	if (!kbdev) {
+		dev_err(dev, "%s(): kbdev is NULL", __func__);
+		return;
+	}

 	KBASE_TLSTREAM_TL_ARBITER_GRANTED(kbdev, kbdev);
 	kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_GRANTED_EVT);
 }

+/**
+ * on_gpu_lost() - sends KBASE_VM_GPU_LOST_EVT event  on GPU granted
+ * @dev: arbiter interface device handle
+ *
+ * call back function to signal a GPU LOST event from arbiter interface
+ */
 static void on_gpu_lost(struct device *dev)
 {
-	struct kbase_device *kbdev = dev_get_drvdata(dev);
+	struct kbase_device *kbdev;
+
+	if (!dev) {
+		pr_err("%s(): dev is NULL", __func__);
+		return;
+	}
+
+	kbdev = dev_get_drvdata(dev);
+	if (!kbdev) {
+		dev_err(dev, "%s(): kbdev is NULL", __func__);
+		return;
+	}

 	kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_LOST_EVT);
 }

+/**
+ * kbase_arbif_init() - Kbase Arbiter interface initialisation.
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Initialise Kbase Arbiter interface and assign callback functions.
+ *
+ * Return: 0 on success else a Linux error code
+ */
 int kbase_arbif_init(struct kbase_device *kbdev)
 {
 #ifdef CONFIG_OF
@@ -100,6 +215,12 @@ int kbase_arbif_init(struct kbase_device *kbdev)
 	ops.arb_vm_gpu_stop = on_gpu_stop;
 	ops.arb_vm_gpu_granted = on_gpu_granted;
 	ops.arb_vm_gpu_lost = on_gpu_lost;
+	ops.arb_vm_max_config = on_max_config;
+	ops.arb_vm_update_freq = on_update_freq;
+
+
+	kbdev->arb.arb_freq.arb_freq = 0;
+	mutex_init(&kbdev->arb.arb_freq.arb_freq_lock);

 	/* register kbase arbiter_if callbacks */
 	if (arb_if->vm_ops.vm_arb_register_dev) {
@@ -111,6 +232,7 @@ int kbase_arbif_init(struct kbase_device *kbdev)
 			return err;
 		}
 	}
+
 #else /* CONFIG_OF */
 	dev_dbg(kbdev->dev, "No arbiter without Device Tree support\n");
 	kbdev->arb.arb_dev = NULL;
@@ -119,6 +241,12 @@ int kbase_arbif_init(struct kbase_device *kbdev)
 	return 0;
 }

+/**
+ * kbase_arbif_destroy() - De-init Kbase arbiter interface
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * De-initialise Kbase arbiter interface
+ */
 void kbase_arbif_destroy(struct kbase_device *kbdev)
 {
 	struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
@@ -133,16 +261,45 @@ void kbase_arbif_destroy(struct kbase_device *kbdev)
 	kbdev->arb.arb_dev = NULL;
 }

+/**
+ * kbase_arbif_get_max_config() - Request max config info
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * call back function from arb interface to arbiter requesting max config info
+ */
+void kbase_arbif_get_max_config(struct kbase_device *kbdev)
+{
+	struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
+
+	if (arb_if && arb_if->vm_ops.vm_arb_get_max_config) {
+		dev_dbg(kbdev->dev, "%s\n", __func__);
+		arb_if->vm_ops.vm_arb_get_max_config(arb_if);
+	}
+}
+
+/**
+ * kbase_arbif_gpu_request() - Request GPU from
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * call back function from arb interface to arbiter requesting GPU for VM
+ */
 void kbase_arbif_gpu_request(struct kbase_device *kbdev)
 {
 	struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;

 	if (arb_if && arb_if->vm_ops.vm_arb_gpu_request) {
 		dev_dbg(kbdev->dev, "%s\n", __func__);
+		KBASE_TLSTREAM_TL_ARBITER_REQUESTED(kbdev, kbdev);
 		arb_if->vm_ops.vm_arb_gpu_request(arb_if);
 	}
 }

+/**
+ * kbase_arbif_gpu_stopped() - send GPU stopped message to the arbiter
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ * @gpu_required: GPU request flag
+ *
+ */
 void kbase_arbif_gpu_stopped(struct kbase_device *kbdev, u8 gpu_required)
 {
 	struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
@@ -154,6 +311,12 @@ void kbase_arbif_gpu_stopped(struct kbase_device *kbdev, u8 gpu_required)
 	}
 }

+/**
+ * kbase_arbif_gpu_active() - Sends a GPU_ACTIVE message to the Arbiter
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Informs the arbiter VM is active
+ */
 void kbase_arbif_gpu_active(struct kbase_device *kbdev)
 {
 	struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
@@ -164,6 +327,12 @@ void kbase_arbif_gpu_active(struct kbase_device *kbdev)
 	}
 }

+/**
+ * kbase_arbif_gpu_idle() - Inform the arbiter that the VM has gone idle
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Informs the arbiter VM is idle
+ */
 void kbase_arbif_gpu_idle(struct kbase_device *kbdev)
 {
 	struct arbiter_if_dev *arb_if = kbdev->arb.arb_if;
--- a/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbif.h
+++ b/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbif.h
@@ -1,28 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, you can access it online at
- * http://www.gnu.org/licenses/gpl-2.0.html.
- *
- * SPDX-License-Identifier: GPL-2.0
- *
- *//* SPDX-License-Identifier: GPL-2.0 */
-
-/*
- *
- * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
@@ -38,12 +17,10 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- *
 */

 /**
- * @file
- * Mali arbiter interface APIs to share GPU between Virtual Machines
+ * DOC: Mali arbiter interface APIs to share GPU between Virtual Machines
 */

 #ifndef _MALI_KBASE_ARBIF_H_
@@ -94,6 +71,14 @@ int kbase_arbif_init(struct kbase_device *kbdev);
 */
 void kbase_arbif_destroy(struct kbase_device *kbdev);

+/**
+ * kbase_arbif_get_max_config() - Request max config info
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * call back function from arb interface to arbiter requesting max config info
+ */
+void kbase_arbif_get_max_config(struct kbase_device *kbdev);
+
 /**
 * kbase_arbif_gpu_request() - Send GPU request message to the arbiter
 * @kbdev: The kbase device structure for the device (must be a valid pointer)
--- a/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_defs.h
+++ b/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_defs.h
@@ -1,28 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, you can access it online at
- * http://www.gnu.org/licenses/gpl-2.0.html.
- *
- * SPDX-License-Identifier: GPL-2.0
- *
- *//* SPDX-License-Identifier: GPL-2.0 */
-
-/*
- *
- * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
@@ -38,7 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- *
 */

 /**
@@ -66,7 +44,8 @@
 * @vm_resume_work:  Work item for vm_arb_wq to resume current work on GPU
 * @vm_arb_starting: Work queue resume in progress
 * @vm_arb_stopping: Work queue suspend in progress
- * @vm_arb_users_waiting: Count of users waiting for GPU
+ * @interrupts_installed: Flag set when interrupts are installed
+ * @vm_request_timer: Timer to monitor GPU request
 */
 struct kbase_arbiter_vm_state {
 	struct kbase_device *kbdev;
@@ -78,7 +57,8 @@ struct kbase_arbiter_vm_state {
 	struct work_struct vm_resume_work;
 	bool vm_arb_starting;
 	bool vm_arb_stopping;
-	int vm_arb_users_waiting;
+	bool interrupts_installed;
+	struct hrtimer vm_request_timer;
 };

 /**
@@ -86,10 +66,12 @@ struct kbase_arbiter_vm_state {
 *                               allocated from the probe method of Mali driver
 * @arb_if:                 Pointer to the arbiter interface device
 * @arb_dev:                Pointer to the arbiter device
+ * @arb_freq:               GPU clock frequency retrieved from arbiter.
 */
 struct kbase_arbiter_device {
 	struct arbiter_if_dev *arb_if;
 	struct device *arb_dev;
+	struct kbase_arbiter_freq arb_freq;
 };

 #endif /* _MALI_KBASE_ARBITER_DEFS_H_ */
--- a/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_interface.h
+++ b/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_interface.h
@@ -1,28 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, you can access it online at
- * http://www.gnu.org/licenses/gpl-2.0.html.
- *
- * SPDX-License-Identifier: GPL-2.0
- *
- *//* SPDX-License-Identifier: GPL-2.0 */
-
-/*
- *
- * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
@@ -38,7 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- *
 */

 /**
@@ -50,7 +28,7 @@
 #define _MALI_KBASE_ARBITER_INTERFACE_H_

 /**
- * @brief Mali arbiter interface version
+ *  Mali arbiter interface version
 *
 * This specifies the current version of the configuration interface. Whenever
 * the arbiter interface changes, so that integration effort is required, the
@@ -61,8 +39,15 @@
 * 1 - Added the Mali arbiter configuration interface.
 * 2 - Strip out reference code from header
 * 3 - Removed DVFS utilization interface (DVFS moved to arbiter side)
+ * 4 - Added max_config support
+ * 5 - Added GPU clock frequency reporting support from arbiter
 */
-#define MALI_KBASE_ARBITER_INTERFACE_VERSION 3
+#define MALI_KBASE_ARBITER_INTERFACE_VERSION 5
+
+/**
+ * NO_FREQ is used in case platform doesn't support reporting frequency
+ */
+#define NO_FREQ 0

 struct arbiter_if_dev;

@@ -108,6 +93,27 @@ struct arbiter_if_arb_vm_ops {
 	 * If successful, will respond with a vm_arb_gpu_stopped message.
 	 */
 	void (*arb_vm_gpu_lost)(struct device *dev);
+
+	/**
+	 * arb_vm_max_config() - Send max config info to the VM
+	 * @dev: The arbif kernel module device.
+	 * @max_l2_slices: The maximum number of L2 slices.
+	 * @max_core_mask: The largest core mask.
+	 *
+	 * Informs KBase the maximum resources that can be allocated to the
+	 * partition in use.
+	 */
+	void (*arb_vm_max_config)(struct device *dev, uint32_t max_l2_slices,
+				  uint32_t max_core_mask);
+
+	/**
+	 * arb_vm_update_freq() - GPU clock frequency has been updated
+	 * @dev: The arbif kernel module device.
+	 * @freq: GPU clock frequency value reported from arbiter
+	 *
+	 * Informs KBase that the GPU clock frequency has been updated.
+	 */
+	void (*arb_vm_update_freq)(struct device *dev, uint32_t freq);
 };

 /**
@@ -136,6 +142,13 @@ struct arbiter_if_vm_arb_ops {
 	 */
 	void (*vm_arb_unregister_dev)(struct arbiter_if_dev *arbif_dev);

+	/**
+	 * vm_arb_gpu_get_max_config() - Request the max config from the
+	 * Arbiter.
+	 * @arbif_dev: The arbiter interface we want to issue the request.
+	 */
+	void (*vm_arb_get_max_config)(struct arbiter_if_dev *arbif_dev);
+
 	/**
 	 * vm_arb_gpu_request() - Ask the arbiter interface for GPU access.
 	 * @arbif_dev: The arbiter interface we want to issue the request.
--- a/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_pm.c
+++ b/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_pm.c
@@ -1,13 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
-
 /*
 *
- * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -18,12 +17,10 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /**
- * @file mali_kbase_arbiter_pm.c
+ * @file
 * Mali arbiter power manager state machine and APIs
 */

@@ -34,11 +31,34 @@
 #include <mali_kbase_hwcnt_context.h>
 #include <mali_kbase_pm_internal.h>
 #include <tl/mali_kbase_tracepoints.h>
+#include <mali_kbase_gpuprops.h>
+
+/* A dmesg warning will occur if the GPU is not granted
+ * after the following time (in milliseconds) has ellapsed.
+ */
+#define GPU_REQUEST_TIMEOUT 1000
+
+#define MAX_L2_SLICES_MASK		0xFF
+
+/* Maximum time in ms, before deferring probe incase
+ * GPU_GRANTED message is not received
+ */
+static int gpu_req_timeout = 1;
+module_param(gpu_req_timeout, int, 0644);
+MODULE_PARM_DESC(gpu_req_timeout,
+	"On a virtualized platform, if the GPU is not granted within this time(ms) kbase will defer the probe");

 static void kbase_arbiter_pm_vm_wait_gpu_assignment(struct kbase_device *kbdev);
 static inline bool kbase_arbiter_pm_vm_gpu_assigned_lockheld(
 	struct kbase_device *kbdev);

+/**
+ * kbase_arbiter_pm_vm_state_str() - Helper function to get string
+ *                                   for kbase VM state.(debug)
+ * @state: kbase VM state
+ *
+ * Return: string representation of Kbase_vm_state
+ */
 static inline const char *kbase_arbiter_pm_vm_state_str(
 	enum kbase_vm_state state)
 {
@@ -73,6 +93,13 @@ static inline const char *kbase_arbiter_pm_vm_state_str(
 	}
 }

+/**
+ * kbase_arbiter_pm_vm_event_str() - Helper function to get string
+ *                                   for kbase VM event.(debug)
+ * @evt: kbase VM state
+ *
+ * Return: String representation of Kbase_arbif_event
+ */
 static inline const char *kbase_arbiter_pm_vm_event_str(
 	enum kbase_arbif_evt evt)
 {
@@ -99,6 +126,13 @@ static inline const char *kbase_arbiter_pm_vm_event_str(
 	}
 }

+/**
+ * kbase_arbiter_pm_vm_set_state() - Sets new kbase_arbiter_vm_state
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ * @new_state: kbase VM new state
+ *
+ * This function sets the new state for the VM
+ */
 static void kbase_arbiter_pm_vm_set_state(struct kbase_device *kbdev,
 	enum kbase_vm_state new_state)
 {
@@ -107,11 +141,22 @@ static void kbase_arbiter_pm_vm_set_state(struct kbase_device *kbdev,
 	dev_dbg(kbdev->dev, "VM set_state %s -> %s",
 	kbase_arbiter_pm_vm_state_str(arb_vm_state->vm_state),
 	kbase_arbiter_pm_vm_state_str(new_state));
+
 	lockdep_assert_held(&arb_vm_state->vm_state_lock);
 	arb_vm_state->vm_state = new_state;
+	if (new_state != KBASE_VM_STATE_INITIALIZING_WITH_GPU &&
+		new_state != KBASE_VM_STATE_INITIALIZING)
+		KBASE_KTRACE_ADD(kbdev, ARB_VM_STATE, NULL, new_state);
 	wake_up(&arb_vm_state->vm_state_wait);
 }

+/**
+ * kbase_arbiter_pm_suspend_wq() - suspend work queue of the driver.
+ * @data: work queue
+ *
+ * Suspends work queue of the driver, when VM is in SUSPEND_PENDING or
+ * STOPPING_IDLE or STOPPING_ACTIVE state
+ */
 static void kbase_arbiter_pm_suspend_wq(struct work_struct *data)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = container_of(data,
@@ -136,6 +181,13 @@ static void kbase_arbiter_pm_suspend_wq(struct work_struct *data)
 	dev_dbg(kbdev->dev, "<%s\n", __func__);
 }

+/**
+ * kbase_arbiter_pm_resume_wq() -Kbase resume work queue.
+ * @data: work item
+ *
+ * Resume work queue of the driver when VM is in STARTING state,
+ * else if its in STOPPING_ACTIVE will request a stop event.
+ */
 static void kbase_arbiter_pm_resume_wq(struct work_struct *data)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = container_of(data,
@@ -157,9 +209,74 @@ static void kbase_arbiter_pm_resume_wq(struct work_struct *data)
 	}
 	arb_vm_state->vm_arb_starting = false;
 	mutex_unlock(&arb_vm_state->vm_state_lock);
+	KBASE_TLSTREAM_TL_ARBITER_STARTED(kbdev, kbdev);
 	dev_dbg(kbdev->dev, "<%s\n", __func__);
 }

+/**
+ * request_timer_callback() - Issue warning on request timer expiration
+ * @timer: Request hr timer data
+ *
+ * Called when the Arbiter takes too long to grant the GPU after a
+ * request has been made.  Issues a warning in dmesg.
+ *
+ * Return: Always returns HRTIMER_NORESTART
+ */
+static enum hrtimer_restart request_timer_callback(struct hrtimer *timer)
+{
+	struct kbase_arbiter_vm_state *arb_vm_state = container_of(timer,
+			struct kbase_arbiter_vm_state, vm_request_timer);
+
+	KBASE_DEBUG_ASSERT(arb_vm_state);
+	KBASE_DEBUG_ASSERT(arb_vm_state->kbdev);
+
+	dev_warn(arb_vm_state->kbdev->dev,
+		"Still waiting for GPU to be granted from Arbiter after %d ms\n",
+		GPU_REQUEST_TIMEOUT);
+	return HRTIMER_NORESTART;
+}
+
+/**
+ * start_request_timer() - Start a timer after requesting GPU
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Start a timer to track when kbase is waiting for the GPU from the
+ * Arbiter.  If the timer expires before GPU is granted, a warning in
+ * dmesg will be issued.
+ */
+static void start_request_timer(struct kbase_device *kbdev)
+{
+	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
+
+	hrtimer_start(&arb_vm_state->vm_request_timer,
+			HR_TIMER_DELAY_MSEC(GPU_REQUEST_TIMEOUT),
+			HRTIMER_MODE_REL);
+}
+
+/**
+ * cancel_request_timer() - Stop the request timer
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Stops the request timer once GPU has been granted.  Safe to call
+ * even if timer is no longer running.
+ */
+static void cancel_request_timer(struct kbase_device *kbdev)
+{
+	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
+
+	hrtimer_cancel(&arb_vm_state->vm_request_timer);
+}
+
+/**
+ * kbase_arbiter_pm_early_init() - Initialize arbiter for VM
+ *                                 Paravirtualized use.
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Initialize the arbiter and other required resources during the runtime
+ * and request the GPU for the VM for the first time.
+ *
+ * Return: 0 if success, or a Linux error code
+ */
 int kbase_arbiter_pm_early_init(struct kbase_device *kbdev)
 {
 	int err;
@@ -179,12 +296,17 @@ int kbase_arbiter_pm_early_init(struct kbase_device *kbdev)
 		WQ_HIGHPRI);
 	if (!arb_vm_state->vm_arb_wq) {
 		dev_err(kbdev->dev, "Failed to allocate vm_arb workqueue\n");
+		kfree(arb_vm_state);
 		return -ENOMEM;
 	}
 	INIT_WORK(&arb_vm_state->vm_suspend_work, kbase_arbiter_pm_suspend_wq);
 	INIT_WORK(&arb_vm_state->vm_resume_work, kbase_arbiter_pm_resume_wq);
 	arb_vm_state->vm_arb_starting = false;
-	arb_vm_state->vm_arb_users_waiting = 0;
+	atomic_set(&kbdev->pm.gpu_users_waiting, 0);
+	hrtimer_init(&arb_vm_state->vm_request_timer, CLOCK_MONOTONIC,
+							HRTIMER_MODE_REL);
+	arb_vm_state->vm_request_timer.function =
+						request_timer_callback;
 	kbdev->pm.arb_vm_state = arb_vm_state;

 	err = kbase_arbif_init(kbdev);
@@ -192,17 +314,31 @@ int kbase_arbiter_pm_early_init(struct kbase_device *kbdev)
 		dev_err(kbdev->dev, "Failed to initialise arbif module\n");
 		goto arbif_init_fail;
 	}
+
 	if (kbdev->arb.arb_if) {
 		kbase_arbif_gpu_request(kbdev);
 		dev_dbg(kbdev->dev, "Waiting for initial GPU assignment...\n");
-		wait_event(arb_vm_state->vm_state_wait,
+		err = wait_event_timeout(arb_vm_state->vm_state_wait,
 			arb_vm_state->vm_state ==
-					KBASE_VM_STATE_INITIALIZING_WITH_GPU);
+					KBASE_VM_STATE_INITIALIZING_WITH_GPU,
+			msecs_to_jiffies(gpu_req_timeout));
+
+		if (!err) {
+			dev_dbg(kbdev->dev,
+			"Kbase probe Deferred after waiting %d ms to receive GPU_GRANT\n",
+			gpu_req_timeout);
+			err = -EPROBE_DEFER;
+			goto arbif_eprobe_defer;
+		}
+
 		dev_dbg(kbdev->dev,
 			"Waiting for initial GPU assignment - done\n");
 	}
 	return 0;

+arbif_eprobe_defer:
+	kbase_arbiter_pm_early_term(kbdev);
+	return err;
 arbif_init_fail:
 	destroy_workqueue(arb_vm_state->vm_arb_wq);
 	kfree(arb_vm_state);
@@ -210,36 +346,72 @@ arbif_init_fail:
 	return err;
 }

+/**
+ * kbase_arbiter_pm_early_term() - Shutdown arbiter and free resources
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Clean up all the resources
+ */
 void kbase_arbiter_pm_early_term(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;

+	cancel_request_timer(kbdev);
 	mutex_lock(&arb_vm_state->vm_state_lock);
 	if (arb_vm_state->vm_state > KBASE_VM_STATE_STOPPED_GPU_REQUESTED) {
 		kbase_pm_set_gpu_lost(kbdev, false);
 		kbase_arbif_gpu_stopped(kbdev, false);
 	}
 	mutex_unlock(&arb_vm_state->vm_state_lock);
-	kbase_arbif_destroy(kbdev);
 	destroy_workqueue(arb_vm_state->vm_arb_wq);
+	kbase_arbif_destroy(kbdev);
 	arb_vm_state->vm_arb_wq = NULL;
 	kfree(kbdev->pm.arb_vm_state);
 	kbdev->pm.arb_vm_state = NULL;
 }

+/**
+ * kbase_arbiter_pm_release_interrupts() - Release the GPU interrupts
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Releases interrupts and set the interrupt flag to false
+ */
 void kbase_arbiter_pm_release_interrupts(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;

 	mutex_lock(&arb_vm_state->vm_state_lock);
-	if (!kbdev->arb.arb_if ||
-			arb_vm_state->vm_state >
-					KBASE_VM_STATE_STOPPED_GPU_REQUESTED)
+	if (arb_vm_state->interrupts_installed == true) {
+		arb_vm_state->interrupts_installed = false;
 		kbase_release_interrupts(kbdev);
-
+	}
 	mutex_unlock(&arb_vm_state->vm_state_lock);
 }

+/**
+ * kbase_arbiter_pm_install_interrupts() - Install the GPU interrupts
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Install interrupts and set the interrupt_install flag to true.
+ */
+int kbase_arbiter_pm_install_interrupts(struct kbase_device *kbdev)
+{
+	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
+	int err;
+
+	mutex_lock(&arb_vm_state->vm_state_lock);
+	arb_vm_state->interrupts_installed = true;
+	err = kbase_install_interrupts(kbdev);
+	mutex_unlock(&arb_vm_state->vm_state_lock);
+	return err;
+}
+
+/**
+ * kbase_arbiter_pm_vm_stopped() - Handle stop state for the VM
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Handles a stop state for the VM
+ */
 void kbase_arbiter_pm_vm_stopped(struct kbase_device *kbdev)
 {
 	bool request_gpu = false;
@@ -247,14 +419,19 @@ void kbase_arbiter_pm_vm_stopped(struct kbase_device *kbdev)

 	lockdep_assert_held(&arb_vm_state->vm_state_lock);

-	if (arb_vm_state->vm_arb_users_waiting > 0 &&
+	if (atomic_read(&kbdev->pm.gpu_users_waiting) > 0 &&
 			arb_vm_state->vm_state == KBASE_VM_STATE_STOPPING_IDLE)
 		kbase_arbiter_pm_vm_set_state(kbdev,
 			 KBASE_VM_STATE_STOPPING_ACTIVE);

 	dev_dbg(kbdev->dev, "%s %s\n", __func__,
 		kbase_arbiter_pm_vm_state_str(arb_vm_state->vm_state));
-	kbase_release_interrupts(kbdev);
+
+	if (arb_vm_state->interrupts_installed) {
+		arb_vm_state->interrupts_installed = false;
+		kbase_release_interrupts(kbdev);
+	}
+
 	switch (arb_vm_state->vm_state) {
 	case KBASE_VM_STATE_STOPPING_ACTIVE:
 		request_gpu = true;
@@ -275,13 +452,85 @@ void kbase_arbiter_pm_vm_stopped(struct kbase_device *kbdev)

 	kbase_pm_set_gpu_lost(kbdev, false);
 	kbase_arbif_gpu_stopped(kbdev, request_gpu);
+	if (request_gpu)
+		start_request_timer(kbdev);
 }

+void kbase_arbiter_set_max_config(struct kbase_device *kbdev,
+				  uint32_t max_l2_slices,
+				  uint32_t max_core_mask)
+{
+	struct kbase_arbiter_vm_state *arb_vm_state;
+	struct max_config_props max_config;
+
+	if (!kbdev)
+		return;
+
+	/* Mask the max_l2_slices as it is stored as 8 bits into kbase */
+	max_config.l2_slices = max_l2_slices & MAX_L2_SLICES_MASK;
+	max_config.core_mask = max_core_mask;
+	arb_vm_state = kbdev->pm.arb_vm_state;
+
+	mutex_lock(&arb_vm_state->vm_state_lock);
+	/* Just set the max_props in kbase during initialization. */
+	if (arb_vm_state->vm_state == KBASE_VM_STATE_INITIALIZING)
+		kbase_gpuprops_set_max_config(kbdev, &max_config);
+	else
+		dev_dbg(kbdev->dev, "Unexpected max_config on VM state %s",
+			kbase_arbiter_pm_vm_state_str(arb_vm_state->vm_state));
+
+	mutex_unlock(&arb_vm_state->vm_state_lock);
+}
+
+int kbase_arbiter_pm_gpu_assigned(struct kbase_device *kbdev)
+{
+	struct kbase_arbiter_vm_state *arb_vm_state;
+	int result = -EINVAL;
+
+	if (!kbdev)
+		return result;
+
+	/* First check the GPU_LOST state */
+	kbase_pm_lock(kbdev);
+	if (kbase_pm_is_gpu_lost(kbdev)) {
+		kbase_pm_unlock(kbdev);
+		return 0;
+	}
+	kbase_pm_unlock(kbdev);
+
+	/* Then the arbitration state machine */
+	arb_vm_state = kbdev->pm.arb_vm_state;
+
+	mutex_lock(&arb_vm_state->vm_state_lock);
+	switch (arb_vm_state->vm_state) {
+	case KBASE_VM_STATE_INITIALIZING:
+	case KBASE_VM_STATE_SUSPENDED:
+	case KBASE_VM_STATE_STOPPED:
+	case KBASE_VM_STATE_STOPPED_GPU_REQUESTED:
+	case KBASE_VM_STATE_SUSPEND_WAIT_FOR_GRANT:
+		result = 0;
+		break;
+	default:
+		result = 1;
+		break;
+	}
+	mutex_unlock(&arb_vm_state->vm_state_lock);
+
+	return result;
+}
+
+/**
+ * kbase_arbiter_pm_vm_gpu_start() - Handles the start state of the VM
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Handles the start state of the VM
+ */
 static void kbase_arbiter_pm_vm_gpu_start(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;

 	lockdep_assert_held(&arb_vm_state->vm_state_lock);
+	cancel_request_timer(kbdev);
 	switch (arb_vm_state->vm_state) {
 	case KBASE_VM_STATE_INITIALIZING:
 		kbase_arbiter_pm_vm_set_state(kbdev,
@@ -289,7 +538,14 @@ static void kbase_arbiter_pm_vm_gpu_start(struct kbase_device *kbdev)
 		break;
 	case KBASE_VM_STATE_STOPPED_GPU_REQUESTED:
 		kbase_arbiter_pm_vm_set_state(kbdev, KBASE_VM_STATE_STARTING);
+		arb_vm_state->interrupts_installed = true;
 		kbase_install_interrupts(kbdev);
+		/*
+		 * GPU GRANTED received while in stop can be a result of a
+		 * repartitioning.
+		 */
+		kbase_gpuprops_req_curr_config_update(kbdev);
+		/* curr_config will be updated while resuming the PM. */
 		queue_work(arb_vm_state->vm_arb_wq,
 			&arb_vm_state->vm_resume_work);
 		break;
@@ -306,6 +562,12 @@ static void kbase_arbiter_pm_vm_gpu_start(struct kbase_device *kbdev)
 	}
 }

+/**
+ * kbase_arbiter_pm_vm_gpu_stop() - Handles the stop state of the VM
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Handles the start state of the VM
+ */
 static void kbase_arbiter_pm_vm_gpu_stop(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
@@ -348,6 +610,12 @@ static void kbase_arbiter_pm_vm_gpu_stop(struct kbase_device *kbdev)
 	}
 }

+/**
+ * kbase_gpu_lost() - Kbase signals GPU is lost on a lost event signal
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * On GPU lost event signals GPU_LOST to the aribiter
+ */
 static void kbase_gpu_lost(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
@@ -396,6 +664,13 @@ static void kbase_gpu_lost(struct kbase_device *kbdev)
 	}
 }

+/**
+ * kbase_arbiter_pm_vm_os_suspend_ready_state() - checks if VM is ready
+ *			to be moved to suspended state.
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Return: True if its ready to be suspended else False.
+ */
 static inline bool kbase_arbiter_pm_vm_os_suspend_ready_state(
 	struct kbase_device *kbdev)
 {
@@ -410,6 +685,14 @@ static inline bool kbase_arbiter_pm_vm_os_suspend_ready_state(
 	}
 }

+/**
+ * kbase_arbiter_pm_vm_os_prepare_suspend() - Prepare OS to be in suspend state
+ *                             until it receives the grant message from arbiter
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Prepares OS to be in suspend state until it receives GRANT message
+ * from Arbiter asynchronously.
+ */
 static void kbase_arbiter_pm_vm_os_prepare_suspend(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
@@ -475,6 +758,14 @@ static void kbase_arbiter_pm_vm_os_prepare_suspend(struct kbase_device *kbdev)
 	}
 }

+/**
+ * kbase_arbiter_pm_vm_os_resume() - Resume OS function once it receives
+ *                                   a grant message from arbiter
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Resume OS function once it receives GRANT message
+ * from Arbiter asynchronously.
+ */
 static void kbase_arbiter_pm_vm_os_resume(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
@@ -487,6 +778,7 @@ static void kbase_arbiter_pm_vm_os_resume(struct kbase_device *kbdev)
 	kbase_arbiter_pm_vm_set_state(kbdev,
 		KBASE_VM_STATE_STOPPED_GPU_REQUESTED);
 	kbase_arbif_gpu_request(kbdev);
+	start_request_timer(kbdev);

 	/* Release lock and block resume OS function until we have
 	 * asynchronously received the GRANT message from the Arbiter and
@@ -498,6 +790,14 @@ static void kbase_arbiter_pm_vm_os_resume(struct kbase_device *kbdev)
 	mutex_lock(&arb_vm_state->vm_state_lock);
 }

+/**
+ * kbase_arbiter_pm_vm_event() - Dispatch VM event to the state machine.
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ * @evt: VM event
+ *
+ * The state machine function. Receives events and transitions states
+ * according the event received and the current state
+ */
 void kbase_arbiter_pm_vm_event(struct kbase_device *kbdev,
 	enum kbase_arbif_evt evt)
 {
@@ -509,7 +809,9 @@ void kbase_arbiter_pm_vm_event(struct kbase_device *kbdev,
 	mutex_lock(&arb_vm_state->vm_state_lock);
 	dev_dbg(kbdev->dev, "%s %s\n", __func__,
 		kbase_arbiter_pm_vm_event_str(evt));
-
+	if (arb_vm_state->vm_state != KBASE_VM_STATE_INITIALIZING_WITH_GPU &&
+		arb_vm_state->vm_state != KBASE_VM_STATE_INITIALIZING)
+		KBASE_KTRACE_ADD(kbdev, ARB_VM_EVT, NULL, evt);
 	switch (evt) {
 	case KBASE_VM_GPU_GRANTED_EVT:
 		kbase_arbiter_pm_vm_gpu_start(kbdev);
@@ -542,8 +844,6 @@ void kbase_arbiter_pm_vm_event(struct kbase_device *kbdev,
 	case KBASE_VM_REF_EVENT:
 		switch (arb_vm_state->vm_state) {
 		case KBASE_VM_STATE_STARTING:
-			KBASE_TLSTREAM_TL_ARBITER_STARTED(kbdev, kbdev);
-			/* FALL THROUGH */
 		case KBASE_VM_STATE_IDLE:
 			kbase_arbiter_pm_vm_set_state(kbdev,
 			KBASE_VM_STATE_ACTIVE);
@@ -586,6 +886,12 @@ void kbase_arbiter_pm_vm_event(struct kbase_device *kbdev,

 KBASE_EXPORT_TEST_API(kbase_arbiter_pm_vm_event);

+/**
+ * kbase_arbiter_pm_vm_wait_gpu_assignment() - VM wait for a GPU assignment.
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * VM waits for a GPU assignment.
+ */
 static void kbase_arbiter_pm_vm_wait_gpu_assignment(struct kbase_device *kbdev)
 {
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;
@@ -597,6 +903,12 @@ static void kbase_arbiter_pm_vm_wait_gpu_assignment(struct kbase_device *kbdev)
 	dev_dbg(kbdev->dev, "Waiting for GPU assignment - done\n");
 }

+/**
+ * kbase_arbiter_pm_vm_gpu_assigned_lockheld() - Check if VM holds VM state lock
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Checks if the virtual machine holds VM state lock.
+ */
 static inline bool kbase_arbiter_pm_vm_gpu_assigned_lockheld(
 	struct kbase_device *kbdev)
 {
@@ -607,6 +919,19 @@ static inline bool kbase_arbiter_pm_vm_gpu_assigned_lockheld(
 		arb_vm_state->vm_state == KBASE_VM_STATE_ACTIVE);
 }

+/**
+ * kbase_arbiter_pm_ctx_active_handle_suspend() - Handle suspend operation for
+ *                                                arbitration mode
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ * @suspend_handler: The handler code for how to handle a suspend
+ *                   that might occur
+ *
+ * This function handles a suspend event from the driver,
+ * communicating with the arbiter and waiting synchronously for the GPU
+ * to be granted again depending on the VM state.
+ *
+ * Return: 0 on success else 1 suspend handler isn not possible.
+ */
 int kbase_arbiter_pm_ctx_active_handle_suspend(struct kbase_device *kbdev,
 	enum kbase_pm_suspend_handler suspend_handler)
 {
@@ -627,6 +952,7 @@ int kbase_arbiter_pm_ctx_active_handle_suspend(struct kbase_device *kbdev,
 				kbase_arbiter_pm_vm_set_state(kbdev,
 					KBASE_VM_STATE_STOPPED_GPU_REQUESTED);
 				kbase_arbif_gpu_request(kbdev);
+				start_request_timer(kbdev);
 			} else if (arb_vm_state->vm_state ==
 					KBASE_VM_STATE_INITIALIZING_WITH_GPU)
 				break;
@@ -660,7 +986,7 @@ int kbase_arbiter_pm_ctx_active_handle_suspend(struct kbase_device *kbdev,
 			}

 			/* Need to synchronously wait for GPU assignment */
-			arb_vm_state->vm_arb_users_waiting++;
+			atomic_inc(&kbdev->pm.gpu_users_waiting);
 			mutex_unlock(&arb_vm_state->vm_state_lock);
 			mutex_unlock(&kbdev->pm.lock);
 			mutex_unlock(&js_devdata->runpool_mutex);
@@ -668,9 +994,66 @@ int kbase_arbiter_pm_ctx_active_handle_suspend(struct kbase_device *kbdev,
 			mutex_lock(&js_devdata->runpool_mutex);
 			mutex_lock(&kbdev->pm.lock);
 			mutex_lock(&arb_vm_state->vm_state_lock);
-			arb_vm_state->vm_arb_users_waiting--;
+			atomic_dec(&kbdev->pm.gpu_users_waiting);
 		}
 		mutex_unlock(&arb_vm_state->vm_state_lock);
 	}
 	return res;
 }
+
+/**
+ * kbase_arbiter_pm_update_gpu_freq() - Updates GPU clock frequency received
+ * from arbiter.
+ * @arb_freq - Pointer to struchture holding GPU clock frequenecy data
+ * @freq - New frequency value
+ */
+void kbase_arbiter_pm_update_gpu_freq(struct kbase_arbiter_freq *arb_freq,
+		uint32_t freq)
+{
+	mutex_lock(&arb_freq->arb_freq_lock);
+	arb_freq->arb_freq = freq;
+	mutex_unlock(&arb_freq->arb_freq_lock);
+}
+
+/**
+ * enumerate_arb_gpu_clk() - Enumerate a GPU clock on the given index
+ * @kbdev - kbase_device pointer
+ * @index - GPU clock index
+ *
+ * Returns pointer to structure holding GPU clock frequency data reported from
+ * arbiter, only index 0 is valid.
+ */
+static void *enumerate_arb_gpu_clk(struct kbase_device *kbdev,
+		unsigned int index)
+{
+	if (index == 0)
+		return &kbdev->arb.arb_freq;
+	return NULL;
+}
+
+/**
+ * get_arb_gpu_clk_rate() - Get the current rate of GPU clock frequency value
+ * @kbdev - kbase_device pointer
+ * @index - GPU clock index
+ *
+ * Returns the GPU clock frequency value saved when gpu is granted from arbiter
+ */
+static unsigned long get_arb_gpu_clk_rate(struct kbase_device *kbdev,
+		void *gpu_clk_handle)
+{
+	uint32_t freq;
+	struct kbase_arbiter_freq *arb_dev_freq =
+			(struct kbase_arbiter_freq *) gpu_clk_handle;
+
+	mutex_lock(&arb_dev_freq->arb_freq_lock);
+	freq = arb_dev_freq->arb_freq;
+	mutex_unlock(&arb_dev_freq->arb_freq_lock);
+	return freq;
+}
+
+struct kbase_clk_rate_trace_op_conf arb_clk_rate_trace_ops = {
+	.get_gpu_clk_rate = get_arb_gpu_clk_rate,
+	.enumerate_gpu_clk = enumerate_arb_gpu_clk,
+	.gpu_clk_notifier_register = NULL,
+	.gpu_clk_notifier_unregister = NULL
+};
--- a/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_pm.h
+++ b/drivers/gpu/arm/bifrost/arbiter/mali_kbase_arbiter_pm.h
@@ -1,28 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, you can access it online at
- * http://www.gnu.org/licenses/gpl-2.0.html.
- *
- * SPDX-License-Identifier: GPL-2.0
- *
- *//* SPDX-License-Identifier: GPL-2.0 */
-
-/*
- *
- * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
@@ -38,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /**
@@ -116,10 +93,18 @@ void kbase_arbiter_pm_early_term(struct kbase_device *kbdev);
 * kbase_arbiter_pm_release_interrupts() - Release the GPU interrupts
 * @kbdev: The kbase device structure for the device (must be a valid pointer)
 *
- * Releases interrupts if needed (GPU is available) otherwise does nothing
+ * Releases interrupts and set the interrupt flag to false
 */
 void kbase_arbiter_pm_release_interrupts(struct kbase_device *kbdev);

+/**
+ * kbase_arbiter_pm_install_interrupts() - Install the GPU interrupts
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Install interrupts and set the interrupt_install flag to true.
+ */
+int kbase_arbiter_pm_install_interrupts(struct kbase_device *kbdev);
+
 /**
 * kbase_arbiter_pm_vm_event() - Dispatch VM event to the state machine
 * @kbdev: The kbase device structure for the device (must be a valid pointer)
@@ -156,4 +141,42 @@ int kbase_arbiter_pm_ctx_active_handle_suspend(struct kbase_device *kbdev,
 */
 void kbase_arbiter_pm_vm_stopped(struct kbase_device *kbdev);

+/**
+ * kbase_arbiter_set_max_config() - Set the max config data in kbase device.
+ * @kbdev: The kbase device structure for the device (must be a valid pointer).
+ * @max_l2_slices: The maximum number of L2 slices.
+ * @max_core_mask: The largest core mask.
+ *
+ * This function handles a stop event for the VM.
+ * It will update the VM state and forward the stop event to the driver.
+ */
+void kbase_arbiter_set_max_config(struct kbase_device *kbdev,
+				  uint32_t max_l2_slices,
+				  uint32_t max_core_mask);
+
+/**
+ * kbase_arbiter_pm_gpu_assigned() - Determine if this VM has access to the GPU
+ * @kbdev: The kbase device structure for the device (must be a valid pointer)
+ *
+ * Return: 0 if the VM does not have access, 1 if it does, and a negative number
+ * if an error occurred
+ */
+int kbase_arbiter_pm_gpu_assigned(struct kbase_device *kbdev);
+
+extern struct kbase_clk_rate_trace_op_conf arb_clk_rate_trace_ops;
+
+/**
+ * struct kbase_arbiter_freq - Holding the GPU clock frequency data retrieved
+ * from arbiter
+ * @arb_freq:                 GPU clock frequency value
+ * @arb_freq_lock:            Mutex protecting access to arbfreq value
+ */
+struct kbase_arbiter_freq {
+	uint32_t arb_freq;
+	struct mutex arb_freq_lock;
+};
+
+void kbase_arbiter_pm_update_gpu_freq(struct kbase_arbiter_freq *arb_freq,
+		uint32_t freq);
+
 #endif /*_MALI_KBASE_ARBITER_PM_H_ */
--- a/drivers/gpu/arm/bifrost/backend/gpu/Kbuild
+++ b/drivers/gpu/arm/bifrost/backend/gpu/Kbuild
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
-# (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
+# (C) COPYRIGHT 2014-2021 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,15 +16,12 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

 BACKEND += \
 	backend/gpu/mali_kbase_cache_policy_backend.c \
 	backend/gpu/mali_kbase_gpuprops_backend.c \
 	backend/gpu/mali_kbase_irq_linux.c \
-	backend/gpu/mali_kbase_instr_backend.c \
 	backend/gpu/mali_kbase_js_backend.c \
 	backend/gpu/mali_kbase_pm_backend.c \
 	backend/gpu/mali_kbase_pm_driver.c \
@@ -40,6 +38,7 @@ ifeq ($(MALI_USE_CSF),1)
 # empty
 else
 	BACKEND += \
+		backend/gpu/mali_kbase_instr_backend.c \
 		backend/gpu/mali_kbase_jm_as.c \
 		backend/gpu/mali_kbase_debug_job_fault_backend.c \
 		backend/gpu/mali_kbase_jm_hw.c \
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_backend_config.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_backend_config.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014-2018 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2018, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_cache_policy_backend.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_cache_policy_backend.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
 * (C) COPYRIGHT 2014-2016, 2018, 2020 ARM Limited. All rights reserved.
@@ -5,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include "backend/gpu/mali_kbase_cache_policy_backend.h"
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_cache_policy_backend.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_cache_policy_backend.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2015-2016 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2016, 2020-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,16 +17,13 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 #ifndef _KBASE_CACHE_POLICY_BACKEND_H_
 #define _KBASE_CACHE_POLICY_BACKEND_H_

 #include "mali_kbase.h"
-#include "mali_base_kernel.h"
+#include <uapi/gpu/arm/bifrost/mali_base_kernel.h>

 /**
  * kbase_cache_set_coherency_mode() - Sets the system coherency mode
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_clk_rate_trace_mgr.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_clk_rate_trace_mgr.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -40,6 +39,38 @@
 #define CLK_RATE_TRACE_OPS (NULL)
 #endif

+/**
+ * get_clk_rate_trace_callbacks() - Returns pointer to clk trace ops.
+ * @kbdev: Pointer to kbase device, used to check if arbitration is enabled
+ *         when compiled with arbiter support.
+ * Return: Pointer to clk trace ops if supported or NULL.
+ */
+static struct kbase_clk_rate_trace_op_conf *
+get_clk_rate_trace_callbacks(struct kbase_device *kbdev __maybe_unused)
+{
+	/* base case */
+	struct kbase_clk_rate_trace_op_conf *callbacks =
+		(struct kbase_clk_rate_trace_op_conf *)CLK_RATE_TRACE_OPS;
+#if defined(CONFIG_MALI_ARBITER_SUPPORT) && defined(CONFIG_OF)
+	const void *arbiter_if_node;
+
+	if (WARN_ON(!kbdev) || WARN_ON(!kbdev->dev))
+		return callbacks;
+
+	arbiter_if_node =
+		of_get_property(kbdev->dev->of_node, "arbiter_if", NULL);
+	/* Arbitration enabled, override the callback pointer.*/
+	if (arbiter_if_node)
+		callbacks = &arb_clk_rate_trace_ops;
+	else
+		dev_dbg(kbdev->dev,
+			"Arbitration supported but disabled by platform. Leaving clk rate callbacks as default.\n");
+
+#endif
+
+	return callbacks;
+}
+
 static int gpu_clk_rate_change_notifier(struct notifier_block *nb,
 			unsigned long event, void *data)
 {
@@ -70,12 +101,13 @@ static int gpu_clk_rate_change_notifier(struct notifier_block *nb,
 static int gpu_clk_data_init(struct kbase_device *kbdev,
 		void *gpu_clk_handle, unsigned int index)
 {
-	struct kbase_clk_rate_trace_op_conf *callbacks =
-		(struct kbase_clk_rate_trace_op_conf *)CLK_RATE_TRACE_OPS;
+	struct kbase_clk_rate_trace_op_conf *callbacks;
 	struct kbase_clk_data *clk_data;
 	struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
 	int ret = 0;

+	callbacks = get_clk_rate_trace_callbacks(kbdev);
+
 	if (WARN_ON(!callbacks) ||
 	    WARN_ON(!gpu_clk_handle) ||
 	    WARN_ON(index >= BASE_MAX_NR_CLOCKS_REGULATORS))
@@ -109,8 +141,9 @@ static int gpu_clk_data_init(struct kbase_device *kbdev,
 	clk_data->clk_rate_change_nb.notifier_call =
 			gpu_clk_rate_change_notifier;

-	ret = callbacks->gpu_clk_notifier_register(kbdev, gpu_clk_handle,
-			&clk_data->clk_rate_change_nb);
+	if (callbacks->gpu_clk_notifier_register)
+		ret = callbacks->gpu_clk_notifier_register(kbdev,
+				gpu_clk_handle, &clk_data->clk_rate_change_nb);
 	if (ret) {
 		dev_err(kbdev->dev, "Failed to register notifier for clock enumerated at index %u", index);
 		kfree(clk_data);
@@ -121,19 +154,22 @@ static int gpu_clk_data_init(struct kbase_device *kbdev,

 int kbase_clk_rate_trace_manager_init(struct kbase_device *kbdev)
 {
-	struct kbase_clk_rate_trace_op_conf *callbacks =
-		(struct kbase_clk_rate_trace_op_conf *)CLK_RATE_TRACE_OPS;
+	struct kbase_clk_rate_trace_op_conf *callbacks;
 	struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
 	unsigned int i;
 	int ret = 0;

-	/* Return early if no callbacks provided for clock rate tracing */
-	if (!callbacks)
-		return 0;
+	callbacks = get_clk_rate_trace_callbacks(kbdev);

 	spin_lock_init(&clk_rtm->lock);
 	INIT_LIST_HEAD(&clk_rtm->listeners);

+	/* Return early if no callbacks provided for clock rate tracing */
+	if (!callbacks) {
+		WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
+		return 0;
+	}
+
 	clk_rtm->gpu_idle = true;

 	for (i = 0; i < BASE_MAX_NR_CLOCKS_REGULATORS; i++) {
@@ -151,10 +187,12 @@ int kbase_clk_rate_trace_manager_init(struct kbase_device *kbdev)
 	/* Activate clock rate trace manager if at least one GPU clock was
 	 * enumerated.
 	 */
-	if (i)
+	if (i) {
 		WRITE_ONCE(clk_rtm->clk_rate_trace_ops, callbacks);
-	else
+	} else {
 		dev_info(kbdev->dev, "No clock(s) available for rate tracing");
+		WRITE_ONCE(clk_rtm->clk_rate_trace_ops, NULL);
+	}

 	return 0;

@@ -183,9 +221,10 @@ void kbase_clk_rate_trace_manager_term(struct kbase_device *kbdev)
 		if (!clk_rtm->clks[i])
 			break;

-		clk_rtm->clk_rate_trace_ops->gpu_clk_notifier_unregister(
-				kbdev, clk_rtm->clks[i]->gpu_clk_handle,
-				&clk_rtm->clks[i]->clk_rate_change_nb);
+		if (clk_rtm->clk_rate_trace_ops->gpu_clk_notifier_unregister)
+			clk_rtm->clk_rate_trace_ops->gpu_clk_notifier_unregister
+			(kbdev, clk_rtm->clks[i]->gpu_clk_handle,
+			&clk_rtm->clks[i]->clk_rate_change_nb);
 		kfree(clk_rtm->clks[i]);
 	}

@@ -284,4 +323,3 @@ void kbase_clk_rate_trace_manager_notify_all(
 	}
 }
 KBASE_EXPORT_TEST_API(kbase_clk_rate_trace_manager_notify_all);
-
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_clk_rate_trace_mgr.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_clk_rate_trace_mgr.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,17 +17,15 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #ifndef _KBASE_CLK_RATE_TRACE_MGR_
 #define _KBASE_CLK_RATE_TRACE_MGR_

-/** The index of top clock domain in kbase_clk_rate_trace_manager:clks. */
+/* The index of top clock domain in kbase_clk_rate_trace_manager:clks. */
 #define KBASE_CLOCK_DOMAIN_TOP (0)

-/** The index of shader-cores clock domain in
+/* The index of shader-cores clock domain in
 * kbase_clk_rate_trace_manager:clks.
 */
 #define KBASE_CLOCK_DOMAIN_SHADER_CORES (1)
@@ -139,7 +138,7 @@ static inline void kbase_clk_rate_trace_manager_unsubscribe(
 *                                             rate listeners.
 *
 * @clk_rtm:     Clock rate manager instance.
- * @clk_index:   Clock index.
+ * @clock_index:   Clock index.
 * @new_rate:    New clock frequency(Hz)
 *
 * kbase_clk_rate_trace_manager:lock must be locked.
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_debug_job_fault_backend.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_debug_job_fault_backend.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
 * (C) COPYRIGHT 2012-2015, 2018-2020 ARM Limited. All rights reserved.
@@ -5,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <mali_kbase.h>
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_devfreq.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_devfreq.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
 * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
@@ -32,20 +33,8 @@
 #endif

 #include <linux/version.h>
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 13, 0)
 #include <linux/pm_opp.h>
-#else /* Linux >= 3.13 */
-/* In 3.13 the OPP include header file, types, and functions were all
- * renamed. Use the old filename for the include, and define the new names to
- * the old, when an old kernel is detected.
- */
-#include <linux/opp.h>
-#define dev_pm_opp opp
-#define dev_pm_opp_get_voltage opp_get_voltage
-#define dev_pm_opp_get_opp_count opp_get_opp_count
-#define dev_pm_opp_find_freq_ceil opp_find_freq_ceil
-#define dev_pm_opp_find_freq_floor opp_find_freq_floor
-#endif /* Linux >= 3.13 */
+
 #include <soc/rockchip/rockchip_ipa.h>
 #include <soc/rockchip/rockchip_opp_select.h>
 #include <soc/rockchip/rockchip_system_monitor.h>
@@ -59,22 +48,46 @@ static struct monitor_dev_profile mali_mdevp = {
 };

 /**
- * opp_translate - Translate nominal OPP frequency from devicetree into real
- *                 frequency and core mask
- * @kbdev:     Device pointer
- * @freq:      Nominal frequency
- * @volt:      Nominal voltage
- * @core_mask: Pointer to u64 to store core mask to
- * @freqs:     Pointer to array of frequencies
- * @volts:     Pointer to array of voltages
+ * get_voltage() - Get the voltage value corresponding to the nominal frequency
+ *                 used by devfreq.
+ * @kbdev:    Device pointer
+ * @freq:     Nominal frequency in Hz passed by devfreq.
 *
- * This function will only perform translation if an operating-points-v2-mali
- * table is present in devicetree. If one is not present then it will return an
- * untranslated frequency and all cores enabled.
+ * This function will be called only when the opp table which is compatible with
+ * "operating-points-v2-mali", is not present in the devicetree for GPU device.
+ *
+ * Return: Voltage value in milli volts, 0 in case of error.
 */
-static void opp_translate(struct kbase_device *kbdev, unsigned long freq,
-			  unsigned long volt, u64 *core_mask,
-			  unsigned long *freqs, unsigned long *volts)
+static unsigned long get_voltage(struct kbase_device *kbdev, unsigned long freq)
+{
+	struct dev_pm_opp *opp;
+	unsigned long voltage = 0;
+
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
+	rcu_read_lock();
+#endif
+
+	opp = dev_pm_opp_find_freq_exact(kbdev->dev, freq, true);
+
+	if (IS_ERR_OR_NULL(opp))
+		dev_err(kbdev->dev, "Failed to get opp (%ld)\n", PTR_ERR(opp));
+	else {
+		voltage = dev_pm_opp_get_voltage(opp);
+#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
+		dev_pm_opp_put(opp);
+#endif
+	}
+
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
+	rcu_read_unlock();
+#endif
+
+	/* Return the voltage in milli volts */
+	return voltage / 1000;
+}
+
+void kbase_devfreq_opp_translate(struct kbase_device *kbdev, unsigned long freq,
+	u64 *core_mask, unsigned long *freqs, unsigned long *volts)
 {
 	unsigned int i;

@@ -95,13 +108,16 @@ static void opp_translate(struct kbase_device *kbdev, unsigned long freq,
 	}

 	/* If failed to find OPP, return all cores enabled
-	 * and nominal frequency
+	 * and nominal frequency and the corresponding voltage.
 	 */
 	if (i == kbdev->num_opps) {
+		unsigned long voltage = get_voltage(kbdev, freq);
+
 		*core_mask = kbdev->gpu_props.props.raw_props.shader_present;
+
 		for (i = 0; i < kbdev->nr_clocks; i++) {
 			freqs[i] = freq;
-			volts[i] = volt;
+			volts[i] = voltage;
 		}
 	}
 }
@@ -120,12 +136,12 @@ kbase_devfreq_target(struct device *dev, unsigned long *target_freq, u32 flags)

 	nominal_freq = *target_freq;

-#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
 	rcu_read_lock();
 #endif
 	opp = devfreq_recommended_opp(dev, &nominal_freq, flags);
 	if (IS_ERR_OR_NULL(opp)) {
-#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
 		rcu_read_unlock();
 #endif
 		dev_err(dev, "Failed to get opp (%ld)\n", PTR_ERR(opp));
@@ -135,12 +151,15 @@ kbase_devfreq_target(struct device *dev, unsigned long *target_freq, u32 flags)
 #if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
 	rcu_read_unlock();
 #endif
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
 	dev_pm_opp_put(opp);
 #endif

-	opp_translate(kbdev, nominal_freq, nominal_volt, &core_mask, freqs,
-		      volts);
+	kbase_devfreq_opp_translate(kbdev,
+				    nominal_freq,
+				    &core_mask,
+				    freqs,
+				    volts);

 	/*
 	 * Only update if there is a change of frequency
@@ -168,6 +187,7 @@ kbase_devfreq_target(struct device *dev, unsigned long *target_freq, u32 flags)
 #endif
 		return 0;
 	}
+
 	dev_dbg(dev, "%lu-->%lu\n", kbdev->current_nominal_freq, nominal_freq);

 #ifdef CONFIG_REGULATOR
@@ -285,6 +305,10 @@ kbase_devfreq_status(struct device *dev, struct devfreq_dev_status *stat)
 	stat->current_frequency = kbdev->current_nominal_freq;
 	stat->private_data = NULL;

+#if MALI_USE_CSF && defined CONFIG_DEVFREQ_THERMAL
+	kbase_ipa_reset_data(kbdev);
+#endif
+
 	return 0;
 }

@@ -296,11 +320,11 @@ static int kbase_devfreq_init_freq_table(struct kbase_device *kbdev,
 	unsigned long freq;
 	struct dev_pm_opp *opp;

-#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
 	rcu_read_lock();
 #endif
 	count = dev_pm_opp_get_opp_count(kbdev->dev);
-#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
 	rcu_read_unlock();
 #endif
 	if (count < 0)
@@ -311,20 +335,20 @@ static int kbase_devfreq_init_freq_table(struct kbase_device *kbdev,
 	if (!dp->freq_table)
 		return -ENOMEM;

-#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
 	rcu_read_lock();
 #endif
 	for (i = 0, freq = ULONG_MAX; i < count; i++, freq--) {
 		opp = dev_pm_opp_find_freq_floor(kbdev->dev, &freq);
 		if (IS_ERR(opp))
 			break;
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE
 		dev_pm_opp_put(opp);
-#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0) */
+#endif /* KERNEL_VERSION(4, 11, 0) <= LINUX_VERSION_CODE */

 		dp->freq_table[i] = freq;
 	}
-#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
+#if KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE
 	rcu_read_unlock();
 #endif

@@ -407,7 +431,7 @@ static void kbasep_devfreq_read_suspend_clock(struct kbase_device *kbdev,

 static int kbase_devfreq_init_core_mask_table(struct kbase_device *kbdev)
 {
-#if KERNEL_VERSION(3, 18, 0) > LINUX_VERSION_CODE || !defined(CONFIG_OF)
+#ifndef CONFIG_OF
 	/* OPP table initialization requires at least the capability to get
 	 * regulators and clocks from the device tree, as well as parsing
 	 * arrays of unsigned integer values.
@@ -541,11 +565,9 @@ static int kbase_devfreq_init_core_mask_table(struct kbase_device *kbdev)
 	kbdev->num_opps = i;

 	return 0;
-#endif /* KERNEL_VERSION(3, 18, 0) > LINUX_VERSION_CODE */
+#endif /* CONFIG_OF */
 }

-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 8, 0)
-
 static const char *kbase_devfreq_req_type_name(enum kbase_devfreq_work_type type)
 {
 	const char *p;
@@ -602,12 +624,9 @@ static void kbase_devfreq_suspend_resume_worker(struct work_struct *work)
 	}
 }

-#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(3, 8, 0) */
-
 void kbase_devfreq_enqueue_work(struct kbase_device *kbdev,
 				       enum kbase_devfreq_work_type work_type)
 {
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 8, 0)
 	unsigned long flags;

 	WARN_ON(work_type == DEVFREQ_WORK_NONE);
@@ -617,12 +636,10 @@ void kbase_devfreq_enqueue_work(struct kbase_device *kbdev,
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
 	dev_dbg(kbdev->dev, "Enqueuing devfreq req: %s\n",
 		kbase_devfreq_req_type_name(work_type));
-#endif
 }

 static int kbase_devfreq_work_init(struct kbase_device *kbdev)
 {
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 8, 0)
 	kbdev->devfreq_queue.req_type = DEVFREQ_WORK_NONE;
 	kbdev->devfreq_queue.acted_type = DEVFREQ_WORK_RESUME;

@@ -632,15 +649,12 @@ static int kbase_devfreq_work_init(struct kbase_device *kbdev)

 	INIT_WORK(&kbdev->devfreq_queue.work,
 			kbase_devfreq_suspend_resume_worker);
-#endif
 	return 0;
 }

 static void kbase_devfreq_work_term(struct kbase_device *kbdev)
 {
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 8, 0)
 	destroy_workqueue(kbdev->devfreq_queue.workq);
-#endif
 }

 static unsigned long kbase_devfreq_get_static_power(struct devfreq *devfreq,
@@ -661,9 +675,9 @@ int kbase_devfreq_init(struct kbase_device *kbdev)
 	struct devfreq_cooling_power *kbase_dcp = &kbase_cooling_power;
 	struct device_node *np = kbdev->dev->of_node;
 	struct devfreq_dev_profile *dp;
+	int err;
 	struct dev_pm_opp *opp;
 	unsigned long opp_rate;
-	int err;
 	unsigned int i;

 	if (kbdev->nr_clocks == 0) {
@@ -726,7 +740,8 @@ int kbase_devfreq_init(struct kbase_device *kbdev)
 	}

 	/* devfreq_add_device only copies a few of kbdev->dev's fields, so
-	 * set drvdata explicitly so IPA models can access kbdev. */
+	 * set drvdata explicitly so IPA models can access kbdev.
+	 */
 	dev_set_drvdata(&kbdev->devfreq->dev, kbdev);

 	err = devfreq_register_opp_notifier(kbdev->dev, kbdev->devfreq);
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_devfreq.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_devfreq.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014, 2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014, 2019-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #ifndef _BASE_DEVFREQ_H_
@@ -44,4 +43,20 @@ void kbase_devfreq_force_freq(struct kbase_device *kbdev, unsigned long freq);
 void kbase_devfreq_enqueue_work(struct kbase_device *kbdev,
 				enum kbase_devfreq_work_type work_type);

+/**
+ * kbase_devfreq_opp_translate - Translate nominal OPP frequency from devicetree
+ *                               into real frequency & voltage pair, along with
+ *                               core mask
+ * @kbdev:     Device pointer
+ * @freq:      Nominal frequency
+ * @core_mask: Pointer to u64 to store core mask to
+ * @freqs:     Pointer to array of frequencies
+ * @volts:     Pointer to array of voltages
+ *
+ * This function will only perform translation if an operating-points-v2-mali
+ * table is present in devicetree. If one is not present then it will return an
+ * untranslated frequency (and corresponding voltage) and all cores enabled.
+ */
+void kbase_devfreq_opp_translate(struct kbase_device *kbdev, unsigned long freq,
+	u64 *core_mask, unsigned long *freqs, unsigned long *volts);
 #endif /* _BASE_DEVFREQ_H_ */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_gpuprops_backend.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_gpuprops_backend.c
@@ -1,12 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -17,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -41,12 +39,13 @@ int kbase_backend_gpuprops_get(struct kbase_device *kbdev,

 	registers.l2_features = kbase_reg_read(kbdev,
 				GPU_CONTROL_REG(L2_FEATURES));
+	registers.core_features = 0;
 #if !MALI_USE_CSF
+	/* TGOx */
 	registers.core_features = kbase_reg_read(kbdev,
 				GPU_CONTROL_REG(CORE_FEATURES));
 #else /* !MALI_USE_CSF */
-	registers.core_features = 0;
-#endif /* !MALI_USE_CSF */
+#endif /* MALI_USE_CSF */
 	registers.tiler_features = kbase_reg_read(kbdev,
 				GPU_CONTROL_REG(TILER_FEATURES));
 	registers.mem_features = kbase_reg_read(kbdev,
@@ -105,6 +104,16 @@ int kbase_backend_gpuprops_get(struct kbase_device *kbdev,
 	registers.stack_present_hi = kbase_reg_read(kbdev,
 				GPU_CONTROL_REG(STACK_PRESENT_HI));

+	if (registers.gpu_id >= GPU_ID2_PRODUCT_MAKE(11, 8, 5, 2)) {
+		registers.gpu_features_lo = kbase_reg_read(kbdev,
+					GPU_CONTROL_REG(GPU_FEATURES_LO));
+		registers.gpu_features_hi = kbase_reg_read(kbdev,
+					GPU_CONTROL_REG(GPU_FEATURES_HI));
+	} else {
+		registers.gpu_features_lo = 0;
+		registers.gpu_features_hi = 0;
+	}
+
 	if (!kbase_is_gpu_removed(kbdev)) {
 		*regdump = registers;
 		return 0;
@@ -112,6 +121,32 @@ int kbase_backend_gpuprops_get(struct kbase_device *kbdev,
 		return -EIO;
 }

+int kbase_backend_gpuprops_get_curr_config(struct kbase_device *kbdev,
+		struct kbase_current_config_regdump *curr_config_regdump)
+{
+	if (WARN_ON(!kbdev) || WARN_ON(!curr_config_regdump))
+		return -EINVAL;
+
+	curr_config_regdump->mem_features = kbase_reg_read(kbdev,
+					GPU_CONTROL_REG(MEM_FEATURES));
+
+	curr_config_regdump->shader_present_lo = kbase_reg_read(kbdev,
+					GPU_CONTROL_REG(SHADER_PRESENT_LO));
+	curr_config_regdump->shader_present_hi = kbase_reg_read(kbdev,
+					GPU_CONTROL_REG(SHADER_PRESENT_HI));
+
+	curr_config_regdump->l2_present_lo = kbase_reg_read(kbdev,
+					GPU_CONTROL_REG(L2_PRESENT_LO));
+	curr_config_regdump->l2_present_hi = kbase_reg_read(kbdev,
+					GPU_CONTROL_REG(L2_PRESENT_HI));
+
+	if (WARN_ON(kbase_is_gpu_removed(kbdev)))
+		return -EIO;
+
+	return 0;
+
+}
+
 int kbase_backend_gpuprops_get_features(struct kbase_device *kbdev,
 					struct kbase_gpuprops_regdump *regdump)
 {
@@ -147,11 +182,15 @@ int kbase_backend_gpuprops_get_l2_features(struct kbase_device *kbdev,
 	if (kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_L2_CONFIG)) {
 		u32 l2_features = kbase_reg_read(kbdev,
 				GPU_CONTROL_REG(L2_FEATURES));
+		u32 l2_config =
+			kbase_reg_read(kbdev, GPU_CONTROL_REG(L2_CONFIG));
+

 		if (kbase_is_gpu_removed(kbdev))
 			return -EIO;

 		regdump->l2_features = l2_features;
+		regdump->l2_config = l2_config;
 	}

 	return 0;
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_instr_backend.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_instr_backend.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * GPU backend instrumentation APIs.
 */
@@ -39,9 +36,7 @@ int kbase_instr_hwcnt_enable_internal(struct kbase_device *kbdev,
 {
 	unsigned long flags;
 	int err = -EINVAL;
-#if !MALI_USE_CSF
 	u32 irq_mask;
-#endif
 	u32 prfcnt_config;

 	lockdep_assert_held(&kbdev->hwaccess_lock);
@@ -58,12 +53,10 @@ int kbase_instr_hwcnt_enable_internal(struct kbase_device *kbdev,
 		goto out_err;
 	}

-#if !MALI_USE_CSF
 	/* Enable interrupt */
 	irq_mask = kbase_reg_read(kbdev, GPU_CONTROL_REG(GPU_IRQ_MASK));
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_IRQ_MASK), irq_mask |
 						PRFCNT_SAMPLE_COMPLETED);
-#endif

 	/* In use, this context is the owner */
 	kbdev->hwcnt.kctx = kctx;
@@ -75,36 +68,13 @@ int kbase_instr_hwcnt_enable_internal(struct kbase_device *kbdev,

 	/* Configure */
 	prfcnt_config = kctx->as_nr << PRFCNT_CONFIG_AS_SHIFT;
-#ifdef CONFIG_MALI_PRFCNT_SET_SECONDARY_VIA_DEBUG_FS
-	if (kbdev->hwcnt.backend.use_secondary_override)
+#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
+	prfcnt_config |= kbdev->hwcnt.backend.override_counter_set
+			 << PRFCNT_CONFIG_SETSELECT_SHIFT;
 #else
-	if (enable->use_secondary)
+	prfcnt_config |= enable->counter_set << PRFCNT_CONFIG_SETSELECT_SHIFT;
 #endif
-		prfcnt_config |= 1 << PRFCNT_CONFIG_SETSELECT_SHIFT;

-#if MALI_USE_CSF
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_CONFIG),
-			prfcnt_config | PRFCNT_CONFIG_MODE_OFF);
-
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_BASE_LO),
-					enable->dump_buffer & 0xFFFFFFFF);
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_BASE_HI),
-					enable->dump_buffer >> 32);
-
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_CSHW_EN),
-					enable->fe_bm);
-
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_SHADER_EN),
-					enable->shader_bm);
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_MMU_L2_EN),
-					enable->mmu_l2_bm);
-
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_TILER_EN),
-					enable->tiler_bm);
-
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_CONFIG),
-			prfcnt_config | PRFCNT_CONFIG_MODE_MANUAL);
-#else
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(PRFCNT_CONFIG),
 			prfcnt_config | PRFCNT_CONFIG_MODE_OFF);

@@ -126,7 +96,6 @@ int kbase_instr_hwcnt_enable_internal(struct kbase_device *kbdev,

 	kbase_reg_write(kbdev, GPU_CONTROL_REG(PRFCNT_CONFIG),
 			prfcnt_config | PRFCNT_CONFIG_MODE_MANUAL);
-#endif

 	spin_lock_irqsave(&kbdev->hwcnt.lock, flags);

@@ -138,7 +107,7 @@ int kbase_instr_hwcnt_enable_internal(struct kbase_device *kbdev,

 	err = 0;

-	dev_dbg(kbdev->dev, "HW counters dumping set-up for context %p", kctx);
+	dev_dbg(kbdev->dev, "HW counters dumping set-up for context %pK", kctx);
 	return err;
 out_err:
 	return err;
@@ -148,9 +117,7 @@ int kbase_instr_hwcnt_disable_internal(struct kbase_context *kctx)
 {
 	unsigned long flags, pm_flags;
 	int err = -EINVAL;
-#if !MALI_USE_CSF
 	u32 irq_mask;
-#endif
 	struct kbase_device *kbdev = kctx->kbdev;

 	while (1) {
@@ -185,10 +152,6 @@ int kbase_instr_hwcnt_disable_internal(struct kbase_context *kctx)
 	kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;
 	kbdev->hwcnt.backend.triggered = 0;

-#if MALI_USE_CSF
-	/* Disable the counters */
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_CONFIG), 0);
-#else
 	/* Disable interrupt */
 	irq_mask = kbase_reg_read(kbdev, GPU_CONTROL_REG(GPU_IRQ_MASK));
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_IRQ_MASK),
@@ -196,7 +159,6 @@ int kbase_instr_hwcnt_disable_internal(struct kbase_context *kctx)

 	/* Disable the counters */
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(PRFCNT_CONFIG), 0);
-#endif

 	kbdev->hwcnt.kctx = NULL;
 	kbdev->hwcnt.addr = 0ULL;
@@ -205,11 +167,10 @@ int kbase_instr_hwcnt_disable_internal(struct kbase_context *kctx)
 	spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);

-	dev_dbg(kbdev->dev, "HW counters dumping disabled for context %p",
+	dev_dbg(kbdev->dev, "HW counters dumping disabled for context %pK",
 									kctx);

 	err = 0;
-
 out:
 	return err;
 }
@@ -229,7 +190,8 @@ int kbase_instr_hwcnt_request_dump(struct kbase_context *kctx)

 	if (kbdev->hwcnt.backend.state != KBASE_INSTR_STATE_IDLE) {
 		/* HW counters are disabled or another dump is ongoing, or we're
-		 * resetting */
+		 * resetting
+		 */
 		goto unlock;
 	}

@@ -239,44 +201,26 @@ int kbase_instr_hwcnt_request_dump(struct kbase_context *kctx)
 	 */
 	kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DUMPING;

-
-#if MALI_USE_CSF
-	/* Reconfigure the dump address */
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_BASE_LO),
-					kbdev->hwcnt.addr & 0xFFFFFFFF);
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(PRFCNT_BASE_HI),
-					kbdev->hwcnt.addr >> 32);
-#else
 	/* Reconfigure the dump address */
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(PRFCNT_BASE_LO),
 					kbdev->hwcnt.addr & 0xFFFFFFFF);
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(PRFCNT_BASE_HI),
 					kbdev->hwcnt.addr >> 32);
-#endif

 	/* Start dumping */
 	KBASE_KTRACE_ADD(kbdev, CORE_GPU_PRFCNT_SAMPLE, NULL,
 			kbdev->hwcnt.addr);

-#if MALI_USE_CSF
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(GPU_COMMAND),
-					GPU_COMMAND_PRFCNT_SAMPLE);
-#else
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
 					GPU_COMMAND_PRFCNT_SAMPLE);
-#endif

-	dev_dbg(kbdev->dev, "HW counters dumping done for context %p", kctx);
+	dev_dbg(kbdev->dev, "HW counters dumping done for context %pK", kctx);

 	err = 0;

 unlock:
 	spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);

-#if MALI_USE_CSF
-	tasklet_schedule(&kbdev->hwcnt.backend.csf_hwc_irq_poll_tasklet);
-#endif
-
 	return err;
 }
 KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_request_dump);
@@ -305,86 +249,6 @@ bool kbase_instr_hwcnt_dump_complete(struct kbase_context *kctx,
 }
 KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_dump_complete);

-void kbasep_cache_clean_worker(struct work_struct *data)
-{
-	struct kbase_device *kbdev;
-	unsigned long flags, pm_flags;
-
-	kbdev = container_of(data, struct kbase_device,
-						hwcnt.backend.cache_clean_work);
-
-	spin_lock_irqsave(&kbdev->hwaccess_lock, pm_flags);
-	spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
-
-	/* Clean and invalidate the caches so we're sure the mmu tables for the
-	 * dump buffer is valid.
-	 */
-	KBASE_DEBUG_ASSERT(kbdev->hwcnt.backend.state ==
-					KBASE_INSTR_STATE_REQUEST_CLEAN);
-	kbase_gpu_start_cache_clean_nolock(kbdev);
-	spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
-	spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
-
-	kbase_gpu_wait_cache_clean(kbdev);
-
-	spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
-	KBASE_DEBUG_ASSERT(kbdev->hwcnt.backend.state ==
-					KBASE_INSTR_STATE_REQUEST_CLEAN);
-	/* All finished and idle */
-	kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
-	kbdev->hwcnt.backend.triggered = 1;
-	wake_up(&kbdev->hwcnt.backend.wait);
-
-	spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
-}
-
-#if MALI_USE_CSF
-/**
- * kbasep_hwcnt_irq_poll_tasklet - tasklet to poll MCU IRQ status register
- *
- * @data: tasklet parameter which pointer to kbdev
- *
- * This tasklet poll GPU_IRQ_STATUS register in GPU_CONTROL_MCU page to check
- * PRFCNT_SAMPLE_COMPLETED bit.
- *
- * Tasklet is needed here since work_queue is too slow and cuased some test
- * cases timeout, the poll_count variable is introduced to avoid infinite
- * loop in unexpected cases, the poll_count is 1 or 2 in normal case, 128
- * should be big enough to exit the tasklet in abnormal cases.
- *
- * Return: void
- */
-static void kbasep_hwcnt_irq_poll_tasklet(unsigned long int data)
-{
-	struct kbase_device *kbdev = (struct kbase_device *)data;
-	unsigned long flags, pm_flags;
-	u32 mcu_gpu_irq_raw_status = 0;
-	u32 poll_count = 0;
-
-	while (1) {
-		spin_lock_irqsave(&kbdev->hwaccess_lock, pm_flags);
-		spin_lock_irqsave(&kbdev->hwcnt.lock, flags);
-		mcu_gpu_irq_raw_status = kbase_reg_read(kbdev,
-			GPU_CONTROL_MCU_REG(GPU_IRQ_RAWSTAT));
-		spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
-		spin_unlock_irqrestore(&kbdev->hwaccess_lock, pm_flags);
-		if (mcu_gpu_irq_raw_status & PRFCNT_SAMPLE_COMPLETED) {
-			kbase_reg_write(kbdev,
-				GPU_CONTROL_MCU_REG(GPU_IRQ_CLEAR),
-				PRFCNT_SAMPLE_COMPLETED);
-			kbase_instr_hwcnt_sample_done(kbdev);
-			break;
-		} else if (poll_count++ > 128) {
-			dev_err(kbdev->dev,
-				"Err: HWC dump timeout, count: %u", poll_count);
-			/* Still call sample_done to unblock waiting thread */
-			kbase_instr_hwcnt_sample_done(kbdev);
-			break;
-		}
-	}
-}
-#endif
-
 void kbase_instr_hwcnt_sample_done(struct kbase_device *kbdev)
 {
 	unsigned long flags;
@@ -395,20 +259,10 @@ void kbase_instr_hwcnt_sample_done(struct kbase_device *kbdev)
 		kbdev->hwcnt.backend.triggered = 1;
 		wake_up(&kbdev->hwcnt.backend.wait);
 	} else if (kbdev->hwcnt.backend.state == KBASE_INSTR_STATE_DUMPING) {
-		if (kbdev->mmu_mode->flags & KBASE_MMU_MODE_HAS_NON_CACHEABLE) {
-			/* All finished and idle */
-			kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
-			kbdev->hwcnt.backend.triggered = 1;
-			wake_up(&kbdev->hwcnt.backend.wait);
-		} else {
-			int ret;
-			/* Always clean and invalidate the cache after a successful dump
-			 */
-			kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_REQUEST_CLEAN;
-			ret = queue_work(kbdev->hwcnt.backend.cache_clean_wq,
-						&kbdev->hwcnt.backend.cache_clean_work);
-			KBASE_DEBUG_ASSERT(ret);
-		}
+		/* All finished and idle */
+		kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_IDLE;
+		kbdev->hwcnt.backend.triggered = 1;
+		wake_up(&kbdev->hwcnt.backend.wait);
 	}

 	spin_unlock_irqrestore(&kbdev->hwcnt.lock, flags);
@@ -450,20 +304,16 @@ int kbase_instr_hwcnt_clear(struct kbase_context *kctx)
 	spin_lock_irqsave(&kbdev->hwcnt.lock, flags);

 	/* Check it's the context previously set up and we're not already
-	 * dumping */
+	 * dumping
+	 */
 	if (kbdev->hwcnt.kctx != kctx || kbdev->hwcnt.backend.state !=
 							KBASE_INSTR_STATE_IDLE)
 		goto out;

 	/* Clear the counters */
 	KBASE_KTRACE_ADD(kbdev, CORE_GPU_PRFCNT_CLEAR, NULL, 0);
-#if MALI_USE_CSF
-	kbase_reg_write(kbdev, GPU_CONTROL_MCU_REG(GPU_COMMAND),
-					GPU_COMMAND_PRFCNT_CLEAR);
-#else
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
 						GPU_COMMAND_PRFCNT_CLEAR);
-#endif

 	err = 0;

@@ -475,46 +325,45 @@ KBASE_EXPORT_SYMBOL(kbase_instr_hwcnt_clear);

 int kbase_instr_backend_init(struct kbase_device *kbdev)
 {
-	int ret = 0;
+	spin_lock_init(&kbdev->hwcnt.lock);

 	kbdev->hwcnt.backend.state = KBASE_INSTR_STATE_DISABLED;

 	init_waitqueue_head(&kbdev->hwcnt.backend.wait);
-	INIT_WORK(&kbdev->hwcnt.backend.cache_clean_work,
-						kbasep_cache_clean_worker);
-
-#if MALI_USE_CSF
-	tasklet_init(&kbdev->hwcnt.backend.csf_hwc_irq_poll_tasklet,
-		     kbasep_hwcnt_irq_poll_tasklet, (unsigned long int)kbdev);
-#endif

 	kbdev->hwcnt.backend.triggered = 0;

-#ifdef CONFIG_MALI_PRFCNT_SET_SECONDARY_VIA_DEBUG_FS
-	kbdev->hwcnt.backend.use_secondary_override = false;
+#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
+/* Use the build time option for the override default. */
+#if defined(CONFIG_MALI_BIFROST_PRFCNT_SET_SECONDARY)
+	kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_SET_SECONDARY;
+#elif defined(CONFIG_MALI_PRFCNT_SET_TERTIARY)
+	kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_SET_TERTIARY;
+#else
+	/* Default to primary */
+	kbdev->hwcnt.backend.override_counter_set = KBASE_HWCNT_SET_PRIMARY;
 #endif
-
-	kbdev->hwcnt.backend.cache_clean_wq =
-			alloc_workqueue("Mali cache cleaning workqueue", 0, 1);
-	if (NULL == kbdev->hwcnt.backend.cache_clean_wq)
-		ret = -EINVAL;
-
-	return ret;
+#endif
+	return 0;
 }

 void kbase_instr_backend_term(struct kbase_device *kbdev)
 {
-#if MALI_USE_CSF
-	tasklet_kill(&kbdev->hwcnt.backend.csf_hwc_irq_poll_tasklet);
-#endif
-	destroy_workqueue(kbdev->hwcnt.backend.cache_clean_wq);
+	CSTD_UNUSED(kbdev);
 }

-#ifdef CONFIG_MALI_PRFCNT_SET_SECONDARY_VIA_DEBUG_FS
+#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
 void kbase_instr_backend_debugfs_init(struct kbase_device *kbdev)
 {
-	debugfs_create_bool("hwcnt_use_secondary", S_IRUGO | S_IWUSR,
-		kbdev->mali_debugfs_directory,
-		&kbdev->hwcnt.backend.use_secondary_override);
+	/* No validation is done on the debugfs input. Invalid input could cause
+	 * performance counter errors. This is acceptable since this is a debug
+	 * only feature and users should know what they are doing.
+	 *
+	 * Valid inputs are the values accepted bythe SET_SELECT bits of the
+	 * PRFCNT_CONFIG register as defined in the architecture specification.
+	*/
+	debugfs_create_u8("hwcnt_set_select", S_IRUGO | S_IWUSR,
+			  kbdev->mali_debugfs_directory,
+			  (u8 *)&kbdev->hwcnt.backend.override_counter_set);
 }
 #endif
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_instr_defs.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_instr_defs.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014, 2016, 2018, 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014, 2016, 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -27,6 +26,8 @@
 #ifndef _KBASE_INSTR_DEFS_H_
 #define _KBASE_INSTR_DEFS_H_

+#include "../../mali_kbase_hwcnt_gpu.h"
+
 /*
 * Instrumentation State Machine States
 */
@@ -37,8 +38,6 @@ enum kbase_instr_state {
 	KBASE_INSTR_STATE_IDLE,
 	/* Hardware is currently dumping a frame. */
 	KBASE_INSTR_STATE_DUMPING,
-	/* We've requested a clean to occur on a workqueue */
-	KBASE_INSTR_STATE_REQUEST_CLEAN,
 	/* An error has occured during DUMPING (page fault). */
 	KBASE_INSTR_STATE_FAULT
 };
@@ -47,17 +46,11 @@ enum kbase_instr_state {
 struct kbase_instr_backend {
 	wait_queue_head_t wait;
 	int triggered;
-#ifdef CONFIG_MALI_PRFCNT_SET_SECONDARY_VIA_DEBUG_FS
-	bool use_secondary_override;
+#ifdef CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS
+	enum kbase_hwcnt_physical_set override_counter_set;
 #endif

 	enum kbase_instr_state state;
-	struct workqueue_struct *cache_clean_wq;
-	struct work_struct  cache_clean_work;
-#if MALI_USE_CSF
-	struct tasklet_struct csf_hwc_irq_poll_tasklet;
-#endif
 };

 #endif /* _KBASE_INSTR_DEFS_H_ */
-
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_instr_internal.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_instr_internal.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014, 2018 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014, 2018, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * Backend-specific HW access instrumentation APIs
 */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_irq_internal.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_irq_internal.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014-2015 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2015, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_irq_linux.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_irq_linux.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
 * (C) COPYRIGHT 2014-2016,2018-2020 ARM Limited. All rights reserved.
@@ -5,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <mali_kbase.h>
@@ -215,20 +214,21 @@ int kbase_set_custom_irq_handler(struct kbase_device *kbdev,
 	int result = 0;
 	irq_handler_t requested_irq_handler = NULL;

-	KBASE_DEBUG_ASSERT((JOB_IRQ_HANDLER <= irq_type) &&
-						(GPU_IRQ_HANDLER >= irq_type));
+	KBASE_DEBUG_ASSERT((irq_type >= JOB_IRQ_HANDLER) &&
+			   (irq_type <= GPU_IRQ_HANDLER));

 	/* Release previous handler */
 	if (kbdev->irqs[irq_type].irq)
 		free_irq(kbdev->irqs[irq_type].irq, kbase_tag(kbdev, irq_type));

-	requested_irq_handler = (NULL != custom_handler) ? custom_handler :
-						kbase_handler_table[irq_type];
+	requested_irq_handler = (custom_handler != NULL) ?
+					custom_handler :
+					kbase_handler_table[irq_type];

-	if (0 != request_irq(kbdev->irqs[irq_type].irq,
-			requested_irq_handler,
+	if (request_irq(kbdev->irqs[irq_type].irq, requested_irq_handler,
 			kbdev->irqs[irq_type].flags | IRQF_SHARED,
-			dev_name(kbdev->dev), kbase_tag(kbdev, irq_type))) {
+			dev_name(kbdev->dev),
+			kbase_tag(kbdev, irq_type)) != 0) {
 		result = -EINVAL;
 		dev_err(kbdev->dev, "Can't request interrupt %d (index %d)\n",
 					kbdev->irqs[irq_type].irq, irq_type);
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_as.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_as.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
 * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
@@ -5,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 /*
 * Register backend context / address space management
 */
@@ -190,8 +188,8 @@ int kbase_backend_find_and_release_free_address_space(
 			}

 			/* Context was retained while locks were dropped,
-			 * continue looking for free AS */
-
+			 * continue looking for free AS
+			 */
 			mutex_unlock(&js_devdata->runpool_mutex);
 			mutex_unlock(&as_js_kctx_info->ctx.jsctx_mutex);

--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_defs.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_defs.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014-2016, 2018-2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 /*
 * Register-based HW access backend specific definitions
 */
@@ -78,9 +76,8 @@ struct slot_rb {
 * The hwaccess_lock (a spinlock) must be held when accessing this structure
 */
 struct kbase_backend_data {
-	struct slot_rb slot_rb[BASE_JM_MAX_NR_SLOTS];
-
 #if !MALI_USE_CSF
+	struct slot_rb slot_rb[BASE_JM_MAX_NR_SLOTS];
 	struct hrtimer scheduling_timer;

 	bool timer_running;
@@ -94,13 +91,16 @@ struct kbase_backend_data {
 /* kbase_prepare_to_reset_gpu has been called */
 #define KBASE_RESET_GPU_PREPARED        1
 /* kbase_reset_gpu has been called - the reset will now definitely happen
- * within the timeout period */
+ * within the timeout period
+ */
 #define KBASE_RESET_GPU_COMMITTED       2
 /* The GPU reset process is currently occuring (timeout has expired or
- * kbasep_try_reset_gpu_early was called) */
+ * kbasep_try_reset_gpu_early was called)
+ */
 #define KBASE_RESET_GPU_HAPPENING       3
 /* Reset the GPU silently, used when resetting the GPU as part of normal
- * behavior (e.g. when exiting protected mode). */
+ * behavior (e.g. when exiting protected mode).
+ */
 #define KBASE_RESET_GPU_SILENT          4
 	struct workqueue_struct *reset_workq;
 	struct work_struct reset_work;
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_hw.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_hw.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2010-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2010-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -41,10 +40,12 @@
 #include <mali_kbase_regs_history_debugfs.h>

 static void kbasep_try_reset_gpu_early_locked(struct kbase_device *kbdev);
+static u64 kbasep_apply_limited_core_mask(const struct kbase_device *kbdev,
+				const u64 affinity, const u64 limited_core_mask);

 static u64 kbase_job_write_affinity(struct kbase_device *kbdev,
 				base_jd_core_req core_req,
-				int js)
+				int js, const u64 limited_core_mask)
 {
 	u64 affinity;

@@ -73,14 +74,21 @@ static u64 kbase_job_write_affinity(struct kbase_device *kbdev,
 		 */
 		if (js == 2 && num_core_groups > 1)
 			affinity &= coherency_info->group[1].core_mask;
-		else
+		else if (num_core_groups > 1)
 			affinity &= coherency_info->group[0].core_mask;
+		else
+			affinity &= kbdev->gpu_props.curr_config.shader_present;
 	} else {
 		/* Use all cores */
 		affinity = kbdev->pm.backend.shaders_avail &
 				kbdev->pm.debug_core_mask[js];
 	}

+	if (core_req & BASE_JD_REQ_LIMITED_CORE_MASK) {
+		/* Limiting affinity due to BASE_JD_REQ_LIMITED_CORE_MASK by applying the limited core mask. */
+		affinity = kbasep_apply_limited_core_mask(kbdev, affinity, limited_core_mask);
+	}
+
 	if (unlikely(!affinity)) {
 #ifdef CONFIG_MALI_BIFROST_DEBUG
 		u64 shaders_ready =
@@ -90,6 +98,16 @@ static u64 kbase_job_write_affinity(struct kbase_device *kbdev,
 #endif

 		affinity = kbdev->pm.backend.shaders_avail;
+
+		if (core_req & BASE_JD_REQ_LIMITED_CORE_MASK) {
+			/* Limiting affinity again to make sure it only enables shader cores with backed TLS memory. */
+			affinity = kbasep_apply_limited_core_mask(kbdev, affinity, limited_core_mask);
+
+#ifdef CONFIG_MALI_BIFROST_DEBUG
+			/* affinity should never be 0 */
+			WARN_ON(!affinity);
+#endif
+		}
 	}

 	kbase_reg_write(kbdev, JOB_SLOT_REG(js, JS_AFFINITY_NEXT_LO),
@@ -170,7 +188,7 @@ static u64 select_job_chain(struct kbase_jd_atom *katom)
 	}

 	dev_dbg(kctx->kbdev->dev,
-		"Selected job chain 0x%llx for end atom %p in state %d\n",
+		"Selected job chain 0x%llx for end atom %pK in state %d\n",
 		jc, (void *)katom, (int)rp->state);

 	katom->jc = jc;
@@ -194,7 +212,7 @@ void kbase_job_hw_submit(struct kbase_device *kbdev,
 	/* Command register must be available */
 	KBASE_DEBUG_ASSERT(kbasep_jm_is_js_free(kbdev, js, kctx));

-	dev_dbg(kctx->kbdev->dev, "Write JS_HEAD_NEXT 0x%llx for atom %p\n",
+	dev_dbg(kctx->kbdev->dev, "Write JS_HEAD_NEXT 0x%llx for atom %pK\n",
 		jc_head, (void *)katom);

 	kbase_reg_write(kbdev, JOB_SLOT_REG(js, JS_HEAD_NEXT_LO),
@@ -202,10 +220,12 @@ void kbase_job_hw_submit(struct kbase_device *kbdev,
 	kbase_reg_write(kbdev, JOB_SLOT_REG(js, JS_HEAD_NEXT_HI),
 						jc_head >> 32);

-	affinity = kbase_job_write_affinity(kbdev, katom->core_req, js);
+	affinity = kbase_job_write_affinity(kbdev, katom->core_req, js,
+						kctx->limited_core_mask);

 	/* start MMU, medium priority, cache clean/flush on end, clean/flush on
-	 * start */
+	 * start
+	 */
 	cfg = kctx->as_nr;

 	if (kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_FLUSH_REDUCTION) &&
@@ -257,7 +277,7 @@ void kbase_job_hw_submit(struct kbase_device *kbdev,
 	katom->start_timestamp = ktime_get();

 	/* GO ! */
-	dev_dbg(kbdev->dev, "JS: Submitting atom %p from ctx %p to js[%d] with head=0x%llx",
+	dev_dbg(kbdev->dev, "JS: Submitting atom %pK from ctx %pK to js[%d] with head=0x%llx",
 				katom, kctx, js, jc_head);

 	KBASE_KTRACE_ADD_JM_SLOT_INFO(kbdev, JM_SUBMIT, kctx, katom, jc_head, js,
@@ -331,7 +351,8 @@ static void kbasep_job_slot_update_head_start_timestamp(
 			/* Only update the timestamp if it's a better estimate
 			 * than what's currently stored. This is because our
 			 * estimate that accounts for the throttle time may be
-			 * too much of an overestimate */
+			 * too much of an overestimate
+			 */
 			katom->start_timestamp = end_timestamp;
 		}
 	}
@@ -374,9 +395,9 @@ void kbase_job_done(struct kbase_device *kbdev, u32 done)
 		/* treat failed slots as finished slots */
 		u32 finished = (done & 0xFFFF) | failed;

-		/* Note: This is inherently unfair, as we always check
-		 * for lower numbered interrupts before the higher
-		 * numbered ones.*/
+		/* Note: This is inherently unfair, as we always check for lower
+		 * numbered interrupts before the higher numbered ones.
+		 */
 		i = ffs(finished) - 1;
 		KBASE_DEBUG_ASSERT(i >= 0);

@@ -388,7 +409,8 @@ void kbase_job_done(struct kbase_device *kbdev, u32 done)

 			if (failed & (1u << i)) {
 				/* read out the job slot status code if the job
-				 * slot reported failure */
+				 * slot reported failure
+				 */
 				completion_code = kbase_reg_read(kbdev,
 					JOB_SLOT_REG(i, JS_STATUS));

@@ -402,7 +424,8 @@ void kbase_job_done(struct kbase_device *kbdev, u32 done)

 					/* Soft-stopped job - read the value of
 					 * JS<n>_TAIL so that the job chain can
-					 * be resumed */
+					 * be resumed
+					 */
 					job_tail = (u64)kbase_reg_read(kbdev,
 						JOB_SLOT_REG(i, JS_TAIL_LO)) |
 						((u64)kbase_reg_read(kbdev,
@@ -411,21 +434,26 @@ void kbase_job_done(struct kbase_device *kbdev, u32 done)
 				} else if (completion_code ==
 						BASE_JD_EVENT_NOT_STARTED) {
 					/* PRLAM-10673 can cause a TERMINATED
-					 * job to come back as NOT_STARTED, but
-					 * the error interrupt helps us detect
-					 * it */
+					 * job to come back as NOT_STARTED,
+					 * but the error interrupt helps us
+					 * detect it
+					 */
 					completion_code =
 						BASE_JD_EVENT_TERMINATED;
 				}

 				kbase_gpu_irq_evict(kbdev, i, completion_code);

-				/* Some jobs that encounter a BUS FAULT may result in corrupted
-				 * state causing future jobs to hang. Reset GPU before
-				 * allowing any other jobs on the slot to continue. */
+				/* Some jobs that encounter a BUS FAULT may
+				 * result in corrupted state causing future
+				 * jobs to hang. Reset GPU before allowing
+				 * any other jobs on the slot to continue.
+				 */
 				if (kbase_hw_has_issue(kbdev, BASE_HW_ISSUE_TTRX_3076)) {
 					if (completion_code == BASE_JD_EVENT_JOB_BUS_FAULT) {
-						if (kbase_prepare_to_reset_gpu_locked(kbdev))
+						if (kbase_prepare_to_reset_gpu_locked(
+							    kbdev,
+							    RESET_FLAGS_NONE))
 							kbase_reset_gpu_locked(kbdev);
 					}
 				}
@@ -483,7 +511,8 @@ void kbase_job_done(struct kbase_device *kbdev, u32 done)

 				if ((rawstat >> (i + 16)) & 1) {
 					/* There is a failed job that we've
-					 * missed - add it back to active */
+					 * missed - add it back to active
+					 */
 					active |= (1u << i);
 				}
 			}
@@ -585,7 +614,8 @@ void kbasep_job_slot_soft_or_hard_stop_do_action(struct kbase_device *kbdev,
 		}

 		/* We are about to issue a soft stop, so mark the atom as having
-		 * been soft stopped */
+		 * been soft stopped
+		 */
 		target_katom->atom_flags |= KBASE_KATOM_FLAG_BEEN_SOFT_STOPPED;

 		/* Mark the point where we issue the soft-stop command */
@@ -781,7 +811,7 @@ static int softstop_start_rp_nolock(

 	if (!(katom->core_req & BASE_JD_REQ_START_RENDERPASS)) {
 		dev_dbg(kctx->kbdev->dev,
-			"Atom %p on job slot is not start RP\n", (void *)katom);
+			"Atom %pK on job slot is not start RP\n", (void *)katom);
 		return -EPERM;
 	}

@@ -794,13 +824,13 @@ static int softstop_start_rp_nolock(
 		rp->state != KBASE_JD_RP_RETRY))
 		return -EINVAL;

-	dev_dbg(kctx->kbdev->dev, "OOM in state %d with region %p\n",
+	dev_dbg(kctx->kbdev->dev, "OOM in state %d with region %pK\n",
 		(int)rp->state, (void *)reg);

 	if (WARN_ON(katom != rp->start_katom))
 		return -EINVAL;

-	dev_dbg(kctx->kbdev->dev, "Adding region %p to list %p\n",
+	dev_dbg(kctx->kbdev->dev, "Adding region %pK to list %pK\n",
 		(void *)reg, (void *)&rp->oom_reg_list);
 	list_move_tail(&reg->link, &rp->oom_reg_list);
 	dev_dbg(kctx->kbdev->dev, "Added region to list\n");
@@ -845,9 +875,9 @@ void kbase_jm_wait_for_zero_jobs(struct kbase_context *kctx)
 	if (timeout != 0)
 		goto exit;

-	if (kbase_prepare_to_reset_gpu(kbdev)) {
+	if (kbase_prepare_to_reset_gpu(kbdev, RESET_FLAGS_NONE)) {
 		dev_err(kbdev->dev,
-			"Issueing GPU soft-reset because jobs failed to be killed (within %d ms) as part of context termination (e.g. process exit)\n",
+			"Issuing GPU soft-reset because jobs failed to be killed (within %d ms) as part of context termination (e.g. process exit)\n",
 			ZAP_TIMEOUT);
 		kbase_reset_gpu(kbdev);
 	}
@@ -855,7 +885,7 @@ void kbase_jm_wait_for_zero_jobs(struct kbase_context *kctx)
 	/* Wait for the reset to complete */
 	kbase_reset_gpu_wait(kbdev);
 exit:
-	dev_dbg(kbdev->dev, "Zap: Finished Context %p", kctx);
+	dev_dbg(kbdev->dev, "Zap: Finished Context %pK", kctx);

 	/* Ensure that the signallers of the waitqs have finished */
 	mutex_lock(&kctx->jctx.lock);
@@ -916,7 +946,7 @@ KBASE_EXPORT_TEST_API(kbase_job_slot_term);
 void kbase_job_slot_softstop_swflags(struct kbase_device *kbdev, int js,
 			struct kbase_jd_atom *target_katom, u32 sw_flags)
 {
-	dev_dbg(kbdev->dev, "Soft-stop atom %p with flags 0x%x (s:%d)\n",
+	dev_dbg(kbdev->dev, "Soft-stop atom %pK with flags 0x%x (s:%d)\n",
 		target_katom, sw_flags, js);

 	KBASE_DEBUG_ASSERT(!(sw_flags & JS_COMMAND_MASK));
@@ -1020,6 +1050,33 @@ void kbase_job_check_leave_disjoint(struct kbase_device *kbdev,
 	}
 }

+int kbase_reset_gpu_prevent_and_wait(struct kbase_device *kbdev)
+{
+	WARN(true, "%s Not implemented for JM GPUs", __func__);
+	return -EINVAL;
+}
+
+int kbase_reset_gpu_try_prevent(struct kbase_device *kbdev)
+{
+	WARN(true, "%s Not implemented for JM GPUs", __func__);
+	return -EINVAL;
+}
+
+void kbase_reset_gpu_allow(struct kbase_device *kbdev)
+{
+	WARN(true, "%s Not implemented for JM GPUs", __func__);
+}
+
+void kbase_reset_gpu_assert_prevented(struct kbase_device *kbdev)
+{
+	WARN(true, "%s Not implemented for JM GPUs", __func__);
+}
+
+void kbase_reset_gpu_assert_failed_or_prevented(struct kbase_device *kbdev)
+{
+	WARN(true, "%s Not implemented for JM GPUs", __func__);
+}
+
 static void kbase_debug_dump_registers(struct kbase_device *kbdev)
 {
 	int i;
@@ -1086,13 +1143,15 @@ static void kbasep_reset_timeout_worker(struct work_struct *data)

 	/* Make sure the timer has completed - this cannot be done from
 	 * interrupt context, so this cannot be done within
-	 * kbasep_try_reset_gpu_early. */
+	 * kbasep_try_reset_gpu_early.
+	 */
 	hrtimer_cancel(&kbdev->hwaccess.backend.reset_timer);

 	if (kbase_pm_context_active_handle_suspend(kbdev,
 				KBASE_PM_SUSPEND_HANDLER_DONT_REACTIVATE)) {
 		/* This would re-activate the GPU. Since it's already idle,
-		 * there's no need to reset it */
+		 * there's no need to reset it
+		 */
 		atomic_set(&kbdev->hwaccess.backend.reset_gpu,
 						KBASE_RESET_GPU_NOT_PENDING);
 		kbase_disjoint_state_down(kbdev);
@@ -1113,14 +1172,16 @@ static void kbasep_reset_timeout_worker(struct work_struct *data)
 	kbdev->irq_reset_flush = true;

 	/* Disable IRQ to avoid IRQ handlers to kick in after releasing the
-	 * spinlock; this also clears any outstanding interrupts */
+	 * spinlock; this also clears any outstanding interrupts
+	 */
 	kbase_pm_disable_interrupts_nolock(kbdev);

 	spin_unlock(&kbdev->mmu_mask_change);
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);

 	/* Ensure that any IRQ handlers have finished
-	 * Must be done without any locks IRQ handlers will take */
+	 * Must be done without any locks IRQ handlers will take
+	 */
 	kbase_synchronize_irqs(kbdev);

 	/* Flush out any in-flight work items */
@@ -1131,7 +1192,8 @@ static void kbasep_reset_timeout_worker(struct work_struct *data)

 	if (kbase_hw_has_issue(kbdev, BASE_HW_ISSUE_TMIX_8463)) {
 		/* Ensure that L2 is not transitioning when we send the reset
-		 * command */
+		 * command
+		 */
 		while (--max_loops && kbase_pm_get_trans_cores(kbdev,
 				KBASE_PM_CORE_L2))
 			;
@@ -1146,14 +1208,16 @@ static void kbasep_reset_timeout_worker(struct work_struct *data)
 	/* All slot have been soft-stopped and we've waited
 	 * SOFT_STOP_RESET_TIMEOUT for the slots to clear, at this point we
 	 * assume that anything that is still left on the GPU is stuck there and
-	 * we'll kill it when we reset the GPU */
+	 * we'll kill it when we reset the GPU
+	 */

 	if (!silent)
 		dev_err(kbdev->dev, "Resetting GPU (allowing up to %d ms)",
 								RESET_TIMEOUT);

 	/* Output the state of some interesting registers to help in the
-	 * debugging of GPU resets */
+	 * debugging of GPU resets
+	 */
 	if (!silent)
 		kbase_debug_dump_registers(kbdev);

@@ -1192,7 +1256,8 @@ static void kbasep_reset_timeout_worker(struct work_struct *data)
 	kbase_pm_update_cores_state(kbdev);

 	/* Synchronously request and wait for those cores, because if
-	 * instrumentation is enabled it would need them immediately. */
+	 * instrumentation is enabled it would need them immediately.
+	 */
 	kbase_pm_wait_for_desired_state(kbdev);

 	mutex_unlock(&kbdev->pm.lock);
@@ -1269,7 +1334,8 @@ static void kbasep_try_reset_gpu_early_locked(struct kbase_device *kbdev)

 	/* Check that the reset has been committed to (i.e. kbase_reset_gpu has
 	 * been called), and that no other thread beat this thread to starting
-	 * the reset */
+	 * the reset
+	 */
 	if (atomic_cmpxchg(&kbdev->hwaccess.backend.reset_gpu,
 			KBASE_RESET_GPU_COMMITTED, KBASE_RESET_GPU_HAPPENING) !=
 						KBASE_RESET_GPU_COMMITTED) {
@@ -1293,6 +1359,7 @@ static void kbasep_try_reset_gpu_early(struct kbase_device *kbdev)
 /**
 * kbase_prepare_to_reset_gpu_locked - Prepare for resetting the GPU
 * @kbdev: kbase device
+ * @flags: Bitfield indicating impact of reset (see flag defines)
 *
 * This function just soft-stops all the slots to ensure that as many jobs as
 * possible are saved.
@@ -1303,10 +1370,12 @@ static void kbasep_try_reset_gpu_early(struct kbase_device *kbdev)
 *   false - Another thread is performing a reset, kbase_reset_gpu should
 *   not be called.
 */
-bool kbase_prepare_to_reset_gpu_locked(struct kbase_device *kbdev)
+bool kbase_prepare_to_reset_gpu_locked(struct kbase_device *kbdev,
+				       unsigned int flags)
 {
 	int i;

+	CSTD_UNUSED(flags);
 	KBASE_DEBUG_ASSERT(kbdev);

 #ifdef CONFIG_MALI_ARBITER_SUPPORT
@@ -1334,14 +1403,14 @@ bool kbase_prepare_to_reset_gpu_locked(struct kbase_device *kbdev)
 	return true;
 }

-bool kbase_prepare_to_reset_gpu(struct kbase_device *kbdev)
+bool kbase_prepare_to_reset_gpu(struct kbase_device *kbdev, unsigned int flags)
 {
-	unsigned long flags;
+	unsigned long lock_flags;
 	bool ret;

-	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
-	ret = kbase_prepare_to_reset_gpu_locked(kbdev);
-	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+	spin_lock_irqsave(&kbdev->hwaccess_lock, lock_flags);
+	ret = kbase_prepare_to_reset_gpu_locked(kbdev, flags);
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, lock_flags);

 	return ret;
 }
@@ -1362,7 +1431,8 @@ void kbase_reset_gpu(struct kbase_device *kbdev)
 	KBASE_DEBUG_ASSERT(kbdev);

 	/* Note this is an assert/atomic_set because it is a software issue for
-	 * a race to be occuring here */
+	 * a race to be occurring here
+	 */
 	KBASE_DEBUG_ASSERT(atomic_read(&kbdev->hwaccess.backend.reset_gpu) ==
 						KBASE_RESET_GPU_PREPARED);
 	atomic_set(&kbdev->hwaccess.backend.reset_gpu,
@@ -1385,7 +1455,8 @@ void kbase_reset_gpu_locked(struct kbase_device *kbdev)
 	KBASE_DEBUG_ASSERT(kbdev);

 	/* Note this is an assert/atomic_set because it is a software issue for
-	 * a race to be occuring here */
+	 * a race to be occurring here
+	 */
 	KBASE_DEBUG_ASSERT(atomic_read(&kbdev->hwaccess.backend.reset_gpu) ==
 						KBASE_RESET_GPU_PREPARED);
 	atomic_set(&kbdev->hwaccess.backend.reset_gpu,
@@ -1460,3 +1531,21 @@ void kbase_reset_gpu_term(struct kbase_device *kbdev)
 {
 	destroy_workqueue(kbdev->hwaccess.backend.reset_workq);
 }
+
+static u64 kbasep_apply_limited_core_mask(const struct kbase_device *kbdev,
+				const u64 affinity, const u64 limited_core_mask)
+{
+	const u64 result = affinity & limited_core_mask;
+
+#ifdef CONFIG_MALI_BIFROST_DEBUG
+	dev_dbg(kbdev->dev,
+				"Limiting affinity due to BASE_JD_REQ_LIMITED_CORE_MASK from 0x%lx to 0x%lx (mask is 0x%lx)\n",
+				(unsigned long int)affinity,
+				(unsigned long int)result,
+				(unsigned long int)limited_core_mask);
+#else
+	CSTD_UNUSED(kbdev);
+#endif
+
+	return result;
+}
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_internal.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_internal.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
 * (C) COPYRIGHT 2011-2016, 2018-2020 ARM Limited. All rights reserved.
@@ -5,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * Job Manager backend-specific low-level APIs.
 */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_rb.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_rb.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 /*
 * Register-based HW access backend specific APIs
 */
@@ -40,10 +38,12 @@
 #include <backend/gpu/mali_kbase_pm_internal.h>

 /* Return whether the specified ringbuffer is empty. HW access lock must be
- * held */
+ * held
+ */
 #define SLOT_RB_EMPTY(rb)   (rb->write_idx == rb->read_idx)
 /* Return number of atoms currently in the specified ringbuffer. HW access lock
- * must be held */
+ * must be held
+ */
 #define SLOT_RB_ENTRIES(rb) (int)(s8)(rb->write_idx - rb->read_idx)

 static void kbase_gpu_release_atom(struct kbase_device *kbdev,
@@ -284,7 +284,8 @@ static void kbase_gpu_release_atom(struct kbase_device *kbdev,
 		kbase_kinstr_jm_atom_hw_release(katom);
 		/* Inform power management at start/finish of atom so it can
 		 * update its GPU utilisation metrics. Mark atom as not
-		 * submitted beforehand. */
+		 * submitted beforehand.
+		 */
 		katom->gpu_rb_state = KBASE_ATOM_GPU_RB_READY;
 		kbase_pm_metrics_update(kbdev, end_timestamp);

@@ -544,7 +545,8 @@ static int kbase_jm_enter_protected_mode(struct kbase_device *kbdev,
 		KBASE_TLSTREAM_AUX_PROTECTED_ENTER_START(kbdev, kbdev);
 		/* The checks in KBASE_ATOM_GPU_RB_WAITING_PROTECTED_MODE_PREV
 		 * should ensure that we are not already transitiong, and that
-		 * there are no atoms currently on the GPU. */
+		 * there are no atoms currently on the GPU.
+		 */
 		WARN_ON(kbdev->protected_mode_transition);
 		WARN_ON(kbase_gpu_atoms_submitted_any(kbdev));
 		/* If hwcnt is disabled, it means we didn't clean up correctly
@@ -570,19 +572,15 @@ static int kbase_jm_enter_protected_mode(struct kbase_device *kbdev,

 		/* We couldn't disable atomically, so kick off a worker */
 		if (!kbdev->protected_mode_hwcnt_disabled) {
-#if KERNEL_VERSION(3, 16, 0) > LINUX_VERSION_CODE
-			queue_work(system_wq,
+			kbase_hwcnt_context_queue_work(
+				kbdev->hwcnt_gpu_ctx,
 				&kbdev->protected_mode_hwcnt_disable_work);
-#else
-			queue_work(system_highpri_wq,
-				&kbdev->protected_mode_hwcnt_disable_work);
-#endif
 			return -EAGAIN;
 		}

-		/* Once reaching this point GPU must be
-		 * switched to protected mode or hwcnt
-		 * re-enabled. */
+		/* Once reaching this point GPU must be switched to protected
+		 * mode or hwcnt re-enabled.
+		 */

 		if (kbase_pm_protected_entry_override_enable(kbdev))
 			return -EAGAIN;
@@ -722,7 +720,8 @@ static int kbase_jm_exit_protected_mode(struct kbase_device *kbdev,
 		KBASE_TLSTREAM_AUX_PROTECTED_LEAVE_START(kbdev, kbdev);
 		/* The checks in KBASE_ATOM_GPU_RB_WAITING_PROTECTED_MODE_PREV
 		 * should ensure that we are not already transitiong, and that
-		 * there are no atoms currently on the GPU. */
+		 * there are no atoms currently on the GPU.
+		 */
 		WARN_ON(kbdev->protected_mode_transition);
 		WARN_ON(kbase_gpu_atoms_submitted_any(kbdev));

@@ -768,8 +767,8 @@ static int kbase_jm_exit_protected_mode(struct kbase_device *kbdev,
 			katom[idx]->event_code = BASE_JD_EVENT_JOB_INVALID;
 			kbase_gpu_mark_atom_for_return(kbdev, katom[idx]);
 			/* Only return if head atom or previous atom
-			 * already removed - as atoms must be returned
-			 * in order */
+			 * already removed - as atoms must be returned in order
+			 */
 			if (idx == 0 || katom[0]->gpu_rb_state ==
 					KBASE_ATOM_GPU_RB_NOT_IN_SLOT_RB) {
 				kbase_gpu_dequeue_atom(kbdev, js, NULL);
@@ -912,12 +911,14 @@ void kbase_backend_slot_update(struct kbase_device *kbdev)
 					kbase_gpu_mark_atom_for_return(kbdev,
 							katom[idx]);
 					/* Set EVENT_DONE so this atom will be
-					   completed, not unpulled. */
+					 * completed, not unpulled.
+					 */
 					katom[idx]->event_code =
 						BASE_JD_EVENT_DONE;
 					/* Only return if head atom or previous
 					 * atom already removed - as atoms must
-					 * be returned in order. */
+					 * be returned in order.
+					 */
 					if (idx == 0 ||	katom[0]->gpu_rb_state ==
 							KBASE_ATOM_GPU_RB_NOT_IN_SLOT_RB) {
 						kbase_gpu_dequeue_atom(kbdev, js, NULL);
@@ -948,7 +949,8 @@ void kbase_backend_slot_update(struct kbase_device *kbdev)

 				if (idx == 1) {
 					/* Only submit if head atom or previous
-					 * atom already submitted */
+					 * atom already submitted
+					 */
 					if ((katom[0]->gpu_rb_state !=
 						KBASE_ATOM_GPU_RB_SUBMITTED &&
 						katom[0]->gpu_rb_state !=
@@ -964,7 +966,8 @@ void kbase_backend_slot_update(struct kbase_device *kbdev)
 				}

 				/* If inter-slot serialization in use then don't
-				 * submit atom if any other slots are in use */
+				 * submit atom if any other slots are in use
+				 */
 				if ((kbdev->serialize_jobs &
 						KBASE_SERIALIZE_INTER_SLOT) &&
 						other_slots_busy(kbdev, js))
@@ -976,7 +979,8 @@ void kbase_backend_slot_update(struct kbase_device *kbdev)
 					break;
 #endif
 				/* Check if this job needs the cycle counter
-				 * enabled before submission */
+				 * enabled before submission
+				 */
 				if (katom[idx]->core_req & BASE_JD_REQ_PERMON)
 					kbase_pm_request_gpu_cycle_counter_l2_is_on(
 									kbdev);
@@ -987,7 +991,8 @@ void kbase_backend_slot_update(struct kbase_device *kbdev)

 				/* Inform power management at start/finish of
 				 * atom so it can update its GPU utilisation
-				 * metrics. */
+				 * metrics.
+				 */
 				kbase_pm_metrics_update(kbdev,
 						&katom[idx]->start_timestamp);

@@ -1000,7 +1005,8 @@ void kbase_backend_slot_update(struct kbase_device *kbdev)
 			case KBASE_ATOM_GPU_RB_RETURN_TO_JS:
 				/* Only return if head atom or previous atom
 				 * already removed - as atoms must be returned
-				 * in order */
+				 * in order
+				 */
 				if (idx == 0 || katom[0]->gpu_rb_state ==
 					KBASE_ATOM_GPU_RB_NOT_IN_SLOT_RB) {
 					kbase_gpu_dequeue_atom(kbdev, js, NULL);
@@ -1018,7 +1024,7 @@ void kbase_backend_run_atom(struct kbase_device *kbdev,
 				struct kbase_jd_atom *katom)
 {
 	lockdep_assert_held(&kbdev->hwaccess_lock);
-	dev_dbg(kbdev->dev, "Backend running atom %p\n", (void *)katom);
+	dev_dbg(kbdev->dev, "Backend running atom %pK\n", (void *)katom);

 	kbase_gpu_enqueue_atom(kbdev, katom);
 	kbase_backend_slot_update(kbdev);
@@ -1079,7 +1085,7 @@ void kbase_gpu_complete_hw(struct kbase_device *kbdev, int js,
 	struct kbase_context *kctx = katom->kctx;

 	dev_dbg(kbdev->dev,
-		"Atom %p completed on hw with code 0x%x and job_tail 0x%llx (s:%d)\n",
+		"Atom %pK completed on hw with code 0x%x and job_tail 0x%llx (s:%d)\n",
 		(void *)katom, completion_code, job_tail, js);

 	lockdep_assert_held(&kbdev->hwaccess_lock);
@@ -1103,7 +1109,8 @@ void kbase_gpu_complete_hw(struct kbase_device *kbdev, int js,
 		 * BASE_JD_REQ_SKIP_CACHE_END is set, the GPU cache is not
 		 * flushed. To prevent future evictions causing possible memory
 		 * corruption we need to flush the cache manually before any
-		 * affected memory gets reused. */
+		 * affected memory gets reused.
+		 */
 		katom->need_cache_flush_cores_retained = true;
 	}

@@ -1184,7 +1191,8 @@ void kbase_gpu_complete_hw(struct kbase_device *kbdev, int js,
 					katom_idx1->gpu_rb_state !=
 					KBASE_ATOM_GPU_RB_SUBMITTED) {
 				/* Can not dequeue this atom yet - will be
-				 * dequeued when atom at idx0 completes */
+				 * dequeued when atom at idx0 completes
+				 */
 				katom_idx1->event_code = BASE_JD_EVENT_STOPPED;
 				kbase_gpu_mark_atom_for_return(kbdev,
 								katom_idx1);
@@ -1197,7 +1205,7 @@ void kbase_gpu_complete_hw(struct kbase_device *kbdev, int js,
 	if (job_tail != 0 && job_tail != katom->jc) {
 		/* Some of the job has been executed */
 		dev_dbg(kbdev->dev,
-			"Update job chain address of atom %p to resume from 0x%llx\n",
+			"Update job chain address of atom %pK to resume from 0x%llx\n",
 			(void *)katom, job_tail);

 		katom->jc = job_tail;
@@ -1258,7 +1266,7 @@ void kbase_gpu_complete_hw(struct kbase_device *kbdev, int js,

 	if (katom) {
 		dev_dbg(kbdev->dev,
-			"Cross-slot dependency %p has become runnable.\n",
+			"Cross-slot dependency %pK has become runnable.\n",
 			(void *)katom);

 		/* Check if there are lower priority jobs to soft stop */
@@ -1271,7 +1279,8 @@ void kbase_gpu_complete_hw(struct kbase_device *kbdev, int js,
 	kbase_pm_update_state(kbdev);

 	/* Job completion may have unblocked other atoms. Try to update all job
-	 * slots */
+	 * slots
+	 */
 	kbase_backend_slot_update(kbdev);
 }

@@ -1322,7 +1331,8 @@ void kbase_backend_reset(struct kbase_device *kbdev, ktime_t *end_timestamp)
 				katom->protected_state.exit = KBASE_ATOM_EXIT_PROTECTED_CHECK;
 				/* As the atom was not removed, increment the
 				 * index so that we read the correct atom in the
-				 * next iteration. */
+				 * next iteration.
+				 */
 				atom_idx++;
 				continue;
 			}
@@ -1425,7 +1435,8 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 		katom_idx0_valid = (katom_idx0 == katom);
 		/* If idx0 is to be removed and idx1 is on the same context,
 		 * then idx1 must also be removed otherwise the atoms might be
-		 * returned out of order */
+		 * returned out of order
+		 */
 		if (katom_idx1)
 			katom_idx1_valid = (katom_idx1 == katom) ||
 						(katom_idx0_valid &&
@@ -1472,7 +1483,8 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 				if (kbase_reg_read(kbdev, JOB_SLOT_REG(js,
 						JS_COMMAND_NEXT)) == 0) {
 					/* idx0 has already completed - stop
-					 * idx1 if needed*/
+					 * idx1 if needed
+					 */
 					if (katom_idx1_valid) {
 						kbase_gpu_stop_atom(kbdev, js,
 								katom_idx1,
@@ -1481,7 +1493,8 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 					}
 				} else {
 					/* idx1 is in NEXT registers - attempt
-					 * to remove */
+					 * to remove
+					 */
 					kbase_reg_write(kbdev,
 							JOB_SLOT_REG(js,
 							JS_COMMAND_NEXT),
@@ -1496,7 +1509,8 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 							JS_HEAD_NEXT_HI))
 									!= 0) {
 						/* idx1 removed successfully,
-						 * will be handled in IRQ */
+						 * will be handled in IRQ
+						 */
 						kbase_gpu_remove_atom(kbdev,
 								katom_idx1,
 								action, true);
@@ -1510,7 +1524,8 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 						ret = true;
 					} else if (katom_idx1_valid) {
 						/* idx0 has already completed,
-						 * stop idx1 if needed */
+						 * stop idx1 if needed
+						 */
 						kbase_gpu_stop_atom(kbdev, js,
 								katom_idx1,
 								action);
@@ -1529,7 +1544,8 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 				 * flow was also interrupted, and this function
 				 * might not enter disjoint state e.g. if we
 				 * don't actually do a hard stop on the head
-				 * atom */
+				 * atom
+				 */
 				kbase_gpu_stop_atom(kbdev, js, katom_idx0,
 									action);
 				ret = true;
@@ -1557,7 +1573,8 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 				ret = true;
 			} else {
 				/* idx1 is in NEXT registers - attempt to
-				 * remove */
+				 * remove
+				 */
 				kbase_reg_write(kbdev, JOB_SLOT_REG(js,
 							JS_COMMAND_NEXT),
 							JS_COMMAND_NOP);
@@ -1567,13 +1584,15 @@ bool kbase_backend_soft_hard_stop_slot(struct kbase_device *kbdev,
 				    kbase_reg_read(kbdev, JOB_SLOT_REG(js,
 						JS_HEAD_NEXT_HI)) != 0) {
 					/* idx1 removed successfully, will be
-					 * handled in IRQ once idx0 completes */
+					 * handled in IRQ once idx0 completes
+					 */
 					kbase_gpu_remove_atom(kbdev, katom_idx1,
 									action,
 									false);
 				} else {
 					/* idx0 has already completed - stop
-					 * idx1 */
+					 * idx1
+					 */
 					kbase_gpu_stop_atom(kbdev, js,
 								katom_idx1,
 								action);
@@ -1647,7 +1666,7 @@ void kbase_gpu_dump_slots(struct kbase_device *kbdev)

 			if (katom)
 				dev_info(kbdev->dev,
-				"  js%d idx%d : katom=%p gpu_rb_state=%d\n",
+				"  js%d idx%d : katom=%pK gpu_rb_state=%d\n",
 				js, idx, katom, katom->gpu_rb_state);
 			else
 				dev_info(kbdev->dev, "  js%d idx%d : empty\n",
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_rb.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_jm_rb.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014-2018 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2018, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 /*
 * Register-based HW access backend specific APIs
 */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_js_backend.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_js_backend.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 /*
 * Register-based HW access backend specific job scheduler APIs
 */
@@ -48,7 +46,8 @@ static inline bool timer_callback_should_run(struct kbase_device *kbdev)

 	/* nr_contexts_pullable is updated with the runpool_mutex. However, the
 	 * locking in the caller gives us a barrier that ensures
-	 * nr_contexts_pullable is up-to-date for reading */
+	 * nr_contexts_pullable is up-to-date for reading
+	 */
 	nr_running_ctxs = atomic_read(&kbdev->js_data.nr_contexts_runnable);

 #ifdef CONFIG_MALI_BIFROST_DEBUG
@@ -114,7 +113,8 @@ static enum hrtimer_restart timer_callback(struct hrtimer *timer)

 		if (atom != NULL) {
 			/* The current version of the model doesn't support
-			 * Soft-Stop */
+			 * Soft-Stop
+			 */
 			if (!kbase_hw_has_issue(kbdev, BASE_HW_ISSUE_5736)) {
 				u32 ticks = atom->ticks++;

@@ -142,7 +142,8 @@ static enum hrtimer_restart timer_callback(struct hrtimer *timer)
 				 * new soft_stop timeout. This ensures that
 				 * atoms do not miss any of the timeouts due to
 				 * races between this worker and the thread
-				 * changing the timeouts. */
+				 * changing the timeouts.
+				 */
 				if (backend->timeouts_updated &&
 						ticks > soft_stop_ticks)
 					ticks = atom->ticks = soft_stop_ticks;
@@ -172,10 +173,11 @@ static enum hrtimer_restart timer_callback(struct hrtimer *timer)
 					 *
 					 * Similarly, if it's about to be
 					 * decreased, the last job from another
-					 * context has already finished, so it's
-					 * not too bad that we observe the older
-					 * value and register a disjoint event
-					 * when we try soft-stopping */
+					 * context has already finished, so
+					 * it's not too bad that we observe the
+					 * older value and register a disjoint
+					 * event when we try soft-stopping
+					 */
 					if (js_devdata->nr_user_contexts_running
 							>= disjoint_threshold)
 						softstop_flags |=
@@ -253,9 +255,9 @@ static enum hrtimer_restart timer_callback(struct hrtimer *timer)
 		}
 	}
 	if (reset_needed) {
-		dev_err(kbdev->dev, "JS: Job has been on the GPU for too long (JS_RESET_TICKS_SS/DUMPING timeout hit). Issueing GPU soft-reset to resolve.");
+		dev_err(kbdev->dev, "JS: Job has been on the GPU for too long (JS_RESET_TICKS_SS/DUMPING timeout hit). Issuing GPU soft-reset to resolve.");

-		if (kbase_prepare_to_reset_gpu_locked(kbdev))
+		if (kbase_prepare_to_reset_gpu_locked(kbdev, RESET_FLAGS_NONE))
 			kbase_reset_gpu_locked(kbdev);
 	}
 	/* the timer is re-issued if there is contexts in the run-pool */
@@ -287,11 +289,12 @@ void kbase_backend_ctx_count_changed(struct kbase_device *kbdev)
 		spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
 		backend->timer_running = false;
 		spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
-		/* From now on, return value of timer_callback_should_run() will
-		 * also cause the timer to not requeue itself. Its return value
-		 * cannot change, because it depends on variables updated with
-		 * the runpool_mutex held, which the caller of this must also
-		 * hold */
+		/* From now on, return value of timer_callback_should_run()
+		 * will also cause the timer to not requeue itself. Its return
+		 * value cannot change, because it depends on variables updated
+		 * with the runpool_mutex held, which the caller of this must
+		 * also hold
+		 */
 		hrtimer_cancel(&backend->scheduling_timer);
 	}

--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_js_internal.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_js_internal.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014-2015 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2015, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 /*
 * Register-based HW access backend specific job scheduler APIs
 */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_l2_mmu_config.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_l2_mmu_config.c
@@ -6,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -17,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <mali_kbase.h>
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_l2_mmu_config.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_l2_mmu_config.h
@@ -1,31 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, you can access it online at
- * http://www.gnu.org/licenses/gpl-2.0.html.
- *
- * SPDX-License-Identifier: GPL-2.0
- *
- *//* SPDX-License-Identifier: GPL-2.0 */
-/*
 * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_always_on.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_always_on.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2010-2015, 2018-2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2010-2015, 2018-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * "Always on" power management policy
 */
@@ -62,6 +59,9 @@ const struct kbase_pm_policy kbase_pm_always_on_policy_ops = {
 	always_on_shaders_needed,	/* shaders_needed */
 	always_on_get_core_active,	/* get_core_active */
 	KBASE_PM_POLICY_ID_ALWAYS_ON,	/* id */
+#if MALI_USE_CSF
+	ALWAYS_ON_PM_SCHED_FLAGS,	/* pm_sched_flags */
+#endif
 };

 KBASE_EXPORT_TEST_API(kbase_pm_always_on_policy_ops);
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_always_on.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_always_on.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2011-2015,2018 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2011-2015, 2018, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * "Always on" power management policy
 */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_backend.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_backend.c
@@ -1,11 +1,12 @@
- /*
+// SPDX-License-Identifier: GPL-2.0
+/*
 *
- * (C) COPYRIGHT 2010-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2010-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
 /*
 * GPU backend implementation of base kernel power management APIs
 */
@@ -156,15 +154,25 @@ int kbase_hwaccess_pm_init(struct kbase_device *kbdev)
 #endif /* CONFIG_MALI_BIFROST_DEBUG */
 	init_waitqueue_head(&kbdev->pm.backend.gpu_in_desired_state_wait);

+#if !MALI_USE_CSF
 	/* Initialise the metrics subsystem */
 	ret = kbasep_pm_metrics_init(kbdev);
 	if (ret)
 		return ret;
+#else
+	mutex_init(&kbdev->pm.backend.policy_change_lock);
+	kbdev->pm.backend.policy_change_clamp_state_to_off = false;
+	/* Due to dependency on kbase_ipa_control, the metrics subsystem can't
+	 * be initialized here.
+	 */
+	CSTD_UNUSED(ret);
+#endif

 	init_waitqueue_head(&kbdev->pm.backend.reset_done_wait);
 	kbdev->pm.backend.reset_done = false;

 	init_waitqueue_head(&kbdev->pm.zero_active_count_wait);
+	init_waitqueue_head(&kbdev->pm.resume_wait);
 	kbdev->pm.active_count = 0;

 	spin_lock_init(&kbdev->pm.backend.gpu_cycle_counter_requests_lock);
@@ -221,7 +229,9 @@ pm_state_machine_fail:
 	kbase_pm_policy_term(kbdev);
 	kbase_pm_ca_term(kbdev);
 workq_fail:
+#if !MALI_USE_CSF
 	kbasep_pm_metrics_term(kbdev);
+#endif
 	return -EINVAL;
 }

@@ -230,7 +240,8 @@ void kbase_pm_do_poweron(struct kbase_device *kbdev, bool is_resume)
 	lockdep_assert_held(&kbdev->pm.lock);

 	/* Turn clocks and interrupts on - no-op if we haven't done a previous
-	 * kbase_pm_clock_off() */
+	 * kbase_pm_clock_off()
+	 */
 	kbase_pm_clock_on(kbdev, is_resume);

 	if (!is_resume) {
@@ -248,7 +259,8 @@ void kbase_pm_do_poweron(struct kbase_device *kbdev, bool is_resume)
 	kbase_pm_update_cores_state(kbdev);

 	/* NOTE: We don't wait to reach the desired state, since running atoms
-	 * will wait for that state to be reached anyway */
+	 * will wait for that state to be reached anyway
+	 */
 }

 static void kbase_pm_gpu_poweroff_wait_wq(struct work_struct *data)
@@ -486,7 +498,15 @@ static void kbase_pm_hwcnt_disable_worker(struct work_struct *data)
 		/* PM state was updated while we were doing the disable,
 		 * so we need to undo the disable we just performed.
 		 */
+#if MALI_USE_CSF
+		unsigned long lock_flags;
+
+		kbase_csf_scheduler_spin_lock(kbdev, &lock_flags);
+#endif
 		kbase_hwcnt_context_enable(kbdev->hwcnt_gpu_ctx);
+#if MALI_USE_CSF
+		kbase_csf_scheduler_spin_unlock(kbdev, lock_flags);
+#endif
 	}

 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
@@ -562,20 +582,35 @@ int kbase_hwaccess_pm_powerup(struct kbase_device *kbdev,
 	KBASE_DEBUG_ASSERT(!kbase_pm_is_suspending(kbdev));

 	/* Power up the GPU, don't enable IRQs as we are not ready to receive
-	 * them. */
+	 * them
+	 */
 	ret = kbase_pm_init_hw(kbdev, flags);
 	if (ret) {
 		kbase_pm_unlock(kbdev);
 		return ret;
 	}
-
+#if MALI_USE_CSF
+	kbdev->pm.debug_core_mask =
+		kbdev->gpu_props.props.raw_props.shader_present;
+	spin_lock_irqsave(&kbdev->hwaccess_lock, irq_flags);
+	/* Set the initial value for 'shaders_avail'. It would be later
+	 * modified only from the MCU state machine, when the shader core
+	 * allocation enable mask request has completed. So its value would
+	 * indicate the mask of cores that are currently being used by FW for
+	 * the allocation of endpoints requested by CSGs.
+	 */
+	kbdev->pm.backend.shaders_avail = kbase_pm_ca_get_core_mask(kbdev);
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, irq_flags);
+#else
 	kbdev->pm.debug_core_mask_all = kbdev->pm.debug_core_mask[0] =
 			kbdev->pm.debug_core_mask[1] =
 			kbdev->pm.debug_core_mask[2] =
 			kbdev->gpu_props.props.raw_props.shader_present;
+#endif

 	/* Pretend the GPU is active to prevent a power policy turning the GPU
-	 * cores off */
+	 * cores off
+	 */
 	kbdev->pm.active_count = 1;

 	spin_lock_irqsave(&kbdev->pm.backend.gpu_cycle_counter_requests_lock,
@@ -587,7 +622,8 @@ int kbase_hwaccess_pm_powerup(struct kbase_device *kbdev,
 								irq_flags);

 	/* We are ready to receive IRQ's now as power policy is set up, so
-	 * enable them now. */
+	 * enable them now.
+	 */
 #ifdef CONFIG_MALI_BIFROST_DEBUG
 	kbdev->pm.backend.driver_ready_for_irqs = true;
 #endif
@@ -620,6 +656,8 @@ void kbase_hwaccess_pm_halt(struct kbase_device *kbdev)
 	mutex_lock(&kbdev->pm.lock);
 	kbase_pm_do_poweroff(kbdev);
 	mutex_unlock(&kbdev->pm.lock);
+
+	kbase_pm_wait_for_poweroff_complete(kbdev);
 }

 KBASE_EXPORT_TEST_API(kbase_hwaccess_pm_halt);
@@ -634,10 +672,15 @@ void kbase_hwaccess_pm_term(struct kbase_device *kbdev)

 	if (kbdev->pm.backend.hwcnt_disabled) {
 		unsigned long flags;
-
+#if MALI_USE_CSF
+		kbase_csf_scheduler_spin_lock(kbdev, &flags);
+		kbase_hwcnt_context_enable(kbdev->hwcnt_gpu_ctx);
+		kbase_csf_scheduler_spin_unlock(kbdev, flags);
+#else
 		spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
 		kbase_hwcnt_context_enable(kbdev->hwcnt_gpu_ctx);
 		spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+#endif
 	}

 	/* Free any resources the policy allocated */
@@ -645,8 +688,16 @@ void kbase_hwaccess_pm_term(struct kbase_device *kbdev)
 	kbase_pm_policy_term(kbdev);
 	kbase_pm_ca_term(kbdev);

+#if !MALI_USE_CSF
 	/* Shut down the metrics subsystem */
 	kbasep_pm_metrics_term(kbdev);
+#else
+	if (WARN_ON(mutex_is_locked(&kbdev->pm.backend.policy_change_lock))) {
+		mutex_lock(&kbdev->pm.backend.policy_change_lock);
+		mutex_unlock(&kbdev->pm.backend.policy_change_lock);
+	}
+	mutex_destroy(&kbdev->pm.backend.policy_change_lock);
+#endif

 	destroy_workqueue(kbdev->pm.backend.gpu_poweroff_wait_wq);
 }
@@ -665,6 +716,17 @@ void kbase_pm_power_changed(struct kbase_device *kbdev)
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
 }

+#if MALI_USE_CSF
+void kbase_pm_set_debug_core_mask(struct kbase_device *kbdev, u64 new_core_mask)
+{
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+	lockdep_assert_held(&kbdev->pm.lock);
+
+	kbdev->pm.debug_core_mask = new_core_mask;
+	kbase_pm_update_dynamic_cores_onoff(kbdev);
+}
+KBASE_EXPORT_TEST_API(kbase_pm_set_debug_core_mask);
+#else
 void kbase_pm_set_debug_core_mask(struct kbase_device *kbdev,
 		u64 new_core_mask_js0, u64 new_core_mask_js1,
 		u64 new_core_mask_js2)
@@ -685,6 +747,7 @@ void kbase_pm_set_debug_core_mask(struct kbase_device *kbdev,

 	kbase_pm_update_dynamic_cores_onoff(kbdev);
 }
+#endif /* MALI_USE_CSF */

 void kbase_hwaccess_pm_gpu_active(struct kbase_device *kbdev)
 {
@@ -700,7 +763,8 @@ void kbase_hwaccess_pm_suspend(struct kbase_device *kbdev)
 {
 	/* Force power off the GPU and all cores (regardless of policy), only
 	 * after the PM active count reaches zero (otherwise, we risk turning it
-	 * off prematurely) */
+	 * off prematurely)
+	 */
 	kbase_pm_lock(kbdev);

 	kbase_pm_do_poweroff(kbdev);
@@ -735,6 +799,7 @@ void kbase_hwaccess_pm_resume(struct kbase_device *kbdev)
 	kbase_backend_timer_resume(kbdev);
 #endif /* !MALI_USE_CSF */

+	wake_up_all(&kbdev->pm.resume_wait);
 	kbase_pm_unlock(kbdev);
 }

@@ -745,6 +810,9 @@ void kbase_pm_handle_gpu_lost(struct kbase_device *kbdev)
 	ktime_t end_timestamp = ktime_get();
 	struct kbase_arbiter_vm_state *arb_vm_state = kbdev->pm.arb_vm_state;

+	if (!kbdev->arb.arb_if)
+		return;
+
 	mutex_lock(&kbdev->pm.lock);
 	mutex_lock(&arb_vm_state->vm_state_lock);
 	if (kbdev->pm.backend.gpu_powered &&
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_ca.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_ca.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
 * (C) COPYRIGHT 2013-2018, 2020 ARM Limited. All rights reserved.
@@ -5,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -59,6 +58,14 @@ void kbase_devfreq_set_core_mask(struct kbase_device *kbdev, u64 core_mask)

 	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);

+#if MALI_USE_CSF
+	if (!(core_mask & kbdev->pm.debug_core_mask)) {
+		dev_err(kbdev->dev,
+			"OPP core mask 0x%llX does not intersect with debug mask 0x%llX\n",
+			core_mask, kbdev->pm.debug_core_mask);
+		goto unlock;
+	}
+#else
 	if (!(core_mask & kbdev->pm.debug_core_mask_all)) {
 		dev_err(kbdev->dev, "OPP core mask 0x%llX does not intersect with debug mask 0x%llX\n",
 				core_mask, kbdev->pm.debug_core_mask_all);
@@ -69,6 +76,7 @@ void kbase_devfreq_set_core_mask(struct kbase_device *kbdev, u64 core_mask)
 		dev_err(kbdev->dev, "Dynamic core scaling not supported as dummy job WA is enabled");
 		goto unlock;
 	}
+#endif /* MALI_USE_CSF */

 	pm_backend->ca_cores_enabled = core_mask;

@@ -80,21 +88,32 @@ unlock:
 	dev_dbg(kbdev->dev, "Devfreq policy : new core mask=%llX\n",
 			pm_backend->ca_cores_enabled);
 }
+KBASE_EXPORT_TEST_API(kbase_devfreq_set_core_mask);
 #endif

 u64 kbase_pm_ca_get_core_mask(struct kbase_device *kbdev)
 {
-#ifdef CONFIG_MALI_BIFROST_DEVFREQ
-	struct kbase_pm_backend_data *pm_backend = &kbdev->pm.backend;
+#if MALI_USE_CSF
+	u64 debug_core_mask = kbdev->pm.debug_core_mask;
+#else
+	u64 debug_core_mask = kbdev->pm.debug_core_mask_all;
 #endif

 	lockdep_assert_held(&kbdev->hwaccess_lock);

 #ifdef CONFIG_MALI_BIFROST_DEVFREQ
-	return pm_backend->ca_cores_enabled & kbdev->pm.debug_core_mask_all;
+	/*
+	 * Although in the init we let the pm_backend->ca_cores_enabled to be
+	 * the max config (it uses the base_gpu_props), at this function we need
+	 * to limit it to be a subgroup of the curr config, otherwise the
+	 * shaders state machine on the PM does not evolve.
+	 */
+	return kbdev->gpu_props.curr_config.shader_present &
+			kbdev->pm.backend.ca_cores_enabled &
+			debug_core_mask;
 #else
-	return kbdev->gpu_props.props.raw_props.shader_present &
-			kbdev->pm.debug_core_mask_all;
+	return kbdev->gpu_props.curr_config.shader_present &
+		debug_core_mask;
 #endif
 }

--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_ca.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_ca.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2011-2018 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2011-2018, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_ca_devfreq.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_ca_devfreq.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2017 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2017, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_coarse_demand.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_coarse_demand.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2012-2016, 2018-2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2012-2016, 2018-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * "Coarse Demand" power management policy
 */
@@ -61,6 +58,9 @@ const struct kbase_pm_policy kbase_pm_coarse_demand_policy_ops = {
 	coarse_demand_shaders_needed,		/* shaders_needed */
 	coarse_demand_get_core_active,		/* get_core_active */
 	KBASE_PM_POLICY_ID_COARSE_DEMAND,	/* id */
+#if MALI_USE_CSF
+	COARSE_ON_DEMAND_PM_SCHED_FLAGS,	/* pm_sched_flags */
+#endif
 };

 KBASE_EXPORT_TEST_API(kbase_pm_coarse_demand_policy_ops);
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_coarse_demand.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_coarse_demand.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2012-2015,2018 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2012-2015, 2018, 2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * "Coarse Demand" power management policy
 */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_defs.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_defs.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2014-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -62,24 +61,9 @@ enum kbase_pm_core_type {
 	KBASE_PM_CORE_STACK = STACK_PRESENT_LO
 };

-/**
+/*
 * enum kbase_l2_core_state - The states used for the L2 cache & tiler power
 *                            state machine.
- *
- * @KBASE_L2_OFF: The L2 cache and tiler are off
- * @KBASE_L2_PEND_ON: The L2 cache and tiler are powering on
- * @KBASE_L2_RESTORE_CLOCKS: The GPU clock is restored. Conditionally used.
- * @KBASE_L2_ON_HWCNT_ENABLE: The L2 cache and tiler are on, and hwcnt is being
- *                            enabled
- * @KBASE_L2_ON: The L2 cache and tiler are on, and hwcnt is enabled
- * @KBASE_L2_ON_HWCNT_DISABLE: The L2 cache and tiler are on, and hwcnt is being
- *                             disabled
- * @KBASE_L2_SLOW_DOWN_CLOCKS: The GPU clock is set to appropriate or lowest
- *                             clock. Conditionally used.
- * @KBASE_L2_POWER_DOWN: The L2 cache and tiler are about to be powered off
- * @KBASE_L2_PEND_OFF: The L2 cache and tiler are powering off
- * @KBASE_L2_RESET_WAIT: The GPU is resetting, L2 cache and tiler power state
- *                       are unknown
 */
 enum kbase_l2_core_state {
 #define KBASEP_L2_STATE(n) KBASE_L2_ ## n,
@@ -88,24 +72,8 @@ enum kbase_l2_core_state {
 };

 #if MALI_USE_CSF
-/**
+/*
 * enum kbase_mcu_state - The states used for the MCU state machine.
- *
- * @KBASE_MCU_OFF:            The MCU is powered off.
- * @KBASE_MCU_PEND_ON_RELOAD: The warm boot of MCU or cold boot of MCU (with
- *                            firmware reloading) is in progress.
- * @KBASE_MCU_ON_GLB_REINIT_PEND: The MCU is enabled and Global configuration
- *                                requests have been sent to the firmware.
- * @KBASE_MCU_ON_HWCNT_ENABLE: The Global requests have completed and MCU is
- *                             now ready for use and hwcnt is being enabled.
- * @KBASE_MCU_ON:             The MCU is active and hwcnt has been enabled.
- * @KBASE_MCU_ON_HWCNT_DISABLE: The MCU is on and hwcnt is being disabled.
- * @KBASE_MCU_ON_HALT:        The MCU is on and hwcnt has been disabled,
- *                            MCU halt would be triggered.
- * @KBASE_MCU_ON_PEND_HALT:   MCU halt in progress, confirmation pending.
- * @KBASE_MCU_POWER_DOWN:     MCU halted operations, pending being disabled.
- * @KBASE_MCU_PEND_OFF:       MCU is being disabled, pending on powering off.
- * @KBASE_MCU_RESET_WAIT:     The GPU is resetting, MCU state is unknown.
 */
 enum kbase_mcu_state {
 #define KBASEP_MCU_STATE(n) KBASE_MCU_ ## n,
@@ -114,45 +82,8 @@ enum kbase_mcu_state {
 };
 #endif

-/**
+/*
 * enum kbase_shader_core_state - The states used for the shaders' state machine.
- *
- * @KBASE_SHADERS_OFF_CORESTACK_OFF: The shaders and core stacks are off
- * @KBASE_SHADERS_OFF_CORESTACK_PEND_ON: The shaders are off, core stacks have
- *                                       been requested to power on and hwcnt
- *                                       is being disabled
- * @KBASE_SHADERS_PEND_ON_CORESTACK_ON: Core stacks are on, shaders have been
- *                                      requested to power on. Or after doing
- *                                      partial shader on/off, checking whether
- *                                      it's the desired state.
- * @KBASE_SHADERS_ON_CORESTACK_ON: The shaders and core stacks are on, and hwcnt
- *					already enabled.
- * @KBASE_SHADERS_ON_CORESTACK_ON_RECHECK: The shaders and core stacks
- *                                      are on, hwcnt disabled, and checks
- *                                      to powering down or re-enabling
- *                                      hwcnt.
- * @KBASE_SHADERS_WAIT_OFF_CORESTACK_ON: The shaders have been requested to
- *                                       power off, but they remain on for the
- *                                       duration of the hysteresis timer
- * @KBASE_SHADERS_WAIT_GPU_IDLE: The shaders partial poweroff needs to reach
- *                               a state where jobs on the GPU are finished
- *                               including jobs currently running and in the
- *                               GPU queue because of GPU2017-861
- * @KBASE_SHADERS_WAIT_FINISHED_CORESTACK_ON: The hysteresis timer has expired
- * @KBASE_SHADERS_L2_FLUSHING_CORESTACK_ON: The core stacks are on and the
- *                                          level 2 cache is being flushed.
- * @KBASE_SHADERS_READY_OFF_CORESTACK_ON: The core stacks are on and the shaders
- *                                        are ready to be powered off.
- * @KBASE_SHADERS_PEND_OFF_CORESTACK_ON: The core stacks are on, and the shaders
- *                                       have been requested to power off
- * @KBASE_SHADERS_OFF_CORESTACK_PEND_OFF: The shaders are off, and the core stacks
- *                                        have been requested to power off
- * @KBASE_SHADERS_OFF_CORESTACK_OFF_TIMER_PEND_OFF: Shaders and corestacks are
- *                                                  off, but the tick timer
- *                                                  cancellation is still
- *                                                  pending.
- * @KBASE_SHADERS_RESET_WAIT: The GPU is resetting, shader and core stack power
- *                            states are unknown
 */
 enum kbase_shader_core_state {
 #define KBASEP_SHADER_STATE(n) KBASE_SHADERS_ ## n,
@@ -164,28 +95,40 @@ enum kbase_shader_core_state {
 * struct kbasep_pm_metrics - Metrics data collected for use by the power
 *                            management framework.
 *
- *  @time_busy: number of ns the GPU was busy executing jobs since the
- *          @time_period_start timestamp.
- *  @time_idle: number of ns since time_period_start the GPU was not executing
- *          jobs since the @time_period_start timestamp.
- *  @busy_cl: number of ns the GPU was busy executing CL jobs. Note that
- *           if two CL jobs were active for 400ns, this value would be updated
- *           with 800.
- *  @busy_gl: number of ns the GPU was busy executing GL jobs. Note that
- *           if two GL jobs were active for 400ns, this value would be updated
- *           with 800.
+ *  @time_busy: the amount of time the GPU was busy executing jobs since the
+ *          @time_period_start timestamp, in units of 256ns. This also includes
+ *          time_in_protm, the time spent in protected mode, since it's assumed
+ *          the GPU was busy 100% during this period.
+ *  @time_idle: the amount of time the GPU was not executing jobs since the
+ *              time_period_start timestamp, measured in units of 256ns.
+ *  @time_in_protm: The amount of time the GPU has spent in protected mode since
+ *                  the time_period_start timestamp, measured in units of 256ns.
+ *  @busy_cl: the amount of time the GPU was busy executing CL jobs. Note that
+ *           if two CL jobs were active for 256ns, this value would be updated
+ *           with 2 (2x256ns).
+ *  @busy_gl: the amount of time the GPU was busy executing GL jobs. Note that
+ *           if two GL jobs were active for 256ns, this value would be updated
+ *           with 2 (2x256ns).
 */
 struct kbasep_pm_metrics {
 	u32 time_busy;
 	u32 time_idle;
+#if MALI_USE_CSF
+	u32 time_in_protm;
+#else
 	u32 busy_cl[2];
 	u32 busy_gl;
+#endif
 };

 /**
 * struct kbasep_pm_metrics_state - State required to collect the metrics in
 *                                  struct kbasep_pm_metrics
 *  @time_period_start: time at which busy/idle measurements started
+ *  @ipa_control_client: Handle returned on registering DVFS as a
+ *                       kbase_ipa_control client
+ *  @skip_gpu_active_sanity_check: Decide whether to skip GPU_ACTIVE sanity
+ *                                 check in DVFS utilisation calculation
 *  @gpu_active: true when the GPU is executing jobs. false when
 *           not. Updated when the job scheduler informs us a job in submitted
 *           or removed from a GPU slot.
@@ -197,6 +140,7 @@ struct kbasep_pm_metrics {
 *  @values: The current values of the power management metrics. The
 *           kbase_pm_get_dvfs_metrics() function is used to compare these
 *           current values with the saved values from a previous invocation.
+ *  @initialized: tracks whether metrics_state has been initialized or not.
 *  @timer: timer to regularly make DVFS decisions based on the power
 *           management metrics.
 *  @timer_active: boolean indicating @timer is running
@@ -205,9 +149,14 @@ struct kbasep_pm_metrics {
 */
 struct kbasep_pm_metrics_state {
 	ktime_t time_period_start;
+#if MALI_USE_CSF
+	void *ipa_control_client;
+	bool skip_gpu_active_sanity_check;
+#else
 	bool gpu_active;
 	u32 active_cl_ctx[2];
 	u32 active_gl_ctx[3];
+#endif
 	spinlock_t lock;

 	void *platform_data;
@@ -216,6 +165,7 @@ struct kbasep_pm_metrics_state {
 	struct kbasep_pm_metrics values;

 #ifdef CONFIG_MALI_BIFROST_DVFS
+	bool initialized;
 	struct hrtimer timer;
 	bool timer_active;
 	struct kbasep_pm_metrics dvfs_last;
@@ -326,6 +276,8 @@ union kbase_pm_policy_data {
 * @callback_soft_reset: Optional callback to software reset the GPU. See
 *                       &struct kbase_pm_callback_conf
 * @ca_cores_enabled: Cores that are currently available
+ * @mcu_state: The current state of the micro-control unit, only applicable
+ *             to GPUs that have such a component
 * @l2_state:     The current state of the L2 cache state machine. See
 *                &enum kbase_l2_core_state
 * @l2_desired:   True if the L2 cache should be powered on by the L2 cache state
@@ -335,10 +287,10 @@ union kbase_pm_policy_data {
 * @shaders_avail: This is updated by the state machine when it is in a state
 *                 where it can write to the SHADER_PWRON or PWROFF registers
 *                 to have the same set of available cores as specified by
- *                 @shaders_desired_mask. So it would eventually have the same
- *                 value as @shaders_desired_mask and would precisely indicate
- *                 the cores that are currently available. This is internal to
- *                 shader state machine and should *not* be modified elsewhere.
+ *                 @shaders_desired_mask. So would precisely indicate the cores
+ *                 that are currently available. This is internal to shader
+ *                 state machine of JM GPUs and should *not* be modified
+ *                 elsewhere.
 * @shaders_desired_mask: This is updated by the state machine when it is in
 *                        a state where it can handle changes to the core
 *                        availability (either by DVFS or sysfs). This is
@@ -350,6 +302,16 @@ union kbase_pm_policy_data {
 *                   cores may be different, but there should be transitions in
 *                   progress that will eventually achieve this state (assuming
 *                   that the policy doesn't change its mind in the mean time).
+ * @mcu_desired: True if the micro-control unit should be powered on
+ * @policy_change_clamp_state_to_off: Signaling the backend is in PM policy
+ *                change transition, needs the mcu/L2 to be brought back to the
+ *                off state and remain in that state until the flag is cleared.
+ * @csf_pm_sched_flags: CSF Dynamic PM control flags in accordance to the
+ *                current active PM policy. This field is updated whenever a
+ *                new policy is activated.
+ * @policy_change_lock: Used to serialize the policy change calls. In CSF case,
+ *                      the change of policy may involve the scheduler to
+ *                      suspend running CSGs and then reconfigure the MCU.
 * @in_reset: True if a GPU is resetting and normal power manager operation is
 *            suspended
 * @partial_shaderoff: True if we want to partial power off shader cores,
@@ -440,9 +402,6 @@ struct kbase_pm_backend_data {
 	u64 ca_cores_enabled;

 #if MALI_USE_CSF
-	/* The current state of the micro-control unit, only applicable
-	 * to GPUs that has such a component
-	 */
 	enum kbase_mcu_state mcu_state;
 #endif
 	enum kbase_l2_core_state l2_state;
@@ -450,8 +409,10 @@ struct kbase_pm_backend_data {
 	u64 shaders_avail;
 	u64 shaders_desired_mask;
 #if MALI_USE_CSF
-	/* True if the micro-control unit should be powered on */
 	bool mcu_desired;
+	bool policy_change_clamp_state_to_off;
+	unsigned int csf_pm_sched_flags;
+	struct mutex policy_change_lock;
 #endif
 	bool l2_desired;
 	bool l2_always_on;
@@ -476,6 +437,23 @@ struct kbase_pm_backend_data {
 	struct work_struct gpu_clock_control_work;
 };

+#if MALI_USE_CSF
+/* CSF PM flag, signaling that the MCU CORE should be kept on */
+#define  CSF_DYNAMIC_PM_CORE_KEEP_ON (1 << 0)
+/* CSF PM flag, signaling no scheduler suspension on idle groups */
+#define CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE (1 << 1)
+/* CSF PM flag, signaling no scheduler suspension on no runnable groups */
+#define CSF_DYNAMIC_PM_SCHED_NO_SUSPEND (1 << 2)
+
+/* The following flags corresponds to existing defined PM policies */
+#define ALWAYS_ON_PM_SCHED_FLAGS (CSF_DYNAMIC_PM_CORE_KEEP_ON | \
+				  CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE | \
+				  CSF_DYNAMIC_PM_SCHED_NO_SUSPEND)
+#define COARSE_ON_DEMAND_PM_SCHED_FLAGS (0)
+#if !MALI_CUSTOMER_RELEASE
+#define ALWAYS_ON_DEMAND_PM_SCHED_FLAGS (CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE)
+#endif
+#endif

 /* List of policy IDs */
 enum kbase_pm_policy_id {
@@ -502,11 +480,15 @@ enum kbase_pm_policy_id {
 *                      necessarily the same as its index in the list returned
 *                      by kbase_pm_list_policies().
 *                      It is used purely for debugging.
+ * @pm_sched_flags: Policy associated with CSF PM scheduling operational flags.
+ *                  Pre-defined required flags exist for each of the
+ *                  ARM released policies, such as 'always_on', 'coarse_demand'
+ *                  and etc.
 */
 struct kbase_pm_policy {
 	char *name;

-	/**
+	/*
 	 * Function called when the policy is selected
 	 *
 	 * This should initialize the kbdev->pm.pm_policy_data structure. It
@@ -520,7 +502,7 @@ struct kbase_pm_policy {
 	 */
 	void (*init)(struct kbase_device *kbdev);

-	/**
+	/*
 	 * Function called when the policy is unselected.
 	 *
 	 * @kbdev: The kbase device structure for the device (must be a
@@ -528,7 +510,7 @@ struct kbase_pm_policy {
 	 */
 	void (*term)(struct kbase_device *kbdev);

-	/**
+	/*
 	 * Function called to find out if shader cores are needed
 	 *
 	 * This needs to at least satisfy kbdev->pm.backend.shaders_desired,
@@ -541,7 +523,7 @@ struct kbase_pm_policy {
 	 */
 	bool (*shaders_needed)(struct kbase_device *kbdev);

-	/**
+	/*
 	 * Function called to get the current overall GPU power state
 	 *
 	 * This function must meet or exceed the requirements for power
@@ -555,6 +537,15 @@ struct kbase_pm_policy {
 	bool (*get_core_active)(struct kbase_device *kbdev);

 	enum kbase_pm_policy_id id;
+
+#if MALI_USE_CSF
+	/* Policy associated with CSF PM scheduling operational flags.
+	 * There are pre-defined required flags exist for each of the
+	 * ARM released policies, such as 'always_on', 'coarse_demand'
+	 * and etc.
+	 */
+	unsigned int pm_sched_flags;
+#endif
 };

 #endif /* _KBASE_PM_HWACCESS_DEFS_H_ */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_driver.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_driver.c
@@ -1,12 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2010-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2010-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -17,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -32,9 +30,13 @@
 #include <mali_kbase_pm.h>
 #include <mali_kbase_config_defaults.h>
 #include <mali_kbase_smc.h>
-#if !MALI_USE_CSF
+
+#if MALI_USE_CSF
+#include <csf/ipa_control/mali_kbase_csf_ipa_control.h>
+#else
 #include <mali_kbase_hwaccess_jm.h>
 #endif /* !MALI_USE_CSF */
+
 #include <mali_kbase_reset_gpu.h>
 #include <mali_kbase_ctx_sched.h>
 #include <mali_kbase_hwcnt_context.h>
@@ -47,6 +49,9 @@
 #ifdef CONFIG_MALI_ARBITER_SUPPORT
 #include <arbiter/mali_kbase_arbiter_pm.h>
 #endif /* CONFIG_MALI_ARBITER_SUPPORT */
+#if MALI_USE_CSF
+#include <csf/ipa_control/mali_kbase_csf_ipa_control.h>
+#endif

 #include <linux/of.h>

@@ -103,13 +108,13 @@ bool kbase_pm_is_mcu_desired(struct kbase_device *kbdev)
 		return true;

 	/* MCU is supposed to be ON, only when scheduler.pm_active_count is
-	 * non zero. But for always_on policy also MCU needs to be ON.
-	 * GPUCORE-24926 will add the proper handling for always_on
-	 * power policy.
+	 * non zero. But for always_on policy, the MCU needs to be kept on,
+	 * unless policy changing transition needs it off.
 	 */
+
 	return (kbdev->pm.backend.mcu_desired &&
-		(kbdev->pm.backend.pm_current_policy ==
-		 &kbase_pm_always_on_policy_ops));
+		kbase_pm_no_mcu_core_pwroff(kbdev) &&
+		!kbdev->pm.backend.policy_change_clamp_state_to_off);
 }
 #endif

@@ -126,6 +131,11 @@ bool kbase_pm_is_l2_desired(struct kbase_device *kbdev)
 			!kbdev->pm.backend.shaders_desired)
 		return false;

+#if MALI_USE_CSF
+	if (kbdev->pm.backend.policy_change_clamp_state_to_off)
+		return false;
+#endif
+
 	return kbdev->pm.backend.l2_desired;
 }

@@ -257,7 +267,8 @@ static void mali_cci_flush_l2(struct kbase_device *kbdev)
 		GPU_CONTROL_REG(GPU_IRQ_RAWSTAT));

 	/* Wait for cache flush to complete before continuing, exit on
-	 * gpu resets or loop expiry. */
+	 * gpu resets or loop expiry.
+	 */
 	while (((raw & mask) == 0) && --loops) {
 		raw = kbase_reg_read(kbdev,
 					GPU_CONTROL_REG(GPU_IRQ_RAWSTAT));
@@ -396,9 +407,9 @@ u64 kbase_pm_get_present_cores(struct kbase_device *kbdev,

 	switch (type) {
 	case KBASE_PM_CORE_L2:
-		return kbdev->gpu_props.props.raw_props.l2_present;
+		return kbdev->gpu_props.curr_config.l2_present;
 	case KBASE_PM_CORE_SHADER:
-		return kbdev->gpu_props.props.raw_props.shader_present;
+		return kbdev->gpu_props.curr_config.shader_present;
 	case KBASE_PM_CORE_TILER:
 		return kbdev->gpu_props.props.raw_props.tiler_present;
 	case KBASE_PM_CORE_STACK:
@@ -492,14 +503,10 @@ static void kbase_pm_trigger_hwcnt_disable(struct kbase_device *kbdev)
 	 */
 	if (kbase_hwcnt_context_disable_atomic(kbdev->hwcnt_gpu_ctx)) {
 		backend->hwcnt_disabled = true;
+
 	} else {
-#if KERNEL_VERSION(3, 16, 0) > LINUX_VERSION_CODE
-		queue_work(system_wq,
-			&backend->hwcnt_disable_work);
-#else
-		queue_work(system_highpri_wq,
-			&backend->hwcnt_disable_work);
-#endif
+		kbase_hwcnt_context_queue_work(kbdev->hwcnt_gpu_ctx,
+					       &backend->hwcnt_disable_work);
 	}
 }

@@ -517,7 +524,8 @@ static void kbase_pm_l2_config_override(struct kbase_device *kbdev)
 	 * Skip if size and hash are not given explicitly,
 	 * which means default values are used.
 	 */
-	if ((kbdev->l2_size_override == 0) && (kbdev->l2_hash_override == 0))
+	if ((kbdev->l2_size_override == 0) && (kbdev->l2_hash_override == 0) &&
+	    (!kbdev->l2_hash_values_override))
 		return;

 	val = kbase_reg_read(kbdev, GPU_CONTROL_REG(L2_CONFIG));
@@ -528,13 +536,25 @@ static void kbase_pm_l2_config_override(struct kbase_device *kbdev)
 	}

 	if (kbdev->l2_hash_override) {
+		WARN_ON(kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_ASN_HASH));
 		val &= ~L2_CONFIG_HASH_MASK;
 		val |= (kbdev->l2_hash_override << L2_CONFIG_HASH_SHIFT);
+	} else if (kbdev->l2_hash_values_override) {
+		int i;
+
+		WARN_ON(!kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_ASN_HASH));
+		val &= ~L2_CONFIG_ASN_HASH_ENABLE_MASK;
+		val |= (0x1 << L2_CONFIG_ASN_HASH_ENABLE_SHIFT);
+
+		for (i = 0; i < ASN_HASH_COUNT; i++) {
+			dev_dbg(kbdev->dev, "Program 0x%x to ASN_HASH[%d]\n",
+				kbdev->l2_hash_values[i], i);
+			kbase_reg_write(kbdev, GPU_CONTROL_REG(ASN_HASH(i)),
+					kbdev->l2_hash_values[i]);
+		}
 	}

 	dev_dbg(kbdev->dev, "Program 0x%x to L2_CONFIG\n", val);
-
-	/* Write L2_CONFIG to override */
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(L2_CONFIG), val);
 }

@@ -561,6 +581,35 @@ static const char *kbase_mcu_state_to_string(enum kbase_mcu_state state)
 		return strings[state];
 }

+static inline bool kbase_pm_handle_mcu_core_attr_update(struct kbase_device *kbdev)
+{
+	struct kbase_pm_backend_data *backend = &kbdev->pm.backend;
+	bool timer_update;
+	bool core_mask_update;
+
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	WARN_ON(backend->mcu_state != KBASE_MCU_ON);
+
+	/* This function is only for cases where the MCU managing Cores, if
+	 * the firmware mode is with host control, do nothing here.
+	 */
+	if (unlikely(kbdev->csf.firmware_hctl_core_pwr))
+		return false;
+
+	core_mask_update =
+		backend->shaders_avail != backend->shaders_desired_mask;
+
+	timer_update = kbdev->csf.mcu_core_pwroff_dur_count !=
+			kbdev->csf.mcu_core_pwroff_reg_shadow;
+
+	if (core_mask_update || timer_update)
+		kbase_csf_firmware_update_core_attr(kbdev, timer_update,
+			core_mask_update, backend->shaders_desired_mask);
+
+	return (core_mask_update || timer_update);
+}
+
 static int kbase_pm_mcu_update_state(struct kbase_device *kbdev)
 {
 	struct kbase_pm_backend_data *backend = &kbdev->pm.backend;
@@ -578,11 +627,20 @@ static int kbase_pm_mcu_update_state(struct kbase_device *kbdev)
 	}

 	do {
+		u64 shaders_trans = kbase_pm_get_trans_cores(kbdev, KBASE_PM_CORE_SHADER);
+		u64 shaders_ready = kbase_pm_get_ready_cores(kbdev, KBASE_PM_CORE_SHADER);
+
+		/* mask off ready from trans in case transitions finished
+		 * between the register reads
+		 */
+		shaders_trans &= ~shaders_ready;
+
 		prev_state = backend->mcu_state;

 		switch (backend->mcu_state) {
 		case KBASE_MCU_OFF:
 			if (kbase_pm_is_mcu_desired(kbdev) &&
+			    !backend->policy_change_clamp_state_to_off &&
 			    backend->l2_state == KBASE_L2_ON) {
 				kbase_csf_firmware_trigger_reload(kbdev);
 				backend->mcu_state = KBASE_MCU_PEND_ON_RELOAD;
@@ -591,35 +649,116 @@ static int kbase_pm_mcu_update_state(struct kbase_device *kbdev)

 		case KBASE_MCU_PEND_ON_RELOAD:
 			if (kbdev->csf.firmware_reloaded) {
-				kbase_csf_firmware_global_reinit(kbdev);
+				backend->shaders_desired_mask =
+					kbase_pm_ca_get_core_mask(kbdev);
+				kbase_csf_firmware_global_reinit(kbdev,
+					backend->shaders_desired_mask);
 				backend->mcu_state =
 					KBASE_MCU_ON_GLB_REINIT_PEND;
 			}
 			break;

 		case KBASE_MCU_ON_GLB_REINIT_PEND:
-			if (kbase_csf_firmware_global_reinit_complete(kbdev))
+			if (kbase_csf_firmware_global_reinit_complete(kbdev)) {
+				backend->shaders_avail =
+						backend->shaders_desired_mask;
+				backend->pm_shaders_core_mask = 0;
+				if (kbdev->csf.firmware_hctl_core_pwr) {
+					kbase_pm_invoke(kbdev, KBASE_PM_CORE_SHADER,
+						backend->shaders_avail, ACTION_PWRON);
+					backend->mcu_state =
+						KBASE_MCU_HCTL_SHADERS_PEND_ON;
+				} else
+					backend->mcu_state = KBASE_MCU_ON_HWCNT_ENABLE;
+			}
+			break;
+
+		case KBASE_MCU_HCTL_SHADERS_PEND_ON:
+			if (!shaders_trans &&
+			    shaders_ready == backend->shaders_avail) {
+				/* Cores now stable, notify MCU the stable mask */
+				kbase_csf_firmware_update_core_attr(kbdev,
+						false, true, shaders_ready);
+
+				backend->pm_shaders_core_mask = shaders_ready;
+				backend->mcu_state =
+					KBASE_MCU_HCTL_CORES_NOTIFY_PEND;
+			}
+			break;
+
+		case KBASE_MCU_HCTL_CORES_NOTIFY_PEND:
+			/* Wait for the acknowledgement */
+			if (kbase_csf_firmware_core_attr_updated(kbdev))
 				backend->mcu_state = KBASE_MCU_ON_HWCNT_ENABLE;
 			break;

 		case KBASE_MCU_ON_HWCNT_ENABLE:
 			backend->hwcnt_desired = true;
 			if (backend->hwcnt_disabled) {
+				unsigned long flags;
+
+				kbase_csf_scheduler_spin_lock(kbdev, &flags);
 				kbase_hwcnt_context_enable(
 					kbdev->hwcnt_gpu_ctx);
+				kbase_csf_scheduler_spin_unlock(kbdev, flags);
 				backend->hwcnt_disabled = false;
 			}
 			backend->mcu_state = KBASE_MCU_ON;
 			break;

 		case KBASE_MCU_ON:
+			backend->shaders_desired_mask = kbase_pm_ca_get_core_mask(kbdev);
+
 			if (!kbase_pm_is_mcu_desired(kbdev))
 				backend->mcu_state = KBASE_MCU_ON_HWCNT_DISABLE;
+			else if (kbdev->csf.firmware_hctl_core_pwr) {
+				/* Host control add additional Cores to be active */
+				if (backend->shaders_desired_mask & ~shaders_ready) {
+					backend->hwcnt_desired = false;
+					if (!backend->hwcnt_disabled)
+						kbase_pm_trigger_hwcnt_disable(kbdev);
+					backend->mcu_state =
+						KBASE_MCU_HCTL_MCU_ON_RECHECK;
+				}
+			} else if (kbase_pm_handle_mcu_core_attr_update(kbdev))
+				kbdev->pm.backend.mcu_state =
+					KBASE_MCU_ON_CORE_ATTR_UPDATE_PEND;
 			break;

-		/* ToDo. Add new state(s) if shader cores mask change for DVFS
-		 * has to be accommodated in the MCU state machine.
-		 */
+		case KBASE_MCU_HCTL_MCU_ON_RECHECK:
+			backend->shaders_desired_mask = kbase_pm_ca_get_core_mask(kbdev);
+
+			if (!backend->hwcnt_disabled) {
+				/* Wait for being disabled */
+				;
+			} else if (!kbase_pm_is_mcu_desired(kbdev)) {
+				/* Converging to MCU powering down flow */
+				backend->mcu_state = KBASE_MCU_ON_HWCNT_DISABLE;
+			} else if (backend->shaders_desired_mask & ~shaders_ready) {
+				/* set cores ready but not available to
+				 * meet SHADERS_PEND_ON check pass
+				 */
+				backend->shaders_avail =
+					(backend->shaders_desired_mask | shaders_ready);
+
+				kbase_pm_invoke(kbdev, KBASE_PM_CORE_SHADER,
+						backend->shaders_avail & ~shaders_ready,
+						ACTION_PWRON);
+				backend->mcu_state =
+					KBASE_MCU_HCTL_SHADERS_PEND_ON;
+			} else {
+				backend->mcu_state =
+					KBASE_MCU_HCTL_SHADERS_PEND_ON;
+			}
+			break;
+
+		case KBASE_MCU_ON_CORE_ATTR_UPDATE_PEND:
+			if (kbase_csf_firmware_core_attr_updated(kbdev)) {
+				backend->shaders_avail =
+					backend->shaders_desired_mask;
+				backend->mcu_state = KBASE_MCU_ON;
+			}
+			break;

 		case KBASE_MCU_ON_HWCNT_DISABLE:
 			if (kbase_pm_is_mcu_desired(kbdev)) {
@@ -639,14 +778,32 @@ static int kbase_pm_mcu_update_state(struct kbase_device *kbdev)
 			if (!kbase_pm_is_mcu_desired(kbdev)) {
 				kbase_csf_firmware_trigger_mcu_halt(kbdev);
 				backend->mcu_state = KBASE_MCU_ON_PEND_HALT;
-			} else if (kbase_pm_is_mcu_desired(kbdev)) {
+			} else
 				backend->mcu_state = KBASE_MCU_ON_HWCNT_ENABLE;
-			}
 			break;

 		case KBASE_MCU_ON_PEND_HALT:
-			if (kbase_csf_firmware_mcu_halted(kbdev))
+			if (kbase_csf_firmware_mcu_halted(kbdev)) {
+				if (kbdev->csf.firmware_hctl_core_pwr)
+					backend->mcu_state =
+						KBASE_MCU_HCTL_SHADERS_READY_OFF;
+				else
+					backend->mcu_state = KBASE_MCU_POWER_DOWN;
+			}
+			break;
+
+		case KBASE_MCU_HCTL_SHADERS_READY_OFF:
+			kbase_pm_invoke(kbdev, KBASE_PM_CORE_SHADER,
+					shaders_ready, ACTION_PWROFF);
+			backend->mcu_state =
+				KBASE_MCU_HCTL_SHADERS_PEND_OFF;
+			break;
+
+		case KBASE_MCU_HCTL_SHADERS_PEND_OFF:
+			if (!shaders_trans && !shaders_ready) {
+				backend->pm_shaders_core_mask = 0;
 				backend->mcu_state = KBASE_MCU_POWER_DOWN;
+			}
 			break;

 		case KBASE_MCU_POWER_DOWN:
@@ -698,8 +855,10 @@ static const char *kbase_l2_core_state_to_string(enum kbase_l2_core_state state)
 static int kbase_pm_l2_update_state(struct kbase_device *kbdev)
 {
 	struct kbase_pm_backend_data *backend = &kbdev->pm.backend;
-	u64 l2_present = kbdev->gpu_props.props.raw_props.l2_present;
+	u64 l2_present = kbdev->gpu_props.curr_config.l2_present;
+#if !MALI_USE_CSF
 	u64 tiler_present = kbdev->gpu_props.props.raw_props.tiler_present;
+#endif
 	enum kbase_l2_core_state prev_state;

 	lockdep_assert_held(&kbdev->hwaccess_lock);
@@ -710,10 +869,13 @@ static int kbase_pm_l2_update_state(struct kbase_device *kbdev)
 				KBASE_PM_CORE_L2);
 		u64 l2_ready = kbase_pm_get_ready_cores(kbdev,
 				KBASE_PM_CORE_L2);
+
+#if !MALI_USE_CSF
 		u64 tiler_trans = kbase_pm_get_trans_cores(kbdev,
 				KBASE_PM_CORE_TILER);
 		u64 tiler_ready = kbase_pm_get_ready_cores(kbdev,
 				KBASE_PM_CORE_TILER);
+#endif

 		/*
 		 * kbase_pm_get_ready_cores and kbase_pm_get_trans_cores
@@ -736,8 +898,9 @@ static int kbase_pm_l2_update_state(struct kbase_device *kbdev)
 		 * between the register reads
 		 */
 		l2_trans &= ~l2_ready;
+#if !MALI_USE_CSF
 		tiler_trans &= ~tiler_ready;
-
+#endif
 		prev_state = backend->l2_state;

 		switch (backend->l2_state) {
@@ -748,7 +911,7 @@ static int kbase_pm_l2_update_state(struct kbase_device *kbdev)
 				 * powering it on
 				 */
 				kbase_pm_l2_config_override(kbdev);
-
+#if !MALI_USE_CSF
 				/* L2 is required, power on.  Powering on the
 				 * tiler will also power the first L2 cache.
 				 */
@@ -762,14 +925,30 @@ static int kbase_pm_l2_update_state(struct kbase_device *kbdev)
 					kbase_pm_invoke(kbdev, KBASE_PM_CORE_L2,
 							l2_present & ~1,
 							ACTION_PWRON);
+#else
+				/* With CSF firmware, Host driver doesn't need to
+				 * handle power management with both shader and tiler cores.
+				 * The CSF firmware will power up the cores appropriately.
+				 * So only power the l2 cache explicitly.
+				 */
+				kbase_pm_invoke(kbdev, KBASE_PM_CORE_L2,
+						l2_present, ACTION_PWRON);
+#endif
 				backend->l2_state = KBASE_L2_PEND_ON;
 			}
 			break;

 		case KBASE_L2_PEND_ON:
+#if !MALI_USE_CSF
 			if (!l2_trans && l2_ready == l2_present && !tiler_trans
 					&& tiler_ready == tiler_present) {
-				KBASE_KTRACE_ADD(kbdev, PM_CORES_CHANGE_AVAILABLE_TILER, NULL, tiler_ready);
+				KBASE_KTRACE_ADD(kbdev, PM_CORES_CHANGE_AVAILABLE_TILER, NULL,
+						tiler_ready);
+#else
+			if (!l2_trans && l2_ready == l2_present) {
+				KBASE_KTRACE_ADD(kbdev, PM_CORES_CHANGE_AVAILABLE_L2, NULL,
+						l2_ready);
+#endif
 				/*
 				 * Ensure snoops are enabled after L2 is powered
 				 * up. Note that kbase keeps track of the snoop
@@ -948,9 +1127,11 @@ static int kbase_pm_l2_update_state(struct kbase_device *kbdev)
 				 */
 				kbase_gpu_start_cache_clean_nolock(
 						kbdev);
-
+#if !MALI_USE_CSF
 			KBASE_KTRACE_ADD(kbdev, PM_CORES_CHANGE_AVAILABLE_TILER, NULL, 0u);
-
+#else
+			KBASE_KTRACE_ADD(kbdev, PM_CORES_CHANGE_AVAILABLE_L2, NULL, 0u);
+#endif
 			backend->l2_state = KBASE_L2_PEND_OFF;
 			break;

@@ -1078,7 +1259,6 @@ static int kbase_pm_shaders_update_state(struct kbase_device *kbdev)
 			&kbdev->pm.backend.shader_tick_timer;
 	enum kbase_shader_core_state prev_state;
 	u64 stacks_avail = 0;
-	int err = 0;

 	lockdep_assert_held(&kbdev->hwaccess_lock);

@@ -1173,8 +1353,18 @@ static int kbase_pm_shaders_update_state(struct kbase_device *kbdev)
 				backend->pm_shaders_core_mask = shaders_ready;
 				backend->hwcnt_desired = true;
 				if (backend->hwcnt_disabled) {
+#if MALI_USE_CSF
+					unsigned long flags;
+
+					kbase_csf_scheduler_spin_lock(kbdev,
+								      &flags);
+#endif
 					kbase_hwcnt_context_enable(
 						kbdev->hwcnt_gpu_ctx);
+#if MALI_USE_CSF
+					kbase_csf_scheduler_spin_unlock(kbdev,
+									flags);
+#endif
 					backend->hwcnt_disabled = false;
 				}

@@ -1354,8 +1544,18 @@ static int kbase_pm_shaders_update_state(struct kbase_device *kbdev)
 				backend->pm_shaders_core_mask = 0;
 				backend->hwcnt_desired = true;
 				if (backend->hwcnt_disabled) {
+#if MALI_USE_CSF
+					unsigned long flags;
+
+					kbase_csf_scheduler_spin_lock(kbdev,
+								      &flags);
+#endif
 					kbase_hwcnt_context_enable(
 						kbdev->hwcnt_gpu_ctx);
+#if MALI_USE_CSF
+					kbase_csf_scheduler_spin_unlock(kbdev,
+									flags);
+#endif
 					backend->hwcnt_disabled = false;
 				}
 				backend->shaders_state = KBASE_SHADERS_OFF_CORESTACK_OFF_TIMER_PEND_OFF;
@@ -1382,7 +1582,7 @@ static int kbase_pm_shaders_update_state(struct kbase_device *kbdev)

 	} while (backend->shaders_state != prev_state);

-	return err;
+	return 0;
 }
 #endif

@@ -1647,7 +1847,8 @@ void kbase_pm_reset_complete(struct kbase_device *kbdev)

 /* Timeout for kbase_pm_wait_for_desired_state when wait_event_killable has
 * aborted due to a fatal signal. If the time spent waiting has exceeded this
- * threshold then there is most likely a hardware issue. */
+ * threshold then there is most likely a hardware issue.
+ */
 #define PM_TIMEOUT_MS (5000) /* 5s */

 static void kbase_pm_timed_out(struct kbase_device *kbdev)
@@ -1706,7 +1907,8 @@ static void kbase_pm_timed_out(struct kbase_device *kbdev)
 					L2_PWRTRANS_LO)));

 	dev_err(kbdev->dev, "Sending reset to GPU - all running jobs will be lost\n");
-	if (kbase_prepare_to_reset_gpu(kbdev))
+	if (kbase_prepare_to_reset_gpu(kbdev,
+				       RESET_FLAGS_HWC_UNRECOVERABLE_ERROR))
 		kbase_reset_gpu(kbdev);
 }

@@ -1774,7 +1976,7 @@ void kbase_pm_enable_interrupts(struct kbase_device *kbdev)
 {
 	unsigned long flags;

-	KBASE_DEBUG_ASSERT(NULL != kbdev);
+	KBASE_DEBUG_ASSERT(kbdev != NULL);
 	/*
 	 * Clear all interrupts,
 	 * and unmask them all.
@@ -1800,7 +2002,7 @@ KBASE_EXPORT_TEST_API(kbase_pm_enable_interrupts);

 void kbase_pm_disable_interrupts_nolock(struct kbase_device *kbdev)
 {
-	KBASE_DEBUG_ASSERT(NULL != kbdev);
+	KBASE_DEBUG_ASSERT(kbdev != NULL);
 	/*
 	 * Mask all interrupts,
 	 * and clear them all.
@@ -1827,6 +2029,22 @@ void kbase_pm_disable_interrupts(struct kbase_device *kbdev)

 KBASE_EXPORT_TEST_API(kbase_pm_disable_interrupts);

+#if MALI_USE_CSF
+static void update_user_reg_page_mapping(struct kbase_device *kbdev)
+{
+	lockdep_assert_held(&kbdev->pm.lock);
+
+	if (kbdev->csf.mali_file_inode) {
+		/* This would zap the pte corresponding to the mapping of User
+		 * register page for all the Kbase contexts.
+		 */
+		unmap_mapping_range(kbdev->csf.mali_file_inode->i_mapping,
+				    BASEP_MEM_CSF_USER_REG_PAGE_HANDLE,
+				    PAGE_SIZE, 1);
+	}
+}
+#endif
+
 /*
 * pmu layout:
 * 0x0000: PMU TAG (RO) (0xCAFECAFE)
@@ -1838,7 +2056,7 @@ void kbase_pm_clock_on(struct kbase_device *kbdev, bool is_resume)
 	bool reset_required = is_resume;
 	unsigned long flags;

-	KBASE_DEBUG_ASSERT(NULL != kbdev);
+	KBASE_DEBUG_ASSERT(kbdev != NULL);
 #if !MALI_USE_CSF
 	lockdep_assert_held(&kbdev->js_data.runpool_mutex);
 #endif /* !MALI_USE_CSF */
@@ -1876,24 +2094,39 @@ void kbase_pm_clock_on(struct kbase_device *kbdev, bool is_resume)
 	kbdev->pm.backend.gpu_powered = true;
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);

+#if MALI_USE_CSF
+	/* GPU has been turned on, can switch to actual register page */
+	update_user_reg_page_mapping(kbdev);
+#endif
+
 	if (reset_required) {
 		/* GPU state was lost, reset GPU to ensure it is in a
-		 * consistent state */
+		 * consistent state
+		 */
 		kbase_pm_init_hw(kbdev, PM_ENABLE_IRQS);
 	}
 #ifdef CONFIG_MALI_ARBITER_SUPPORT
 	else {
-		struct kbase_arbiter_vm_state *arb_vm_state =
+		if (kbdev->arb.arb_if) {
+			struct kbase_arbiter_vm_state *arb_vm_state =
 				kbdev->pm.arb_vm_state;

-		/* In the case that the GPU has just been granted by
-		 * the Arbiter, a reset will have already been done.
-		 * However, it is still necessary to initialize the GPU.
-		 */
-		if (arb_vm_state->vm_arb_starting)
-			kbase_pm_init_hw(kbdev, PM_ENABLE_IRQS |
-					PM_NO_RESET);
+			/* In the case that the GPU has just been granted by
+			 * the Arbiter, a reset will have already been done.
+			 * However, it is still necessary to initialize the GPU.
+			 */
+			if (arb_vm_state->vm_arb_starting)
+				kbase_pm_init_hw(kbdev, PM_ENABLE_IRQS |
+						PM_NO_RESET);
+		}
 	}
+	/*
+	 * This point means that the GPU trasitioned to ON. So there is a chance
+	 * that a repartitioning occurred. In this case the current config
+	 * should be read again.
+	 */
+	kbase_gpuprops_get_curr_config_props(kbdev,
+		&kbdev->gpu_props.curr_config);
 #endif /* CONFIG_MALI_ARBITER_SUPPORT */

 	mutex_lock(&kbdev->mmu_hw_mutex);
@@ -1918,6 +2151,17 @@ void kbase_pm_clock_on(struct kbase_device *kbdev, bool is_resume)
 	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
 	kbdev->pm.backend.gpu_ready = true;
 	kbdev->pm.backend.l2_desired = true;
+#if MALI_USE_CSF
+	if (reset_required) {
+		/* GPU reset was done after the power on, so send the post
+		 * reset event instead. This is okay as GPU power off event
+		 * is same as pre GPU reset event.
+		 */
+		kbase_ipa_control_handle_gpu_reset_post(kbdev);
+	} else {
+		kbase_ipa_control_handle_gpu_power_on(kbdev);
+	}
+#endif
 	kbase_pm_update_state(kbdev);
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
 }
@@ -1928,7 +2172,7 @@ bool kbase_pm_clock_off(struct kbase_device *kbdev)
 {
 	unsigned long flags;

-	KBASE_DEBUG_ASSERT(NULL != kbdev);
+	KBASE_DEBUG_ASSERT(kbdev != NULL);
 	lockdep_assert_held(&kbdev->pm.lock);

 	/* ASSERT that the cores should now be unavailable. No lock needed. */
@@ -1952,12 +2196,16 @@ bool kbase_pm_clock_off(struct kbase_device *kbdev)

 	if (atomic_read(&kbdev->faults_pending)) {
 		/* Page/bus faults are still being processed. The GPU can not
-		 * be powered off until they have completed */
+		 * be powered off until they have completed
+		 */
 		spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
 		return false;
 	}

 	kbase_pm_cache_snoop_disable(kbdev);
+#if MALI_USE_CSF
+	kbase_ipa_control_handle_gpu_power_off(kbdev);
+#endif

 	kbdev->pm.backend.gpu_ready = false;

@@ -1974,6 +2222,12 @@ bool kbase_pm_clock_off(struct kbase_device *kbdev)
 #endif

 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+
+#if MALI_USE_CSF
+	/* GPU is about to be turned off, switch to dummy page */
+	update_user_reg_page_mapping(kbdev);
+#endif
+
 #ifdef CONFIG_MALI_ARBITER_SUPPORT
 	kbase_arbiter_pm_vm_event(kbdev, KBASE_VM_GPU_IDLE_EVENT);
 #endif /* CONFIG_MALI_ARBITER_SUPPORT */
@@ -2021,23 +2275,23 @@ static enum hrtimer_restart kbasep_reset_timeout(struct hrtimer *timer)
 	struct kbasep_reset_timeout_data *rtdata =
 		container_of(timer, struct kbasep_reset_timeout_data, timer);

-	rtdata->timed_out = 1;
+	rtdata->timed_out = true;

 	/* Set the wait queue to wake up kbase_pm_init_hw even though the reset
-	 * hasn't completed */
+	 * hasn't completed
+	 */
 	kbase_pm_reset_done(rtdata->kbdev);

 	return HRTIMER_NORESTART;
 }

-static int kbase_set_jm_quirks(struct kbase_device *kbdev, const u32 prod_id)
+static int kbase_set_gpu_quirks(struct kbase_device *kbdev, const u32 prod_id)
 {
 #if MALI_USE_CSF
-	kbdev->hw_quirks_jm = kbase_reg_read(kbdev,
-				GPU_CONTROL_REG(CSF_CONFIG));
+	kbdev->hw_quirks_gpu =
+		kbase_reg_read(kbdev, GPU_CONTROL_REG(CSF_CONFIG));
 #else
-	u32 hw_quirks_jm = kbase_reg_read(kbdev,
-				GPU_CONTROL_REG(JM_CONFIG));
+	u32 hw_quirks_gpu = kbase_reg_read(kbdev, GPU_CONTROL_REG(JM_CONFIG));

 	if (GPU_ID2_MODEL_MATCH_VALUE(prod_id) == GPU_ID2_PRODUCT_TMIX) {
 		/* Only for tMIx */
@@ -2051,39 +2305,38 @@ static int kbase_set_jm_quirks(struct kbase_device *kbdev, const u32 prod_id)
 		 */
 		if (coherency_features ==
 				COHERENCY_FEATURE_BIT(COHERENCY_ACE)) {
-			hw_quirks_jm |= (COHERENCY_ACE_LITE |
-					COHERENCY_ACE) <<
-					JM_FORCE_COHERENCY_FEATURES_SHIFT;
+			hw_quirks_gpu |= (COHERENCY_ACE_LITE | COHERENCY_ACE)
+					 << JM_FORCE_COHERENCY_FEATURES_SHIFT;
 		}
 	}

 	if (kbase_is_gpu_removed(kbdev))
 		return -EIO;

-	kbdev->hw_quirks_jm = hw_quirks_jm;
+	kbdev->hw_quirks_gpu = hw_quirks_gpu;

 #endif /* !MALI_USE_CSF */
 	if (kbase_hw_has_feature(kbdev, BASE_HW_FEATURE_IDVS_GROUP_SIZE)) {
 		int default_idvs_group_size = 0xF;
-		u32 tmp;
+		u32 group_size = 0;

-		if (of_property_read_u32(kbdev->dev->of_node,
-					"idvs-group-size", &tmp))
-			tmp = default_idvs_group_size;
+		if (of_property_read_u32(kbdev->dev->of_node, "idvs-group-size",
+					 &group_size))
+			group_size = default_idvs_group_size;

-		if (tmp > IDVS_GROUP_MAX_SIZE) {
+		if (group_size > IDVS_GROUP_MAX_SIZE) {
 			dev_err(kbdev->dev,
 				"idvs-group-size of %d is too large. Maximum value is %d",
-				tmp, IDVS_GROUP_MAX_SIZE);
-			tmp = default_idvs_group_size;
+				group_size, IDVS_GROUP_MAX_SIZE);
+			group_size = default_idvs_group_size;
 		}

-		kbdev->hw_quirks_jm |= tmp << IDVS_GROUP_SIZE_SHIFT;
+		kbdev->hw_quirks_gpu |= group_size << IDVS_GROUP_SIZE_SHIFT;
 	}

 #define MANUAL_POWER_CONTROL ((u32)(1 << 8))
 	if (corestack_driver_control)
-		kbdev->hw_quirks_jm |= MANUAL_POWER_CONTROL;
+		kbdev->hw_quirks_gpu |= MANUAL_POWER_CONTROL;

 	return 0;
 }
@@ -2137,18 +2390,17 @@ static int kbase_pm_hw_issues_detect(struct kbase_device *kbdev)
 				GPU_ID_VERSION_PRODUCT_ID_SHIFT;
 	int error = 0;

-	kbdev->hw_quirks_jm = 0;
+	kbdev->hw_quirks_gpu = 0;
 	kbdev->hw_quirks_sc = 0;
 	kbdev->hw_quirks_tiler = 0;
 	kbdev->hw_quirks_mmu = 0;

-	if (!of_property_read_u32(np, "quirks_jm",
-				&kbdev->hw_quirks_jm)) {
+	if (!of_property_read_u32(np, "quirks_gpu", &kbdev->hw_quirks_gpu)) {
 		dev_info(kbdev->dev,
-			"Found quirks_jm = [0x%x] in Devicetree\n",
-			kbdev->hw_quirks_jm);
+			 "Found quirks_gpu = [0x%x] in Devicetree\n",
+			 kbdev->hw_quirks_gpu);
 	} else {
-		error = kbase_set_jm_quirks(kbdev, prod_id);
+		error = kbase_set_gpu_quirks(kbdev, prod_id);
 		if (error)
 			return error;
 	}
@@ -2199,10 +2451,10 @@ static void kbase_pm_hw_issues_apply(struct kbase_device *kbdev)
 			kbdev->hw_quirks_mmu);
 #if MALI_USE_CSF
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(CSF_CONFIG),
-			kbdev->hw_quirks_jm);
+			kbdev->hw_quirks_gpu);
 #else
 	kbase_reg_write(kbdev, GPU_CONTROL_REG(JM_CONFIG),
-			kbdev->hw_quirks_jm);
+			kbdev->hw_quirks_gpu);
 #endif
 }

@@ -2233,6 +2485,7 @@ void kbase_pm_cache_snoop_disable(struct kbase_device *kbdev)
 	}
 }

+#if !MALI_USE_CSF
 static void reenable_protected_mode_hwcnt(struct kbase_device *kbdev)
 {
 	unsigned long irq_flags;
@@ -2245,6 +2498,7 @@ static void reenable_protected_mode_hwcnt(struct kbase_device *kbdev)
 	}
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, irq_flags);
 }
+#endif

 static int kbase_pm_do_reset(struct kbase_device *kbdev)
 {
@@ -2271,7 +2525,7 @@ static int kbase_pm_do_reset(struct kbase_device *kbdev)

 	/* Initialize a structure for tracking the status of the reset */
 	rtdata.kbdev = kbdev;
-	rtdata.timed_out = 0;
+	rtdata.timed_out = false;

 	/* Create a timer to use as a timeout on the reset */
 	hrtimer_init_on_stack(&rtdata.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
@@ -2283,7 +2537,7 @@ static int kbase_pm_do_reset(struct kbase_device *kbdev)
 	/* Wait for the RESET_COMPLETED interrupt to be raised */
 	kbase_pm_wait_for_reset(kbdev);

-	if (rtdata.timed_out == 0) {
+	if (!rtdata.timed_out) {
 		/* GPU has been reset */
 		hrtimer_cancel(&rtdata.timer);
 		destroy_hrtimer_on_stack(&rtdata.timer);
@@ -2291,11 +2545,13 @@ static int kbase_pm_do_reset(struct kbase_device *kbdev)
 	}

 	/* No interrupt has been received - check if the RAWSTAT register says
-	 * the reset has completed */
+	 * the reset has completed
+	 */
 	if ((kbase_reg_read(kbdev, GPU_CONTROL_REG(GPU_IRQ_RAWSTAT)) &
 							RESET_COMPLETED)) {
 		/* The interrupt is set in the RAWSTAT; this suggests that the
-		 * interrupts are not getting to the CPU */
+		 * interrupts are not getting to the CPU
+		 */
 		dev_err(kbdev->dev, "Reset interrupt didn't reach CPU. Check interrupt assignments.\n");
 		/* If interrupts aren't working we can't continue. */
 		destroy_hrtimer_on_stack(&rtdata.timer);
@@ -2309,33 +2565,40 @@ static int kbase_pm_do_reset(struct kbase_device *kbdev)
 	}

 	/* The GPU doesn't seem to be responding to the reset so try a hard
-	 * reset */
-	dev_err(kbdev->dev, "Failed to soft-reset GPU (timed out after %d ms), now attempting a hard reset\n",
-								RESET_TIMEOUT);
-	KBASE_KTRACE_ADD(kbdev, CORE_GPU_HARD_RESET, NULL, 0);
-	kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
-						GPU_COMMAND_HARD_RESET);
+	 * reset, but only when NOT in arbitration mode.
+	 */
+#ifdef CONFIG_MALI_ARBITER_SUPPORT
+	if (!kbdev->arb.arb_if) {
+#endif /* CONFIG_MALI_ARBITER_SUPPORT */
+		dev_err(kbdev->dev, "Failed to soft-reset GPU (timed out after %d ms), now attempting a hard reset\n",
+					RESET_TIMEOUT);
+		KBASE_KTRACE_ADD(kbdev, CORE_GPU_HARD_RESET, NULL, 0);
+		kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
+					GPU_COMMAND_HARD_RESET);

-	/* Restart the timer to wait for the hard reset to complete */
-	rtdata.timed_out = 0;
+		/* Restart the timer to wait for the hard reset to complete */
+		rtdata.timed_out = false;

-	hrtimer_start(&rtdata.timer, HR_TIMER_DELAY_MSEC(RESET_TIMEOUT),
-							HRTIMER_MODE_REL);
+		hrtimer_start(&rtdata.timer, HR_TIMER_DELAY_MSEC(RESET_TIMEOUT),
+					HRTIMER_MODE_REL);

-	/* Wait for the RESET_COMPLETED interrupt to be raised */
-	kbase_pm_wait_for_reset(kbdev);
+		/* Wait for the RESET_COMPLETED interrupt to be raised */
+		kbase_pm_wait_for_reset(kbdev);
+
+		if (!rtdata.timed_out) {
+			/* GPU has been reset */
+			hrtimer_cancel(&rtdata.timer);
+			destroy_hrtimer_on_stack(&rtdata.timer);
+			return 0;
+		}

-	if (rtdata.timed_out == 0) {
-		/* GPU has been reset */
-		hrtimer_cancel(&rtdata.timer);
 		destroy_hrtimer_on_stack(&rtdata.timer);
-		return 0;
+
+		dev_err(kbdev->dev, "Failed to hard-reset the GPU (timed out after %d ms)\n",
+					RESET_TIMEOUT);
+#ifdef CONFIG_MALI_ARBITER_SUPPORT
 	}
-
-	destroy_hrtimer_on_stack(&rtdata.timer);
-
-	dev_err(kbdev->dev, "Failed to hard-reset the GPU (timed out after %d ms)\n",
-								RESET_TIMEOUT);
+#endif /* CONFIG_MALI_ARBITER_SUPPORT */

 	return -EINVAL;
 }
@@ -2359,7 +2622,7 @@ int kbase_pm_init_hw(struct kbase_device *kbdev, unsigned int flags)
 	unsigned long irq_flags;
 	int err = 0;

-	KBASE_DEBUG_ASSERT(NULL != kbdev);
+	KBASE_DEBUG_ASSERT(kbdev != NULL);
 	lockdep_assert_held(&kbdev->pm.lock);

 	/* Ensure the clock is on before attempting to access the hardware */
@@ -2371,7 +2634,8 @@ int kbase_pm_init_hw(struct kbase_device *kbdev, unsigned int flags)
 	}

 	/* Ensure interrupts are off to begin with, this also clears any
-	 * outstanding interrupts */
+	 * outstanding interrupts
+	 */
 	kbase_pm_disable_interrupts(kbdev);
 	/* Ensure cache snoops are disabled before reset. */
 	kbase_pm_cache_snoop_disable(kbdev);
@@ -2392,6 +2656,17 @@ int kbase_pm_init_hw(struct kbase_device *kbdev, unsigned int flags)
 				kbdev->protected_dev);

 	spin_lock_irqsave(&kbdev->hwaccess_lock, irq_flags);
+#if MALI_USE_CSF
+	if (kbdev->protected_mode) {
+		unsigned long flags;
+
+		kbase_ipa_control_protm_exited(kbdev);
+
+		kbase_csf_scheduler_spin_lock(kbdev, &flags);
+		kbase_hwcnt_backend_csf_protm_exited(&kbdev->hwcnt_gpu_iface);
+		kbase_csf_scheduler_spin_unlock(kbdev, flags);
+	}
+#endif
 	kbdev->protected_mode = false;
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, irq_flags);

@@ -2412,7 +2687,8 @@ int kbase_pm_init_hw(struct kbase_device *kbdev, unsigned int flags)
 			GPU_STATUS_PROTECTED_MODE_ACTIVE);

 	/* If cycle counter was in use re-enable it, enable_irqs will only be
-	 * false when called from kbase_pm_powerup */
+	 * false when called from kbase_pm_powerup
+	 */
 	if (kbdev->pm.backend.gpu_cycle_counter_requests &&
 						(flags & PM_ENABLE_IRQS)) {
 		kbase_pm_enable_interrupts(kbdev);
@@ -2435,12 +2711,14 @@ int kbase_pm_init_hw(struct kbase_device *kbdev, unsigned int flags)
 		kbase_pm_enable_interrupts(kbdev);

 exit:
+#if !MALI_USE_CSF
 	if (!kbdev->pm.backend.protected_entry_transition_override) {
 		/* Re-enable GPU hardware counters if we're resetting from
 		 * protected mode.
 		 */
 		reenable_protected_mode_hwcnt(kbdev);
 	}
+#endif

 	return err;
 }
@@ -2467,12 +2745,22 @@ kbase_pm_request_gpu_cycle_counter_do_request(struct kbase_device *kbdev)

 	spin_lock_irqsave(&kbdev->pm.backend.gpu_cycle_counter_requests_lock,
 									flags);
-
 	++kbdev->pm.backend.gpu_cycle_counter_requests;

-	if (1 == kbdev->pm.backend.gpu_cycle_counter_requests)
+	if (kbdev->pm.backend.gpu_cycle_counter_requests == 1)
 		kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
 					GPU_COMMAND_CYCLE_COUNT_START);
+	else {
+		/* This might happen after GPU reset.
+		 * Then counter needs to be kicked.
+		 */
+		if (!IS_ENABLED(CONFIG_MALI_BIFROST_NO_MALI) &&
+		    (!(kbase_reg_read(kbdev, GPU_CONTROL_REG(GPU_STATUS)) &
+		       GPU_STATUS_CYCLE_COUNT_ACTIVE))) {
+			kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
+					GPU_COMMAND_CYCLE_COUNT_START);
+		}
+	}

 	spin_unlock_irqrestore(
 			&kbdev->pm.backend.gpu_cycle_counter_requests_lock,
@@ -2488,6 +2776,8 @@ void kbase_pm_request_gpu_cycle_counter(struct kbase_device *kbdev)
 	KBASE_DEBUG_ASSERT(kbdev->pm.backend.gpu_cycle_counter_requests <
 								INT_MAX);

+	kbase_pm_wait_for_l2_powered(kbdev);
+
 	kbase_pm_request_gpu_cycle_counter_do_request(kbdev);
 }

@@ -2522,7 +2812,7 @@ void kbase_pm_release_gpu_cycle_counter_nolock(struct kbase_device *kbdev)

 	--kbdev->pm.backend.gpu_cycle_counter_requests;

-	if (0 == kbdev->pm.backend.gpu_cycle_counter_requests)
+	if (kbdev->pm.backend.gpu_cycle_counter_requests == 0)
 		kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
 					GPU_COMMAND_CYCLE_COUNT_STOP);

--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_internal.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_internal.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2010-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2010-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * Power management API definitions used internally by GPU backend
 */
@@ -227,6 +224,7 @@ void kbase_pm_reset_done(struct kbase_device *kbdev);
 *
 * Return: 0 on success, error code on error
 */
+int kbase_pm_wait_for_desired_state(struct kbase_device *kbdev);
 #else
 /**
 * kbase_pm_wait_for_desired_state - Wait for the desired power state to be
@@ -250,8 +248,8 @@ void kbase_pm_reset_done(struct kbase_device *kbdev);
 *
 * Return: 0 on success, error code on error
 */
-#endif
 int kbase_pm_wait_for_desired_state(struct kbase_device *kbdev);
+#endif

 /**
 * kbase_pm_wait_for_l2_powered - Wait for the L2 cache to be powered on
@@ -492,7 +490,8 @@ void kbase_pm_register_access_enable(struct kbase_device *kbdev);
 void kbase_pm_register_access_disable(struct kbase_device *kbdev);

 /* NOTE: kbase_pm_is_suspending is in mali_kbase.h, because it is an inline
- * function */
+ * function
+ */

 /**
 * kbase_pm_metrics_is_active - Check if the power management metrics
@@ -536,8 +535,22 @@ void kbase_pm_get_dvfs_metrics(struct kbase_device *kbdev,

 #ifdef CONFIG_MALI_BIFROST_DVFS

+#if MALI_USE_CSF
 /**
- * kbase_platform_dvfs_event - Report utilisation to DVFS code
+ * kbase_platform_dvfs_event - Report utilisation to DVFS code for CSF GPU
+ *
+ * Function provided by platform specific code when DVFS is enabled to allow
+ * the power management metrics system to report utilisation.
+ *
+ * @kbdev:         The kbase device structure for the device (must be a
+ *                 valid pointer)
+ * @utilisation:   The current calculated utilisation by the metrics system.
+ * Return:         Returns 0 on failure and non zero on success.
+ */
+int kbase_platform_dvfs_event(struct kbase_device *kbdev, u32 utilisation);
+#else
+/**
+ * kbase_platform_dvfs_event - Report utilisation to DVFS code for JM GPU
 *
 * Function provided by platform specific code when DVFS is enabled to allow
 * the power management metrics system to report utilisation.
@@ -550,11 +563,12 @@ void kbase_pm_get_dvfs_metrics(struct kbase_device *kbdev,
 *                 group.
 * Return:         Returns 0 on failure and non zero on success.
 */
-
 int kbase_platform_dvfs_event(struct kbase_device *kbdev, u32 utilisation,
-	u32 util_gl_share, u32 util_cl_share[2]);
+			      u32 util_gl_share, u32 util_cl_share[2]);
 #endif

+#endif /* CONFIG_MALI_BIFROST_DVFS */
+
 void kbase_pm_power_changed(struct kbase_device *kbdev);

 /**
@@ -708,6 +722,72 @@ extern bool corestack_driver_control;
 */
 bool kbase_pm_is_l2_desired(struct kbase_device *kbdev);

+#if MALI_USE_CSF
+/**
+ * kbase_pm_is_mcu_desired - Check whether MCU is desired
+ *
+ * @kbdev: Device pointer
+ *
+ * This shall be called to check whether MCU needs to be enabled.
+ *
+ * Return: true if MCU needs to be enabled.
+ */
+bool kbase_pm_is_mcu_desired(struct kbase_device *kbdev);
+
+/**
+ * kbase_pm_idle_groups_sched_suspendable - Check whether the scheduler can be
+ *                                        suspended to low power state when all
+ *                                        the CSGs are idle
+ *
+ * @kbdev: Device pointer
+ *
+ * Return: true if allowed to enter the suspended state.
+ */
+static inline
+bool kbase_pm_idle_groups_sched_suspendable(struct kbase_device *kbdev)
+{
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	return !(kbdev->pm.backend.csf_pm_sched_flags &
+		 CSF_DYNAMIC_PM_SCHED_IGNORE_IDLE);
+}
+
+/**
+ * kbase_pm_no_runnables_sched_suspendable - Check whether the scheduler can be
+ *                                        suspended to low power state when
+ *                                        there are no runnable CSGs.
+ *
+ * @kbdev: Device pointer
+ *
+ * Return: true if allowed to enter the suspended state.
+ */
+static inline
+bool kbase_pm_no_runnables_sched_suspendable(struct kbase_device *kbdev)
+{
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	return !(kbdev->pm.backend.csf_pm_sched_flags &
+		 CSF_DYNAMIC_PM_SCHED_NO_SUSPEND);
+}
+
+/**
+ * kbase_pm_no_mcu_core_pwroff - Check whether the PM is required to keep the
+ *                               MCU core powered in accordance to the active
+ *                               power management policy
+ *
+ * @kbdev: Device pointer
+ *
+ * Return: true if the MCU is to retain powered.
+ */
+static inline bool kbase_pm_no_mcu_core_pwroff(struct kbase_device *kbdev)
+{
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	return kbdev->pm.backend.csf_pm_sched_flags &
+		CSF_DYNAMIC_PM_CORE_KEEP_ON;
+}
+#endif
+
 /**
 * kbase_pm_lock - Lock all necessary mutexes to perform PM actions
 *
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_l2_states.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_l2_states.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -25,6 +24,19 @@
 * The function-like macro KBASEP_L2_STATE() must be defined before including
 * this header file. This header file can be included multiple times in the
 * same compilation unit with different definitions of KBASEP_L2_STATE().
+ *
+ * @OFF:              The L2 cache and tiler are off
+ * @PEND_ON:          The L2 cache and tiler are powering on
+ * @RESTORE_CLOCKS:   The GPU clock is restored. Conditionally used.
+ * @ON_HWCNT_ENABLE:  The L2 cache and tiler are on, and hwcnt is being enabled
+ * @ON:               The L2 cache and tiler are on, and hwcnt is enabled
+ * @ON_HWCNT_DISABLE: The L2 cache and tiler are on, and hwcnt is being disabled
+ * @SLOW_DOWN_CLOCKS: The GPU clock is set to appropriate or lowest clock.
+ *                    Conditionally used.
+ * @POWER_DOWN:       The L2 cache and tiler are about to be powered off
+ * @PEND_OFF:         The L2 cache and tiler are powering off
+ * @RESET_WAIT:       The GPU is resetting, L2 cache and tiler power state are
+ *                    unknown
 */
 KBASEP_L2_STATE(OFF)
 KBASEP_L2_STATE(PEND_ON)
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_mcu_states.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_mcu_states.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -25,15 +24,40 @@
 * The function-like macro KBASEP_MCU_STATE() must be defined before including
 * this header file. This header file can be included multiple times in the
 * same compilation unit with different definitions of KBASEP_MCU_STATE().
+ *
+ * @OFF:                      The MCU is powered off.
+ * @PEND_ON_RELOAD:           The warm boot of MCU or cold boot of MCU (with
+ *                            firmware reloading) is in progress.
+ * @ON_GLB_REINIT_PEND:       The MCU is enabled and Global configuration
+ *                            requests have been sent to the firmware.
+ * @ON_HWCNT_ENABLE:          The Global requests have completed and MCU is now
+ *                            ready for use and hwcnt is being enabled.
+ * @ON:                       The MCU is active and hwcnt has been enabled.
+ * @ON_CORE_ATTR_UPDATE_PEND: The MCU is active and mask of enabled shader cores
+ *                            is being updated.
+ * @ON_HWCNT_DISABLE:         The MCU is on and hwcnt is being disabled.
+ * @ON_HALT:                  The MCU is on and hwcnt has been disabled, MCU
+ *                            halt would be triggered.
+ * @ON_PEND_HALT:             MCU halt in progress, confirmation pending.
+ * @POWER_DOWN:               MCU halted operations, pending being disabled.
+ * @PEND_OFF:                 MCU is being disabled, pending on powering off.
+ * @RESET_WAIT:               The GPU is resetting, MCU state is unknown.
 */
 KBASEP_MCU_STATE(OFF)
 KBASEP_MCU_STATE(PEND_ON_RELOAD)
 KBASEP_MCU_STATE(ON_GLB_REINIT_PEND)
 KBASEP_MCU_STATE(ON_HWCNT_ENABLE)
 KBASEP_MCU_STATE(ON)
+KBASEP_MCU_STATE(ON_CORE_ATTR_UPDATE_PEND)
 KBASEP_MCU_STATE(ON_HWCNT_DISABLE)
 KBASEP_MCU_STATE(ON_HALT)
 KBASEP_MCU_STATE(ON_PEND_HALT)
 KBASEP_MCU_STATE(POWER_DOWN)
 KBASEP_MCU_STATE(PEND_OFF)
 KBASEP_MCU_STATE(RESET_WAIT)
+/* Additional MCU states with HOST_CONTROL_SHADERS */
+KBASEP_MCU_STATE(HCTL_SHADERS_PEND_ON)
+KBASEP_MCU_STATE(HCTL_CORES_NOTIFY_PEND)
+KBASEP_MCU_STATE(HCTL_MCU_ON_RECHECK)
+KBASEP_MCU_STATE(HCTL_SHADERS_READY_OFF)
+KBASEP_MCU_STATE(HCTL_SHADERS_PEND_OFF)
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_metrics.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_metrics.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2011-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2011-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,12 +17,8 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-
-
 /*
 * Metrics for power management
 */
@@ -29,24 +26,28 @@
 #include <mali_kbase.h>
 #include <mali_kbase_pm.h>
 #include <backend/gpu/mali_kbase_pm_internal.h>
-#if !MALI_USE_CSF
+
+#if MALI_USE_CSF
+#include "mali_kbase_clk_rate_trace_mgr.h"
+#include <csf/ipa_control/mali_kbase_csf_ipa_control.h>
+#else
 #include <backend/gpu/mali_kbase_jm_rb.h>
 #endif /* !MALI_USE_CSF */
+
 #include <backend/gpu/mali_kbase_pm_defs.h>
 #include <mali_linux_trace.h>

-/* When VSync is being hit aim for utilisation between 70-90% */
-#define KBASE_PM_VSYNC_MIN_UTILISATION          70
-#define KBASE_PM_VSYNC_MAX_UTILISATION          90
-/* Otherwise aim for 10-40% */
-#define KBASE_PM_NO_VSYNC_MIN_UTILISATION       10
-#define KBASE_PM_NO_VSYNC_MAX_UTILISATION       40
-
 /* Shift used for kbasep_pm_metrics_data.time_busy/idle - units of (1 << 8) ns
 * This gives a maximum period between samples of 2^(32+8)/100 ns = slightly
- * under 11s. Exceeding this will cause overflow */
+ * under 11s. Exceeding this will cause overflow
+ */
 #define KBASE_PM_TIME_SHIFT			8

+#if MALI_USE_CSF
+/* To get the GPU_ACTIVE value in nano seconds unit */
+#define GPU_ACTIVE_SCALING_FACTOR ((u64)1E9)
+#endif
+
 #ifdef CONFIG_MALI_BIFROST_DVFS
 static enum hrtimer_restart dvfs_callback(struct hrtimer *timer)
 {
@@ -73,11 +74,45 @@ static enum hrtimer_restart dvfs_callback(struct hrtimer *timer)

 int kbasep_pm_metrics_init(struct kbase_device *kbdev)
 {
+#if MALI_USE_CSF
+	struct kbase_ipa_control_perf_counter perf_counter;
+	int err;
+
+	/* One counter group */
+	const size_t NUM_PERF_COUNTERS = 1;
+
 	KBASE_DEBUG_ASSERT(kbdev != NULL);
-
 	kbdev->pm.backend.metrics.kbdev = kbdev;
-
 	kbdev->pm.backend.metrics.time_period_start = ktime_get();
+	kbdev->pm.backend.metrics.values.time_busy = 0;
+	kbdev->pm.backend.metrics.values.time_idle = 0;
+	kbdev->pm.backend.metrics.values.time_in_protm = 0;
+
+	perf_counter.scaling_factor = GPU_ACTIVE_SCALING_FACTOR;
+
+	/* Normalize values by GPU frequency */
+	perf_counter.gpu_norm = true;
+
+	/* We need the GPU_ACTIVE counter, which is in the CSHW group */
+	perf_counter.type = KBASE_IPA_CORE_TYPE_CSHW;
+
+	/* We need the GPU_ACTIVE counter */
+	perf_counter.idx = GPU_ACTIVE_CNT_IDX;
+
+	err = kbase_ipa_control_register(
+		kbdev, &perf_counter, NUM_PERF_COUNTERS,
+		&kbdev->pm.backend.metrics.ipa_control_client);
+	if (err) {
+		dev_err(kbdev->dev,
+			"Failed to register IPA with kbase_ipa_control: err=%d",
+			err);
+		return -1;
+	}
+#else
+	KBASE_DEBUG_ASSERT(kbdev != NULL);
+	kbdev->pm.backend.metrics.kbdev = kbdev;
+	kbdev->pm.backend.metrics.time_period_start = ktime_get();
+
 	kbdev->pm.backend.metrics.gpu_active = false;
 	kbdev->pm.backend.metrics.active_cl_ctx[0] = 0;
 	kbdev->pm.backend.metrics.active_cl_ctx[1] = 0;
@@ -91,16 +126,25 @@ int kbasep_pm_metrics_init(struct kbase_device *kbdev)
 	kbdev->pm.backend.metrics.values.busy_cl[1] = 0;
 	kbdev->pm.backend.metrics.values.busy_gl = 0;

+#endif
 	spin_lock_init(&kbdev->pm.backend.metrics.lock);

 #ifdef CONFIG_MALI_BIFROST_DVFS
 	hrtimer_init(&kbdev->pm.backend.metrics.timer, CLOCK_MONOTONIC,
 							HRTIMER_MODE_REL);
 	kbdev->pm.backend.metrics.timer.function = dvfs_callback;
-
+	kbdev->pm.backend.metrics.initialized = true;
 	kbase_pm_metrics_start(kbdev);
 #endif /* CONFIG_MALI_BIFROST_DVFS */

+#if MALI_USE_CSF
+	/* The sanity check on the GPU_ACTIVE performance counter
+	 * is skipped for Juno platforms that have timing problems.
+	 */
+	kbdev->pm.backend.metrics.skip_gpu_active_sanity_check =
+		of_machine_is_compatible("arm,juno");
+#endif
+
 	return 0;
 }
 KBASE_EXPORT_TEST_API(kbasep_pm_metrics_init);
@@ -117,7 +161,13 @@ void kbasep_pm_metrics_term(struct kbase_device *kbdev)
 	spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);

 	hrtimer_cancel(&kbdev->pm.backend.metrics.timer);
+	kbdev->pm.backend.metrics.initialized = false;
 #endif /* CONFIG_MALI_BIFROST_DVFS */
+
+#if MALI_USE_CSF
+	kbase_ipa_control_unregister(
+		kbdev, kbdev->pm.backend.metrics.ipa_control_client);
+#endif
 }

 KBASE_EXPORT_TEST_API(kbasep_pm_metrics_term);
@@ -125,8 +175,121 @@ KBASE_EXPORT_TEST_API(kbasep_pm_metrics_term);
 /* caller needs to hold kbdev->pm.backend.metrics.lock before calling this
 * function
 */
+#if MALI_USE_CSF
+#if defined(CONFIG_MALI_BIFROST_DEVFREQ) || defined(CONFIG_MALI_BIFROST_DVFS)
+static void kbase_pm_get_dvfs_utilisation_calc(struct kbase_device *kbdev)
+{
+	int err;
+	u64 gpu_active_counter;
+	u64 protected_time;
+	ktime_t now;
+
+	lockdep_assert_held(&kbdev->pm.backend.metrics.lock);
+
+	/* Query IPA_CONTROL for the latest GPU-active and protected-time
+	 * info.
+	 */
+	err = kbase_ipa_control_query(
+		kbdev, kbdev->pm.backend.metrics.ipa_control_client,
+		&gpu_active_counter, 1, &protected_time);
+
+	/* Read the timestamp after reading the GPU_ACTIVE counter value.
+	 * This ensures the time gap between the 2 reads is consistent for
+	 * a meaningful comparison between the increment of GPU_ACTIVE and
+	 * elapsed time. The lock taken inside kbase_ipa_control_query()
+	 * function can cause lot of variation.
+	 */
+	now = ktime_get();
+
+	if (err) {
+		dev_err(kbdev->dev,
+			"Failed to query the increment of GPU_ACTIVE counter: err=%d",
+			err);
+	} else {
+		u64 diff_ns, margin_ns;
+		s64 diff_ns_signed;
+		u32 ns_time;
+		ktime_t diff = ktime_sub(
+			now, kbdev->pm.backend.metrics.time_period_start);
+
+		diff_ns_signed = ktime_to_ns(diff);
+
+		if (diff_ns_signed < 0)
+			return;
+
+		diff_ns = (u64)diff_ns_signed;
+
+		/* Use a margin value that is approximately 1% of the time
+		 * difference.
+		 */
+		margin_ns = diff_ns >> 6;
+
+		/* Calculate time difference in units of 256ns */
+		ns_time = (u32)(diff_ns >> KBASE_PM_TIME_SHIFT);
+
+#ifndef CONFIG_MALI_BIFROST_NO_MALI
+		/* The GPU_ACTIVE counter shouldn't clock-up more time than has
+		 * actually elapsed - but still some margin needs to be given
+		 * when doing the comparison. There could be some drift between
+		 * the CPU and GPU clock.
+		 *
+		 * Can do the check only in a real driver build, as an arbitrary
+		 * value for GPU_ACTIVE can be fed into dummy model in no_mali
+		 * configuration which may not correspond to the real elapsed
+		 * time.
+		 */
+		if (!kbdev->pm.backend.metrics.skip_gpu_active_sanity_check) {
+			if (gpu_active_counter > (diff_ns + margin_ns)) {
+				dev_info(
+					kbdev->dev,
+					"GPU activity takes longer than time interval: %llu ns > %llu ns",
+					(unsigned long long)gpu_active_counter,
+					(unsigned long long)diff_ns);
+			}
+		}
+#else
+		CSTD_UNUSED(margin_ns);
+#endif
+
+		/* Add protected_time to gpu_active_counter so that time in
+		 * protected mode is included in the apparent GPU active time,
+		 * then convert it from units of 1ns to units of 256ns, to
+		 * match what JM GPUs use. The assumption is made here that the
+		 * GPU is 100% busy while in protected mode, so we should add
+		 * this since the GPU can't (and thus won't) update these
+		 * counters while it's actually in protected mode.
+		 *
+		 * Perform the add after dividing each value down, to reduce
+		 * the chances of overflows.
+		 */
+		protected_time >>= KBASE_PM_TIME_SHIFT;
+		gpu_active_counter >>= KBASE_PM_TIME_SHIFT;
+		gpu_active_counter += protected_time;
+
+		/* Ensure the following equations don't go wrong if ns_time is
+		 * slightly larger than gpu_active_counter somehow
+		 */
+		gpu_active_counter = MIN(gpu_active_counter, ns_time);
+
+		kbdev->pm.backend.metrics.values.time_busy +=
+			gpu_active_counter;
+
+		kbdev->pm.backend.metrics.values.time_idle +=
+			ns_time - gpu_active_counter;
+
+		/* Also make time in protected mode available explicitly,
+		 * so users of this data have this info, too.
+		 */
+		kbdev->pm.backend.metrics.values.time_in_protm +=
+			protected_time;
+	}
+
+	kbdev->pm.backend.metrics.time_period_start = now;
+}
+#endif /* defined(CONFIG_MALI_BIFROST_DEVFREQ) || defined(CONFIG_MALI_BIFROST_DVFS) */
+#else
 static void kbase_pm_get_dvfs_utilisation_calc(struct kbase_device *kbdev,
-								ktime_t now)
+					       ktime_t now)
 {
 	ktime_t diff;

@@ -151,12 +314,13 @@ static void kbase_pm_get_dvfs_utilisation_calc(struct kbase_device *kbdev,
 		if (kbdev->pm.backend.metrics.active_gl_ctx[2])
 			kbdev->pm.backend.metrics.values.busy_gl += ns_time;
 	} else {
-		kbdev->pm.backend.metrics.values.time_idle += (u32) (ktime_to_ns(diff)
-							>> KBASE_PM_TIME_SHIFT);
+		kbdev->pm.backend.metrics.values.time_idle +=
+			(u32)(ktime_to_ns(diff) >> KBASE_PM_TIME_SHIFT);
 	}

 	kbdev->pm.backend.metrics.time_period_start = now;
 }
+#endif  /* MALI_USE_CSF */

 #if defined(CONFIG_MALI_BIFROST_DEVFREQ) || defined(CONFIG_MALI_BIFROST_DVFS)
 void kbase_pm_get_dvfs_metrics(struct kbase_device *kbdev,
@@ -167,14 +331,23 @@ void kbase_pm_get_dvfs_metrics(struct kbase_device *kbdev,
 	unsigned long flags;

 	spin_lock_irqsave(&kbdev->pm.backend.metrics.lock, flags);
+#if MALI_USE_CSF
+	kbase_pm_get_dvfs_utilisation_calc(kbdev);
+#else
 	kbase_pm_get_dvfs_utilisation_calc(kbdev, ktime_get());
+#endif

 	memset(diff, 0, sizeof(*diff));
 	diff->time_busy = cur->time_busy - last->time_busy;
 	diff->time_idle = cur->time_idle - last->time_idle;
+
+#if MALI_USE_CSF
+	diff->time_in_protm = cur->time_in_protm - last->time_in_protm;
+#else
 	diff->busy_cl[0] = cur->busy_cl[0] - last->busy_cl[0];
 	diff->busy_cl[1] = cur->busy_cl[1] - last->busy_cl[1];
 	diff->busy_gl = cur->busy_gl - last->busy_gl;
+#endif

 	*last = *cur;

@@ -186,26 +359,42 @@ KBASE_EXPORT_TEST_API(kbase_pm_get_dvfs_metrics);
 #ifdef CONFIG_MALI_BIFROST_DVFS
 void kbase_pm_get_dvfs_action(struct kbase_device *kbdev)
 {
-	int utilisation, util_gl_share;
-	int util_cl_share[2];
-	int busy;
+	int utilisation;
 	struct kbasep_pm_metrics *diff;
+#if !MALI_USE_CSF
+	int busy;
+	int util_gl_share;
+	int util_cl_share[2];
+#endif

 	KBASE_DEBUG_ASSERT(kbdev != NULL);

 	diff = &kbdev->pm.backend.metrics.dvfs_diff;

-	kbase_pm_get_dvfs_metrics(kbdev, &kbdev->pm.backend.metrics.dvfs_last, diff);
+	kbase_pm_get_dvfs_metrics(kbdev, &kbdev->pm.backend.metrics.dvfs_last,
+				  diff);

 	utilisation = (100 * diff->time_busy) /
 			max(diff->time_busy + diff->time_idle, 1u);

+#if !MALI_USE_CSF
 	busy = max(diff->busy_gl + diff->busy_cl[0] + diff->busy_cl[1], 1u);
+
 	util_gl_share = (100 * diff->busy_gl) / busy;
 	util_cl_share[0] = (100 * diff->busy_cl[0]) / busy;
 	util_cl_share[1] = (100 * diff->busy_cl[1]) / busy;

-	kbase_platform_dvfs_event(kbdev, utilisation, util_gl_share, util_cl_share);
+	kbase_platform_dvfs_event(kbdev, utilisation, util_gl_share,
+				  util_cl_share);
+#else
+	/* Note that, at present, we don't pass protected-mode time to the
+	 * platform here. It's unlikely to be useful, however, as the platform
+	 * probably just cares whether the GPU is busy or not; time in
+	 * protected mode is already added to busy-time at this point, though,
+	 * so we should be good.
+	 */
+	kbase_platform_dvfs_event(kbdev, utilisation);
+#endif
 }

 bool kbase_pm_metrics_is_active(struct kbase_device *kbdev)
@@ -226,11 +415,20 @@ KBASE_EXPORT_TEST_API(kbase_pm_metrics_is_active);
 void kbase_pm_metrics_start(struct kbase_device *kbdev)
 {
 	unsigned long flags;
+	bool update = true;
+
+	if (unlikely(!kbdev->pm.backend.metrics.initialized))
+		return;

 	spin_lock_irqsave(&kbdev->pm.backend.metrics.lock, flags);
-	kbdev->pm.backend.metrics.timer_active = true;
+	if (!kbdev->pm.backend.metrics.timer_active)
+		kbdev->pm.backend.metrics.timer_active = true;
+	else
+		update = false;
 	spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);
-	hrtimer_start(&kbdev->pm.backend.metrics.timer,
+
+	if (update)
+		hrtimer_start(&kbdev->pm.backend.metrics.timer,
 			HR_TIMER_DELAY_MSEC(kbdev->pm.dvfs_period),
 			HRTIMER_MODE_REL);
 }
@@ -238,11 +436,20 @@ void kbase_pm_metrics_start(struct kbase_device *kbdev)
 void kbase_pm_metrics_stop(struct kbase_device *kbdev)
 {
 	unsigned long flags;
+	bool update = true;
+
+	if (unlikely(!kbdev->pm.backend.metrics.initialized))
+		return;

 	spin_lock_irqsave(&kbdev->pm.backend.metrics.lock, flags);
-	kbdev->pm.backend.metrics.timer_active = false;
+	if (kbdev->pm.backend.metrics.timer_active)
+		kbdev->pm.backend.metrics.timer_active = false;
+	else
+		update = false;
 	spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);
-	hrtimer_cancel(&kbdev->pm.backend.metrics.timer);
+
+	if (update)
+		hrtimer_cancel(&kbdev->pm.backend.metrics.timer);
 }


@@ -273,7 +480,8 @@ static void kbase_pm_metrics_active_calc(struct kbase_device *kbdev)
 		struct kbase_jd_atom *katom = kbase_gpu_inspect(kbdev, js, 0);

 		/* Head atom may have just completed, so if it isn't running
-		 * then try the next atom */
+		 * then try the next atom
+		 */
 		if (katom && katom->gpu_rb_state != KBASE_ATOM_GPU_RB_SUBMITTED)
 			katom = kbase_gpu_inspect(kbdev, js, 1);

@@ -296,7 +504,6 @@ static void kbase_pm_metrics_active_calc(struct kbase_device *kbdev)
 		}
 	}
 }
-#endif /* !MALI_USE_CSF */

 /* called when job is submitted to or removed from a GPU slot */
 void kbase_pm_metrics_update(struct kbase_device *kbdev, ktime_t *timestamp)
@@ -313,12 +520,12 @@ void kbase_pm_metrics_update(struct kbase_device *kbdev, ktime_t *timestamp)
 		timestamp = &now;
 	}

-	/* Track how long CL and/or GL jobs have been busy for */
+	/* Track how much of time has been spent busy or idle. For JM GPUs,
+	 * this also evaluates how long CL and/or GL jobs have been busy for.
+	 */
 	kbase_pm_get_dvfs_utilisation_calc(kbdev, *timestamp);

-#if !MALI_USE_CSF
 	kbase_pm_metrics_active_calc(kbdev);
-#endif /* !MALI_USE_CSF */
-
 	spin_unlock_irqrestore(&kbdev->pm.backend.metrics.lock, flags);
 }
+#endif /* !MALI_USE_CSF */
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_policy.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_policy.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2010-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2010-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -28,6 +27,11 @@
 #include <gpu/mali_kbase_gpu_regmap.h>
 #include <mali_kbase_pm.h>
 #include <backend/gpu/mali_kbase_pm_internal.h>
+#include <mali_kbase_reset_gpu.h>
+
+#if MALI_USE_CSF && defined CONFIG_MALI_BIFROST_DEBUG
+#include <csf/mali_kbase_csf_firmware.h>
+#endif

 static const struct kbase_pm_policy *const all_policy_list[] = {
 #ifdef CONFIG_MALI_BIFROST_NO_MALI
@@ -45,11 +49,45 @@ static const struct kbase_pm_policy *const all_policy_list[] = {
 #endif /* CONFIG_MALI_BIFROST_NO_MALI */
 };

+#if MALI_USE_CSF
+void kbase_pm_policy_init(struct kbase_device *kbdev)
+{
+	unsigned long flags;
+	const struct kbase_pm_policy *default_policy = all_policy_list[0];
+
+#if defined CONFIG_MALI_BIFROST_DEBUG
+	/* Use always_on policy if module param fw_debug=1 is
+	 * passed, to aid firmware debugging.
+	 */
+	if (fw_debug)
+		default_policy = &kbase_pm_always_on_policy_ops;
+#endif
+
+
+	default_policy->init(kbdev);
+
+	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+	kbdev->pm.backend.pm_current_policy = default_policy;
+	kbdev->pm.backend.csf_pm_sched_flags =
+				default_policy->pm_sched_flags;
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+}
+#else /* MALI_USE_CSF */
 void kbase_pm_policy_init(struct kbase_device *kbdev)
 {
 	kbdev->pm.backend.pm_current_policy = all_policy_list[0];
+
+#if MALI_USE_CSF && defined CONFIG_MALI_BIFROST_DEBUG
+	/* Use always_on policy if module param fw_debug=1 is
+	 * passed, to aid firmware debugging.
+	 */
+	if (fw_debug)
+		kbdev->pm.backend.pm_current_policy =
+			&kbase_pm_always_on_policy_ops;
+#endif
 	kbdev->pm.backend.pm_current_policy->init(kbdev);
 }
+#endif /* MALI_USE_CSF */

 void kbase_pm_policy_term(struct kbase_device *kbdev)
 {
@@ -102,7 +140,8 @@ void kbase_pm_update_active(struct kbase_device *kbdev)
 		}
 	} else {
 		/* It is an error for the power policy to power off the GPU
-		 * when there are contexts active */
+		 * when there are contexts active
+		 */
 		KBASE_DEBUG_ASSERT(pm->active_count == 0);

 		pm->backend.poweron_required = false;
@@ -126,18 +165,20 @@ void kbase_pm_update_dynamic_cores_onoff(struct kbase_device *kbdev)
 	lockdep_assert_held(&kbdev->hwaccess_lock);
 	lockdep_assert_held(&kbdev->pm.lock);

-#if MALI_USE_CSF
-	/* On CSF GPUs, Host driver isn't supposed to do the power management
-	 * for shader cores. CSF firmware will power up the cores appropriately
-	 * and so from Driver's standpoint 'shaders_desired' flag shall always
-	 * remain 0.
-	 */
-	return;
-#endif
 	if (kbdev->pm.backend.pm_current_policy == NULL)
 		return;
 	if (kbdev->pm.backend.poweroff_wait_in_progress)
 		return;
+
+#if MALI_USE_CSF
+	CSTD_UNUSED(shaders_desired);
+	/* Invoke the MCU state machine to send a request to FW for updating
+	 * the mask of shader cores that can be used for allocation of
+	 * endpoints requested by CSGs.
+	 */
+	if (kbase_pm_is_mcu_desired(kbdev))
+		kbase_pm_update_state(kbdev);
+#else
 	/* In protected transition, don't allow outside shader core request
 	 * affect transition, return directly
 	 */
@@ -149,6 +190,7 @@ void kbase_pm_update_dynamic_cores_onoff(struct kbase_device *kbdev)
 	if (shaders_desired && kbase_pm_is_l2_desired(kbdev)) {
 		kbase_pm_update_state(kbdev);
 	}
+#endif
 }

 void kbase_pm_update_cores_state_nolock(struct kbase_device *kbdev)
@@ -164,7 +206,8 @@ void kbase_pm_update_cores_state_nolock(struct kbase_device *kbdev)

 	if (kbdev->pm.backend.protected_transition_override)
 		/* We are trying to change in/out of protected mode - force all
-		 * cores off so that the L2 powers down */
+		 * cores off so that the L2 powers down
+		 */
 		shaders_desired = false;
 	else
 		shaders_desired = kbdev->pm.backend.pm_current_policy->shaders_needed(kbdev);
@@ -216,20 +259,106 @@ const struct kbase_pm_policy *kbase_pm_get_policy(struct kbase_device *kbdev)

 KBASE_EXPORT_TEST_API(kbase_pm_get_policy);

+#if MALI_USE_CSF
+static int policy_change_wait_for_L2_off(struct kbase_device *kbdev)
+{
+#define WAIT_DURATION_MS (3000)
+	long remaining;
+	long timeout = kbase_csf_timeout_in_jiffies(WAIT_DURATION_MS);
+	int err = 0;
+
+	/* Wait for L2 becoming off, by which the MCU is also implicitly off
+	 * since the L2 state machine would only start its power-down
+	 * sequence when the MCU is in off state. The L2 off is required
+	 * as the tiler may need to be power cycled for MCU reconfiguration
+	 * for host control of shader cores.
+	 */
+#if KERNEL_VERSION(4, 13, 1) <= LINUX_VERSION_CODE
+	remaining = wait_event_killable_timeout(
+		kbdev->pm.backend.gpu_in_desired_state_wait,
+		kbdev->pm.backend.l2_state == KBASE_L2_OFF, timeout);
+#else
+	remaining = wait_event_timeout(
+		kbdev->pm.backend.gpu_in_desired_state_wait,
+		kbdev->pm.backend.l2_state == KBASE_L2_OFF, timeout);
+#endif
+
+	if (!remaining) {
+		err = -ETIMEDOUT;
+	} else if (remaining < 0) {
+		dev_info(kbdev->dev,
+			 "Wait for L2_off got interrupted");
+		err = (int)remaining;
+	}
+
+	dev_dbg(kbdev->dev, "%s: err=%d mcu_state=%d, L2_state=%d\n", __func__,
+		err, kbdev->pm.backend.mcu_state, kbdev->pm.backend.l2_state);
+
+	return err;
+}
+#endif
+
 void kbase_pm_set_policy(struct kbase_device *kbdev,
 				const struct kbase_pm_policy *new_policy)
 {
 	const struct kbase_pm_policy *old_policy;
 	unsigned long flags;
+#if MALI_USE_CSF
+	unsigned int new_policy_csf_pm_sched_flags;
+	bool sched_suspend;
+	bool reset_gpu = false;
+#endif

 	KBASE_DEBUG_ASSERT(kbdev != NULL);
 	KBASE_DEBUG_ASSERT(new_policy != NULL);

 	KBASE_KTRACE_ADD(kbdev, PM_SET_POLICY, NULL, new_policy->id);

+#if MALI_USE_CSF
+	/* Serialize calls on kbase_pm_set_policy() */
+	mutex_lock(&kbdev->pm.backend.policy_change_lock);
+
+	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+	/* policy_change_clamp_state_to_off, when needed, is set/cleared in
+	 * this function, a very limited temporal scope for covering the
+	 * change transition.
+	 */
+	WARN_ON(kbdev->pm.backend.policy_change_clamp_state_to_off);
+	new_policy_csf_pm_sched_flags = new_policy->pm_sched_flags;
+
+	/* Requiring the scheduler PM suspend operation when changes involving
+	 * the always_on policy, reflected by the CSF_DYNAMIC_PM_CORE_KEEP_ON
+	 * flag bit.
+	 */
+	sched_suspend = kbdev->csf.firmware_inited &&
+			(CSF_DYNAMIC_PM_CORE_KEEP_ON &
+			 (new_policy_csf_pm_sched_flags |
+			  kbdev->pm.backend.csf_pm_sched_flags));
+
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+
+	if (sched_suspend)
+		kbase_csf_scheduler_pm_suspend(kbdev);
+
+	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+	/* If the current active policy is always_on, one needs to clamp the
+	 * MCU/L2 for reaching off-state
+	 */
+	if (sched_suspend)
+		kbdev->pm.backend.policy_change_clamp_state_to_off =
+			CSF_DYNAMIC_PM_CORE_KEEP_ON & kbdev->pm.backend.csf_pm_sched_flags;
+
+	kbase_pm_update_state(kbdev);
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+
+	if (sched_suspend)
+		reset_gpu = policy_change_wait_for_L2_off(kbdev);
+#endif
+
 	/* During a policy change we pretend the GPU is active */
 	/* A suspend won't happen here, because we're in a syscall from a
-	 * userspace thread */
+	 * userspace thread
+	 */
 	kbase_pm_context_active(kbdev);

 	kbase_pm_lock(kbdev);
@@ -250,19 +379,40 @@ void kbase_pm_set_policy(struct kbase_device *kbdev,

 	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
 	kbdev->pm.backend.pm_current_policy = new_policy;
+#if MALI_USE_CSF
+	kbdev->pm.backend.csf_pm_sched_flags = new_policy_csf_pm_sched_flags;
+	/* New policy in place, release the clamping on mcu/L2 off state */
+	kbdev->pm.backend.policy_change_clamp_state_to_off = false;
+	kbase_pm_update_state(kbdev);
+#endif
 	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);

 	/* If any core power state changes were previously attempted, but
 	 * couldn't be made because the policy was changing (current_policy was
-	 * NULL), then re-try them here. */
+	 * NULL), then re-try them here.
+	 */
 	kbase_pm_update_active(kbdev);
 	kbase_pm_update_cores_state(kbdev);

 	kbase_pm_unlock(kbdev);

 	/* Now the policy change is finished, we release our fake context active
-	 * reference */
+	 * reference
+	 */
 	kbase_pm_context_idle(kbdev);
+
+#if MALI_USE_CSF
+	/* Reverse the suspension done */
+	if (reset_gpu) {
+		dev_warn(kbdev->dev, "Resorting to GPU reset for policy change\n");
+		if (kbase_prepare_to_reset_gpu(kbdev, RESET_FLAGS_NONE))
+			kbase_reset_gpu(kbdev);
+		kbase_reset_gpu_wait(kbdev);
+	} else if (sched_suspend)
+		kbase_csf_scheduler_pm_resume(kbdev);
+
+	mutex_unlock(&kbdev->pm.backend.policy_change_lock);
+#endif
 }

 KBASE_EXPORT_TEST_API(kbase_pm_set_policy);
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_policy.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_policy.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2010-2015, 2018-2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2010-2015, 2018-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_shader_states.h
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_pm_shader_states.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2018-2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -26,6 +25,41 @@
 * including this header file. This header file can be included multiple
 * times in the same compilation unit with different definitions of
 * KBASEP_SHADER_STATE().
+ *
+ * @OFF_CORESTACK_OFF:                The shaders and core stacks are off
+ * @OFF_CORESTACK_PEND_ON:            The shaders are off, core stacks have been
+ *                                    requested to power on and hwcnt is being
+ *                                    disabled
+ * @PEND_ON_CORESTACK_ON:             Core stacks are on, shaders have been
+ *                                    requested to power on. Or after doing
+ *                                    partial shader on/off, checking whether
+ *                                    it's the desired state.
+ * @ON_CORESTACK_ON:                  The shaders and core stacks are on, and
+ *                                    hwcnt already enabled.
+ * @ON_CORESTACK_ON_RECHECK:          The shaders and core stacks are on, hwcnt
+ *                                    disabled, and checks to powering down or
+ *                                    re-enabling hwcnt.
+ * @WAIT_OFF_CORESTACK_ON:            The shaders have been requested to power
+ *                                    off, but they remain on for the duration
+ *                                    of the hysteresis timer
+ * @WAIT_GPU_IDLE:                    The shaders partial poweroff needs to
+ *                                    reach a state where jobs on the GPU are
+ *                                    finished including jobs currently running
+ *                                    and in the GPU queue because of
+ *                                    GPU2017-861
+ * @WAIT_FINISHED_CORESTACK_ON:       The hysteresis timer has expired
+ * @L2_FLUSHING_CORESTACK_ON:         The core stacks are on and the level 2
+ *                                    cache is being flushed.
+ * @READY_OFF_CORESTACK_ON:           The core stacks are on and the shaders are
+ *                                    ready to be powered off.
+ * @PEND_OFF_CORESTACK_ON:            The core stacks are on, and the shaders
+ *                                    have been requested to power off
+ * @OFF_CORESTACK_PEND_OFF:           The shaders are off, and the core stacks
+ *                                    have been requested to power off
+ * @OFF_CORESTACK_OFF_TIMER_PEND_OFF: Shaders and corestacks are off, but the
+ *                                    tick timer cancellation is still pending.
+ * @RESET_WAIT:                       The GPU is resetting, shader and core
+ *                                    stack power states are unknown
 */
 KBASEP_SHADER_STATE(OFF_CORESTACK_OFF)
 KBASEP_SHADER_STATE(OFF_CORESTACK_PEND_ON)
--- a/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_time.c
+++ b/drivers/gpu/arm/bifrost/backend/gpu/mali_kbase_time.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2014-2016,2018-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2014-2016, 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <mali_kbase.h>
@@ -67,14 +66,47 @@ void kbase_backend_get_gpu_time_norequest(struct kbase_device *kbdev,
 #endif
 }

+#if !MALI_USE_CSF
+/**
+ * timedwait_cycle_count_active() - Timed wait till CYCLE_COUNT_ACTIVE is active
+ *
+ * @kbdev: Kbase device
+ *
+ * Return: true if CYCLE_COUNT_ACTIVE is active within the timeout.
+ */
+static bool timedwait_cycle_count_active(struct kbase_device *kbdev)
+{
+#ifdef CONFIG_MALI_BIFROST_NO_MALI
+	return true;
+#else
+	bool success = false;
+	const unsigned int timeout = 100;
+	const unsigned long remaining = jiffies + msecs_to_jiffies(timeout);
+
+	while (time_is_after_jiffies(remaining)) {
+		if ((kbase_reg_read(kbdev, GPU_CONTROL_REG(GPU_STATUS)) &
+		     GPU_STATUS_CYCLE_COUNT_ACTIVE)) {
+			success = true;
+			break;
+		}
+	}
+	return success;
+#endif
+}
+#endif
+
 void kbase_backend_get_gpu_time(struct kbase_device *kbdev, u64 *cycle_counter,
 				u64 *system_time, struct timespec64 *ts)
 {
 #if !MALI_USE_CSF
 	kbase_pm_request_gpu_cycle_counter(kbdev);
+	WARN_ONCE(kbdev->pm.backend.l2_state != KBASE_L2_ON,
+		  "L2 not powered up");
+	WARN_ONCE((!timedwait_cycle_count_active(kbdev)),
+		  "Timed out on CYCLE_COUNT_ACTIVE");
 #endif
-	kbase_backend_get_gpu_time_norequest(
-		kbdev, cycle_counter, system_time, ts);
+	kbase_backend_get_gpu_time_norequest(kbdev, cycle_counter, system_time,
+					     ts);
 #if !MALI_USE_CSF
 	kbase_pm_release_gpu_cycle_counter(kbdev);
 #endif
--- a/drivers/gpu/arm/bifrost/build.bp
+++ b/drivers/gpu/arm/bifrost/build.bp
@@ -1,15 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2017-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2017-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
- * A copy of the licence is included with the program, and can also be obtained
- * from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
- * Boston, MA 02110-1301, USA.
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
 *
 */

@@ -135,8 +141,11 @@ bob_kernel_module {
    cinstr_secondary_hwc: {
        kbuild_options: ["CONFIG_MALI_BIFROST_PRFCNT_SET_SECONDARY=y"],
    },
-    cinstr_secondary_hwc_via_debug_fs: {
-        kbuild_options: ["CONFIG_MALI_PRFCNT_SET_SECONDARY_VIA_DEBUG_FS=y"],
+    cinstr_tertiary_hwc: {
+        kbuild_options: ["CONFIG_MALI_PRFCNT_SET_TERTIARY=y"],
+    },
+    cinstr_hwc_set_select_via_debug_fs: {
+        kbuild_options: ["CONFIG_MALI_PRFCNT_SET_SELECT_VIA_DEBUG_FS=y"],
    },
    mali_2mb_alloc: {
        kbuild_options: ["CONFIG_MALI_2MB_ALLOC=y"],
@@ -158,14 +167,20 @@ bob_kernel_module {
            "jm/*.h",
            "tl/backend/*_jm.c",
            "mmu/backend/*_jm.c",
+            "ipa/backend/*_jm.c",
+            "ipa/backend/*_jm.h",
        ],
    },
    gpu_has_csf: {
+        kbuild_options: ["CONFIG_MALI_CSF_SUPPORT=y"],
        srcs: [
            "context/backend/*_csf.c",
            "csf/*.c",
            "csf/*.h",
            "csf/Kbuild",
+            "csf/ipa_control/*.c",
+            "csf/ipa_control/*.h",
+            "csf/ipa_control/Kbuild",
            "debug/backend/*_csf.c",
            "debug/backend/*_csf.h",
            "device/backend/*_csf.c",
@@ -173,6 +188,8 @@ bob_kernel_module {
            "gpu/backend/*_csf.h",
            "tl/backend/*_csf.c",
            "mmu/backend/*_csf.c",
+            "ipa/backend/*_csf.c",
+            "ipa/backend/*_csf.h",
        ],
    },
    mali_arbiter_support: {
--- a/drivers/gpu/arm/bifrost/context/backend/mali_kbase_context_csf.c
+++ b/drivers/gpu/arm/bifrost/context/backend/mali_kbase_context_csf.c
@@ -6,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -17,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -28,18 +26,17 @@
 #include <context/mali_kbase_context_internal.h>
 #include <gpu/mali_kbase_gpu_regmap.h>
 #include <mali_kbase.h>
-#include <mali_kbase_ctx_sched.h>
 #include <mali_kbase_dma_fence.h>
 #include <mali_kbase_mem_linux.h>
 #include <mali_kbase_mem_pool_group.h>
 #include <mmu/mali_kbase_mmu.h>
 #include <tl/mali_kbase_timeline.h>
-#include <tl/mali_kbase_tracepoints.h>

 #ifdef CONFIG_DEBUG_FS
 #include <csf/mali_kbase_csf_csg_debugfs.h>
 #include <csf/mali_kbase_csf_kcpu_debugfs.h>
 #include <csf/mali_kbase_csf_tiler_heap_debugfs.h>
+#include <csf/mali_kbase_csf_cpu_queue_debugfs.h>
 #include <mali_kbase_debug_mem_view.h>
 #include <mali_kbase_mem_pool_debugfs.h>

@@ -51,6 +48,7 @@ void kbase_context_debugfs_init(struct kbase_context *const kctx)
 	kbase_csf_queue_group_debugfs_init(kctx);
 	kbase_csf_kcpu_debugfs_init(kctx);
 	kbase_csf_tiler_heap_debugfs_init(kctx);
+	kbase_csf_cpu_queue_debugfs_init(kctx);
 }
 KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);

@@ -73,24 +71,34 @@ void kbase_context_debugfs_term(struct kbase_context *const kctx)
 KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
 #endif /* CONFIG_DEBUG_FS */

+static void kbase_context_free(struct kbase_context *kctx)
+{
+	kbase_timeline_post_kbase_context_destroy(kctx);
+
+	vfree(kctx);
+}
+
 static const struct kbase_context_init context_init[] = {
-	{kbase_context_common_init, kbase_context_common_term, NULL},
-	{kbase_context_mem_pool_group_init, kbase_context_mem_pool_group_term,
-			"Memory pool goup initialization failed"},
-	{kbase_mem_evictable_init, kbase_mem_evictable_deinit,
-			"Memory evictable initialization failed"},
-	{kbase_context_mmu_init, kbase_context_mmu_term,
-			"MMU initialization failed"},
-	{kbase_context_mem_alloc_page, kbase_context_mem_pool_free,
-			"Memory alloc page failed"},
-	{kbase_region_tracker_init, kbase_region_tracker_term,
-			"Region tracker initialization failed"},
-	{kbase_sticky_resource_init, kbase_context_sticky_resource_term,
-			"Sticky resource initialization failed"},
-	{kbase_jit_init, kbase_jit_term,
-			"JIT initialization failed"},
-	{kbase_csf_ctx_init, kbase_csf_ctx_term,
-			"CSF context initialization failed"},
+	{ NULL, kbase_context_free, NULL },
+	{ kbase_context_common_init, kbase_context_common_term,
+	  "Common context initialization failed" },
+	{ kbase_context_mem_pool_group_init, kbase_context_mem_pool_group_term,
+	  "Memory pool group initialization failed" },
+	{ kbase_mem_evictable_init, kbase_mem_evictable_deinit,
+	  "Memory evictable initialization failed" },
+	{ kbase_context_mmu_init, kbase_context_mmu_term,
+	  "MMU initialization failed" },
+	{ kbase_context_mem_alloc_page, kbase_context_mem_pool_free,
+	  "Memory alloc page failed" },
+	{ kbase_region_tracker_init, kbase_region_tracker_term,
+	  "Region tracker initialization failed" },
+	{ kbase_sticky_resource_init, kbase_context_sticky_resource_term,
+	  "Sticky resource initialization failed" },
+	{ kbase_jit_init, kbase_jit_term, "JIT initialization failed" },
+	{ kbase_csf_ctx_init, kbase_csf_ctx_term,
+	  "CSF context initialization failed" },
+	{ kbase_context_add_to_dev_list, kbase_context_remove_from_dev_list,
+	  "Adding kctx to device failed" },
 };

 static void kbase_context_term_partial(
@@ -134,14 +142,23 @@ struct kbase_context *kbase_create_context(struct kbase_device *kbdev,
 #if defined(CONFIG_64BIT)
 	else
 		kbase_ctx_flag_set(kctx, KCTX_FORCE_SAME_VA);
-#endif /* !defined(CONFIG_64BIT) */
+#endif /* defined(CONFIG_64BIT) */

 	for (i = 0; i < ARRAY_SIZE(context_init); i++) {
-		int err = context_init[i].init(kctx);
+		int err = 0;
+
+		if (context_init[i].init)
+			err = context_init[i].init(kctx);

 		if (err) {
 			dev_err(kbdev->dev, "%s error = %d\n",
 						context_init[i].err_mes, err);
+
+			/* kctx should be freed by kbase_context_free().
+			 * Otherwise it will result in memory leak.
+			 */
+			WARN_ON(i == 0);
+
 			kbase_context_term_partial(kctx, i);
 			return NULL;
 		}
@@ -162,11 +179,18 @@ void kbase_destroy_context(struct kbase_context *kctx)
 	if (WARN_ON(!kbdev))
 		return;

-	/* Ensure the core is powered up for the destroy process
-	 * A suspend won't happen here, because we're in a syscall
-	 * from a userspace thread.
+	/* Context termination could happen whilst the system suspend of
+	 * the GPU device is ongoing or has completed. It has been seen on
+	 * Customer side that a hang could occur if context termination is
+	 * not blocked until the resume of GPU device.
 	 */
-	kbase_pm_context_active(kbdev);
+	while (kbase_pm_context_active_handle_suspend(
+		kbdev, KBASE_PM_SUSPEND_HANDLER_DONT_INCREASE)) {
+		dev_info(kbdev->dev,
+			 "Suspend in progress when destroying context");
+		wait_event(kbdev->pm.resume_wait,
+			   !kbase_pm_is_suspending(kbdev));
+	}

 	kbase_mem_pool_group_mark_dying(&kctx->mem_pools);

--- a/drivers/gpu/arm/bifrost/context/backend/mali_kbase_context_jm.c
+++ b/drivers/gpu/arm/bifrost/context/backend/mali_kbase_context_jm.c
@@ -1,12 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -17,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -35,7 +33,6 @@
 #include <mali_kbase_mem_pool_group.h>
 #include <mmu/mali_kbase_mmu.h>
 #include <tl/mali_kbase_timeline.h>
-#include <tl/mali_kbase_tracepoints.h>

 #ifdef CONFIG_DEBUG_FS
 #include <mali_kbase_debug_mem_view.h>
@@ -47,14 +44,12 @@ void kbase_context_debugfs_init(struct kbase_context *const kctx)
 	kbase_mem_pool_debugfs_init(kctx->kctx_dentry, kctx);
 	kbase_jit_debugfs_init(kctx);
 	kbasep_jd_debugfs_ctx_init(kctx);
-	kbase_debug_job_fault_context_init(kctx);
 }
 KBASE_EXPORT_SYMBOL(kbase_context_debugfs_init);

 void kbase_context_debugfs_term(struct kbase_context *const kctx)
 {
 	debugfs_remove_recursive(kctx->kctx_dentry);
-	kbase_debug_job_fault_context_term(kctx);
 }
 KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);
 #else
@@ -73,12 +68,7 @@ KBASE_EXPORT_SYMBOL(kbase_context_debugfs_term);

 static int kbase_context_kbase_kinstr_jm_init(struct kbase_context *kctx)
 {
-	int ret = kbase_kinstr_jm_init(&kctx->kinstr_jm);
-
-	if (!ret)
-		return ret;
-
-	return 0;
+	return kbase_kinstr_jm_init(&kctx->kinstr_jm);
 }

 static void kbase_context_kbase_kinstr_jm_term(struct kbase_context *kctx)
@@ -114,12 +104,27 @@ static int kbase_context_submit_check(struct kbase_context *kctx)
 	return 0;
 }

+static void kbase_context_flush_jobs(struct kbase_context *kctx)
+{
+	kbase_jd_zap_context(kctx);
+	flush_workqueue(kctx->jctx.job_done_wq);
+}
+
+static void kbase_context_free(struct kbase_context *kctx)
+{
+	kbase_timeline_post_kbase_context_destroy(kctx);
+
+	vfree(kctx);
+}
+
 static const struct kbase_context_init context_init[] = {
-	{ kbase_context_common_init, kbase_context_common_term, NULL },
+	{ NULL, kbase_context_free, NULL },
+	{ kbase_context_common_init, kbase_context_common_term,
+	  "Common context initialization failed" },
 	{ kbase_dma_fence_init, kbase_dma_fence_term,
 	  "DMA fence initialization failed" },
 	{ kbase_context_mem_pool_group_init, kbase_context_mem_pool_group_term,
-	  "Memory pool goup initialization failed" },
+	  "Memory pool group initialization failed" },
 	{ kbase_mem_evictable_init, kbase_mem_evictable_deinit,
 	  "Memory evictable initialization failed" },
 	{ kbase_context_mmu_init, kbase_context_mmu_term,
@@ -134,13 +139,22 @@ static const struct kbase_context_init context_init[] = {
 	{ kbase_context_kbase_kinstr_jm_init,
 	  kbase_context_kbase_kinstr_jm_term,
 	  "JM instrumentation initialization failed" },
-	{ kbase_context_kbase_timer_setup, NULL, NULL },
+	{ kbase_context_kbase_timer_setup, NULL,
+	  "Timers initialization failed" },
 	{ kbase_event_init, kbase_event_cleanup,
 	  "Event initialization failed" },
 	{ kbasep_js_kctx_init, kbasep_js_kctx_term,
 	  "JS kctx initialization failed" },
 	{ kbase_jd_init, kbase_jd_exit, "JD initialization failed" },
-	{ kbase_context_submit_check, NULL, NULL },
+	{ kbase_context_submit_check, NULL, "Enabling job submission failed" },
+#ifdef CONFIG_DEBUG_FS
+	{ kbase_debug_job_fault_context_init,
+	  kbase_debug_job_fault_context_term,
+	  "Job fault context initialization failed" },
+#endif
+	{ NULL, kbase_context_flush_jobs, NULL },
+	{ kbase_context_add_to_dev_list, kbase_context_remove_from_dev_list,
+	  "Adding kctx to device failed" },
 };

 static void kbase_context_term_partial(
@@ -184,14 +198,23 @@ struct kbase_context *kbase_create_context(struct kbase_device *kbdev,
 #if defined(CONFIG_64BIT)
 	else
 		kbase_ctx_flag_set(kctx, KCTX_FORCE_SAME_VA);
-#endif /* !defined(CONFIG_64BIT) */
+#endif /* defined(CONFIG_64BIT) */

 	for (i = 0; i < ARRAY_SIZE(context_init); i++) {
-		int err = context_init[i].init(kctx);
+		int err = 0;
+
+		if (context_init[i].init)
+			err = context_init[i].init(kctx);

 		if (err) {
 			dev_err(kbdev->dev, "%s error = %d\n",
 						context_init[i].err_mes, err);
+
+			/* kctx should be freed by kbase_context_free().
+			 * Otherwise it will result in memory leak.
+			 */
+			WARN_ON(i == 0);
+
 			kbase_context_term_partial(kctx, i);
 			return NULL;
 		}
@@ -212,17 +235,27 @@ void kbase_destroy_context(struct kbase_context *kctx)
 	if (WARN_ON(!kbdev))
 		return;

-	/* Ensure the core is powered up for the destroy process
-	 * A suspend won't happen here, because we're in a syscall
-	 * from a userspace thread.
+	/* Context termination could happen whilst the system suspend of
+	 * the GPU device is ongoing or has completed. It has been seen on
+	 * Customer side that a hang could occur if context termination is
+	 * not blocked until the resume of GPU device.
 	 */
-	kbase_pm_context_active(kbdev);
+#ifdef CONFIG_MALI_ARBITER_SUPPORT
+	atomic_inc(&kbdev->pm.gpu_users_waiting);
+#endif /* CONFIG_MALI_ARBITER_SUPPORT */
+	while (kbase_pm_context_active_handle_suspend(
+		kbdev, KBASE_PM_SUSPEND_HANDLER_DONT_INCREASE)) {
+		dev_dbg(kbdev->dev,
+			 "Suspend in progress when destroying context");
+		wait_event(kbdev->pm.resume_wait,
+			   !kbase_pm_is_suspending(kbdev));
+	}
+#ifdef CONFIG_MALI_ARBITER_SUPPORT
+	atomic_dec(&kbdev->pm.gpu_users_waiting);
+#endif /* CONFIG_MALI_ARBITER_SUPPORT */

 	kbase_mem_pool_group_mark_dying(&kctx->mem_pools);

-	kbase_jd_zap_context(kctx);
-	flush_workqueue(kctx->jctx.job_done_wq);
-
 	kbase_context_term_partial(kctx, ARRAY_SIZE(context_init));

 	kbase_pm_context_idle(kbdev);
--- a/drivers/gpu/arm/bifrost/context/mali_kbase_context.c
+++ b/drivers/gpu/arm/bifrost/context/mali_kbase_context.c
@@ -6,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -17,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 /*
@@ -28,10 +26,8 @@
 #include <mali_kbase.h>
 #include <gpu/mali_kbase_gpu_regmap.h>
 #include <mali_kbase_mem_linux.h>
-#include <mali_kbase_dma_fence.h>
 #include <mali_kbase_ctx_sched.h>
 #include <mali_kbase_mem_pool_group.h>
-#include <tl/mali_kbase_tracepoints.h>
 #include <tl/mali_kbase_timeline.h>
 #include <mmu/mali_kbase_mmu.h>
 #include <context/mali_kbase_context_internal.h>
@@ -170,22 +166,49 @@ int kbase_context_common_init(struct kbase_context *kctx)
 	mutex_init(&kctx->legacy_hwcnt_lock);

 	mutex_lock(&kctx->kbdev->kctx_list_lock);
-	list_add(&kctx->kctx_list_link, &kctx->kbdev->kctx_list);

 	err = kbase_insert_kctx_to_process(kctx);
 	if (err)
 		dev_err(kctx->kbdev->dev,
 		"(err:%d) failed to insert kctx to kbase_process\n", err);

-	KBASE_TLSTREAM_TL_KBASE_NEW_CTX(kctx->kbdev, kctx->id,
-		kctx->kbdev->gpu_props.props.raw_props.gpu_id);
-	KBASE_TLSTREAM_TL_NEW_CTX(kctx->kbdev, kctx, kctx->id,
-			(u32)(kctx->tgid));
 	mutex_unlock(&kctx->kbdev->kctx_list_lock);

 	return err;
 }

+int kbase_context_add_to_dev_list(struct kbase_context *kctx)
+{
+	if (WARN_ON(!kctx))
+		return -EINVAL;
+
+	if (WARN_ON(!kctx->kbdev))
+		return -EINVAL;
+
+	mutex_lock(&kctx->kbdev->kctx_list_lock);
+	list_add(&kctx->kctx_list_link, &kctx->kbdev->kctx_list);
+	mutex_unlock(&kctx->kbdev->kctx_list_lock);
+
+	kbase_timeline_post_kbase_context_create(kctx);
+
+	return 0;
+}
+
+void kbase_context_remove_from_dev_list(struct kbase_context *kctx)
+{
+	if (WARN_ON(!kctx))
+		return;
+
+	if (WARN_ON(!kctx->kbdev))
+		return;
+
+	kbase_timeline_pre_kbase_context_destroy(kctx);
+
+	mutex_lock(&kctx->kbdev->kctx_list_lock);
+	list_del_init(&kctx->kctx_list_link);
+	mutex_unlock(&kctx->kbdev->kctx_list_lock);
+}
+
 /**
 * kbase_remove_kctx_from_process - remove a terminating context from
 *                                    the process list.
@@ -238,24 +261,9 @@ void kbase_context_common_term(struct kbase_context *kctx)

 	mutex_lock(&kctx->kbdev->kctx_list_lock);
 	kbase_remove_kctx_from_process(kctx);
-
-	KBASE_TLSTREAM_TL_KBASE_DEL_CTX(kctx->kbdev, kctx->id);
-
-	KBASE_TLSTREAM_TL_DEL_CTX(kctx->kbdev, kctx);
-	list_del(&kctx->kctx_list_link);
 	mutex_unlock(&kctx->kbdev->kctx_list_lock);

 	KBASE_KTRACE_ADD(kctx->kbdev, CORE_CTX_DESTROY, kctx, 0u);
-
-	/* Flush the timeline stream, so the user can see the termination
-	 * tracepoints being fired.
-	 * The "if" statement below is for optimization. It is safe to call
-	 * kbase_timeline_streams_flush when timeline is disabled.
-	 */
-	if (atomic_read(&kctx->kbdev->timeline_flags) != 0)
-		kbase_timeline_streams_flush(kctx->kbdev->timeline);
-
-	vfree(kctx);
 }

 int kbase_context_mem_pool_group_init(struct kbase_context *kctx)
@@ -273,11 +281,9 @@ void kbase_context_mem_pool_group_term(struct kbase_context *kctx)

 int kbase_context_mmu_init(struct kbase_context *kctx)
 {
-	kbase_mmu_init(kctx->kbdev,
-		&kctx->mmu, kctx,
+	return kbase_mmu_init(
+		kctx->kbdev, &kctx->mmu, kctx,
 		base_context_mmu_group_id_get(kctx->create_flags));
-
-	return 0;
 }

 void kbase_context_mmu_term(struct kbase_context *kctx)
--- a/drivers/gpu/arm/bifrost/context/mali_kbase_context.h
+++ b/drivers/gpu/arm/bifrost/context/mali_kbase_context.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2011-2017, 2019-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,18 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
- *//* SPDX-License-Identifier: GPL-2.0 */
-/*
- *
- * (C) COPYRIGHT 2011-2017, 2019 ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
- *
 */

 #ifndef _KBASE_CONTEXT_H_
@@ -117,25 +106,7 @@ static inline bool kbase_ctx_flag(struct kbase_context *kctx,
 static inline void kbase_ctx_flag_clear(struct kbase_context *kctx,
 					enum kbase_context_flags flag)
 {
-#if KERNEL_VERSION(4, 3, 0) > LINUX_VERSION_CODE
-	/*
-	 * Earlier kernel versions doesn't have atomic_andnot() or
-	 * atomic_and(). atomic_clear_mask() was only available on some
-	 * architectures and removed on arm in v3.13 on arm and arm64.
-	 *
-	 * Use a compare-exchange loop to clear the flag on pre 4.3 kernels,
-	 * when atomic_andnot() becomes available.
-	 */
-	int old, new;
-
-	do {
-		old = atomic_read(&kctx->flags);
-		new = old & ~flag;
-
-	} while (atomic_cmpxchg(&kctx->flags, old, new) != old);
-#else
 	atomic_andnot(flag, &kctx->flags);
-#endif
 }

 /**
--- a/drivers/gpu/arm/bifrost/context/mali_kbase_context_internal.h
+++ b/drivers/gpu/arm/bifrost/context/mali_kbase_context_internal.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,16 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
- *//* SPDX-License-Identifier: GPL-2.0 */
-/*
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
- *
- * This program is free software and is provided to you under the terms of the
- * GNU General Public License version 2 as published by the Free Software
- * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
 */

 #include <mali_kbase.h>
@@ -58,3 +49,6 @@ int kbase_context_mem_alloc_page(struct kbase_context *kctx);
 void kbase_context_mem_pool_free(struct kbase_context *kctx);

 void kbase_context_sticky_resource_term(struct kbase_context *kctx);
+
+int kbase_context_add_to_dev_list(struct kbase_context *kctx);
+void kbase_context_remove_from_dev_list(struct kbase_context *kctx);
--- a/drivers/gpu/arm/bifrost/csf/Kbuild
+++ b/drivers/gpu/arm/bifrost/csf/Kbuild
@@ -1,10 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
 #
 # (C) COPYRIGHT 2018-2020 ARM Limited. All rights reserved.
 #
 # This program is free software and is provided to you under the terms of the
 # GNU General Public License version 2 as published by the Free Software
 # Foundation, and any use by you of this program is subject to the terms
-# of such GNU licence.
+# of such GNU license.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -15,11 +16,9 @@
 # along with this program; if not, you can access it online at
 # http://www.gnu.org/licenses/gpl-2.0.html.
 #
-# SPDX-License-Identifier: GPL-2.0
-#
 #

-mali_kbase-y += \
+bifrost_kbase-y += \
 	csf/mali_kbase_csf_firmware_cfg.o \
 	csf/mali_kbase_csf_trace_buffer.o \
 	csf/mali_kbase_csf.o \
@@ -33,8 +32,11 @@ mali_kbase-y += \
 	csf/mali_kbase_csf_csg_debugfs.o \
 	csf/mali_kbase_csf_kcpu_debugfs.o \
 	csf/mali_kbase_csf_protected_memory.o \
-	csf/mali_kbase_csf_tiler_heap_debugfs.o
+	csf/mali_kbase_csf_tiler_heap_debugfs.o \
+	csf/mali_kbase_csf_cpu_queue_debugfs.o

-mali_kbase-$(CONFIG_MALI_REAL_HW) += csf/mali_kbase_csf_firmware.o
+bifrost_kbase-$(CONFIG_MALI_REAL_HW) += csf/mali_kbase_csf_firmware.o

-mali_kbase-$(CONFIG_MALI_BIFROST_NO_MALI) += csf/mali_kbase_csf_firmware_no_mali.o
+bifrost_kbase-$(CONFIG_MALI_BIFROST_NO_MALI) += csf/mali_kbase_csf_firmware_no_mali.o
+
+include $(src)/csf/ipa_control/Kbuild
--- a/drivers/gpu/arm/bifrost/csf/ipa_control/Kbuild
+++ b/drivers/gpu/arm/bifrost/csf/ipa_control/Kbuild
@@ -0,0 +1,22 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+#
+# This program is free software and is provided to you under the terms of the
+# GNU General Public License version 2 as published by the Free Software
+# Foundation, and any use by you of this program is subject to the terms
+# of such GNU license.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+#
+
+bifrost_kbase-y += \
+	csf/ipa_control/mali_kbase_csf_ipa_control.o
--- a/drivers/gpu/arm/bifrost/csf/ipa_control/mali_kbase_csf_ipa_control.c
+++ b/drivers/gpu/arm/bifrost/csf/ipa_control/mali_kbase_csf_ipa_control.c
@@ -0,0 +1,925 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the
+ * GNU General Public License version 2 as published by the Free Software
+ * Foundation, and any use by you of this program is subject to the terms
+ * of such GNU license.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ */
+
+#include <mali_kbase.h>
+#include "mali_kbase_clk_rate_trace_mgr.h"
+#include "mali_kbase_csf_ipa_control.h"
+
+/*
+ * Status flags from the STATUS register of the IPA Control interface.
+ */
+#define STATUS_COMMAND_ACTIVE ((u32)1 << 0)
+#define STATUS_TIMER_ACTIVE ((u32)1 << 1)
+#define STATUS_AUTO_ACTIVE ((u32)1 << 2)
+#define STATUS_PROTECTED_MODE ((u32)1 << 8)
+#define STATUS_RESET ((u32)1 << 9)
+#define STATUS_TIMER_ENABLED ((u32)1 << 31)
+
+/*
+ * Commands for the COMMAND register of the IPA Control interface.
+ */
+#define COMMAND_NOP ((u32)0)
+#define COMMAND_APPLY ((u32)1)
+#define COMMAND_CLEAR ((u32)2)
+#define COMMAND_SAMPLE ((u32)3)
+#define COMMAND_PROTECTED_ACK ((u32)4)
+#define COMMAND_RESET_ACK ((u32)5)
+
+/**
+ * Default value for the TIMER register of the IPA Control interface,
+ * expressed in milliseconds.
+ *
+ * The chosen value is a trade off between two requirements: the IPA Control
+ * interface should sample counters with a resolution in the order of
+ * milliseconds, while keeping GPU overhead as limited as possible.
+ */
+#define TIMER_DEFAULT_VALUE_MS ((u32)10) /* 10 milliseconds */
+
+/**
+ * Number of timer events per second.
+ */
+#define TIMER_EVENTS_PER_SECOND ((u32)1000 / TIMER_DEFAULT_VALUE_MS)
+
+/**
+ * Maximum number of loops polling the GPU before we assume the GPU has hung.
+ */
+#define IPA_INACTIVE_MAX_LOOPS ((unsigned int)8000000)
+
+/**
+ * Number of bits used to configure a performance counter in SELECT registers.
+ */
+#define IPA_CONTROL_SELECT_BITS_PER_CNT ((u64)8)
+
+/**
+ * Maximum value of a performance counter.
+ */
+#define MAX_PRFCNT_VALUE (((u64)1 << 48) - 1)
+
+/**
+ * struct kbase_ipa_control_listener_data - Data for the GPU clock frequency
+ *                                          listener
+ *
+ * @listener: GPU clock frequency listener.
+ * @kbdev:    Pointer to kbase device.
+ */
+struct kbase_ipa_control_listener_data {
+	struct kbase_clk_rate_listener listener;
+	struct kbase_device *kbdev;
+};
+
+static u32 timer_value(u32 gpu_rate)
+{
+	return gpu_rate / TIMER_EVENTS_PER_SECOND;
+}
+
+static int wait_status(struct kbase_device *kbdev, u32 flags)
+{
+	unsigned int max_loops = IPA_INACTIVE_MAX_LOOPS;
+	u32 status = kbase_reg_read(kbdev, IPA_CONTROL_REG(STATUS));
+
+	/*
+	 * Wait for the STATUS register to indicate that flags have been
+	 * cleared, in case a transition is pending.
+	 */
+	while (--max_loops && (status & flags))
+		status = kbase_reg_read(kbdev, IPA_CONTROL_REG(STATUS));
+	if (max_loops == 0) {
+		dev_err(kbdev->dev, "IPA_CONTROL STATUS register stuck");
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static int apply_select_config(struct kbase_device *kbdev, u64 *select)
+{
+	int ret;
+
+	u32 select_cshw_lo = (u32)(select[KBASE_IPA_CORE_TYPE_CSHW] & U32_MAX);
+	u32 select_cshw_hi =
+		(u32)((select[KBASE_IPA_CORE_TYPE_CSHW] >> 32) & U32_MAX);
+	u32 select_memsys_lo =
+		(u32)(select[KBASE_IPA_CORE_TYPE_MEMSYS] & U32_MAX);
+	u32 select_memsys_hi =
+		(u32)((select[KBASE_IPA_CORE_TYPE_MEMSYS] >> 32) & U32_MAX);
+	u32 select_tiler_lo =
+		(u32)(select[KBASE_IPA_CORE_TYPE_TILER] & U32_MAX);
+	u32 select_tiler_hi =
+		(u32)((select[KBASE_IPA_CORE_TYPE_TILER] >> 32) & U32_MAX);
+	u32 select_shader_lo =
+		(u32)(select[KBASE_IPA_CORE_TYPE_SHADER] & U32_MAX);
+	u32 select_shader_hi =
+		(u32)((select[KBASE_IPA_CORE_TYPE_SHADER] >> 32) & U32_MAX);
+
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_CSHW_LO), select_cshw_lo);
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_CSHW_HI), select_cshw_hi);
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_MEMSYS_LO),
+			select_memsys_lo);
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_MEMSYS_HI),
+			select_memsys_hi);
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_TILER_LO),
+			select_tiler_lo);
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_TILER_HI),
+			select_tiler_hi);
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_SHADER_LO),
+			select_shader_lo);
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(SELECT_SHADER_HI),
+			select_shader_hi);
+
+	ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
+
+	if (!ret)
+		kbase_reg_write(kbdev, IPA_CONTROL_REG(COMMAND), COMMAND_APPLY);
+
+	return ret;
+}
+
+static u64 read_value_cnt(struct kbase_device *kbdev, u8 type, int select_idx)
+{
+	u32 value_lo, value_hi;
+
+	switch (type) {
+	case KBASE_IPA_CORE_TYPE_CSHW:
+		value_lo = kbase_reg_read(
+			kbdev, IPA_CONTROL_REG(VALUE_CSHW_REG_LO(select_idx)));
+		value_hi = kbase_reg_read(
+			kbdev, IPA_CONTROL_REG(VALUE_CSHW_REG_HI(select_idx)));
+		break;
+	case KBASE_IPA_CORE_TYPE_MEMSYS:
+		value_lo = kbase_reg_read(
+			kbdev,
+			IPA_CONTROL_REG(VALUE_MEMSYS_REG_LO(select_idx)));
+		value_hi = kbase_reg_read(
+			kbdev,
+			IPA_CONTROL_REG(VALUE_MEMSYS_REG_HI(select_idx)));
+		break;
+	case KBASE_IPA_CORE_TYPE_TILER:
+		value_lo = kbase_reg_read(
+			kbdev, IPA_CONTROL_REG(VALUE_TILER_REG_LO(select_idx)));
+		value_hi = kbase_reg_read(
+			kbdev, IPA_CONTROL_REG(VALUE_TILER_REG_HI(select_idx)));
+		break;
+	case KBASE_IPA_CORE_TYPE_SHADER:
+		value_lo = kbase_reg_read(
+			kbdev,
+			IPA_CONTROL_REG(VALUE_SHADER_REG_LO(select_idx)));
+		value_hi = kbase_reg_read(
+			kbdev,
+			IPA_CONTROL_REG(VALUE_SHADER_REG_HI(select_idx)));
+		break;
+	default:
+		WARN(1, "Unknown core type: %u\n", type);
+		value_lo = value_hi = 0;
+		break;
+	}
+
+	return (((u64)value_hi << 32) | value_lo);
+}
+
+static void build_select_config(struct kbase_ipa_control *ipa_ctrl,
+				u64 *select_config)
+{
+	size_t i;
+
+	for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++) {
+		size_t j;
+
+		select_config[i] = 0ULL;
+
+		for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
+			struct kbase_ipa_control_prfcnt_config *prfcnt_config =
+				&ipa_ctrl->blocks[i].select[j];
+
+			select_config[i] |=
+				((u64)prfcnt_config->idx
+				 << (IPA_CONTROL_SELECT_BITS_PER_CNT * j));
+		}
+	}
+}
+
+static inline void calc_prfcnt_delta(struct kbase_device *kbdev,
+				     struct kbase_ipa_control_prfcnt *prfcnt,
+				     bool gpu_ready)
+{
+	u64 delta_value, raw_value;
+
+	if (gpu_ready)
+		raw_value = read_value_cnt(kbdev, (u8)prfcnt->type,
+					   prfcnt->select_idx);
+	else
+		raw_value = prfcnt->latest_raw_value;
+
+	if (raw_value < prfcnt->latest_raw_value) {
+		delta_value = (MAX_PRFCNT_VALUE - prfcnt->latest_raw_value) +
+			      raw_value;
+	} else {
+		delta_value = raw_value - prfcnt->latest_raw_value;
+	}
+
+	delta_value *= prfcnt->scaling_factor;
+
+	if (!WARN_ON_ONCE(kbdev->csf.ipa_control.cur_gpu_rate == 0))
+		if (prfcnt->gpu_norm)
+			delta_value /= kbdev->csf.ipa_control.cur_gpu_rate;
+
+	prfcnt->latest_raw_value = raw_value;
+
+	/* Accumulate the difference */
+	prfcnt->accumulated_diff += delta_value;
+}
+
+/**
+ * kbase_ipa_control_rate_change_notify - GPU frequency change callback
+ *
+ * @listener:     Clock frequency change listener.
+ * @clk_index:    Index of the clock for which the change has occurred.
+ * @clk_rate_hz:  Clock frequency(Hz).
+ *
+ * This callback notifies kbase_ipa_control about GPU frequency changes.
+ * Only top-level clock changes are meaningful. GPU frequency updates
+ * affect all performance counters which require GPU normalization
+ * in every session.
+ */
+static void
+kbase_ipa_control_rate_change_notify(struct kbase_clk_rate_listener *listener,
+				     u32 clk_index, u32 clk_rate_hz)
+{
+	if ((clk_index == KBASE_CLOCK_DOMAIN_TOP) && (clk_rate_hz != 0)) {
+		size_t i;
+		unsigned long flags;
+		struct kbase_ipa_control_listener_data *listener_data =
+			container_of(listener,
+				     struct kbase_ipa_control_listener_data,
+				     listener);
+		struct kbase_device *kbdev = listener_data->kbdev;
+		struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+
+		spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+
+		if (!kbdev->pm.backend.gpu_ready) {
+			dev_err(kbdev->dev,
+				"%s: GPU frequency cannot change while GPU is off",
+				__func__);
+			spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+			return;
+		}
+
+		/* Interrupts are already disabled and interrupt state is also saved */
+		spin_lock(&ipa_ctrl->lock);
+
+		for (i = 0; i < ipa_ctrl->num_active_sessions; i++) {
+			size_t j;
+			struct kbase_ipa_control_session *session = &ipa_ctrl->sessions[i];
+
+			for (j = 0; j < session->num_prfcnts; j++) {
+				struct kbase_ipa_control_prfcnt *prfcnt =
+					&session->prfcnts[j];
+
+				if (prfcnt->gpu_norm)
+					calc_prfcnt_delta(kbdev, prfcnt, true);
+			 }
+		}
+
+		ipa_ctrl->cur_gpu_rate = clk_rate_hz;
+
+		/* Update the timer for automatic sampling if active sessions
+		 * are present. Counters have already been manually sampled.
+		 */
+		if (ipa_ctrl->num_active_sessions > 0) {
+			kbase_reg_write(kbdev, IPA_CONTROL_REG(TIMER),
+					timer_value(ipa_ctrl->cur_gpu_rate));
+		}
+
+		spin_unlock(&ipa_ctrl->lock);
+
+		spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+	}
+}
+
+void kbase_ipa_control_init(struct kbase_device *kbdev)
+{
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+	struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
+	struct kbase_ipa_control_listener_data *listener_data;
+	size_t i, j;
+
+	for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++) {
+		for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
+			ipa_ctrl->blocks[i].select[j].idx = 0;
+			ipa_ctrl->blocks[i].select[j].refcount = 0;
+		}
+		ipa_ctrl->blocks[i].num_available_counters =
+			KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS;
+	}
+
+	spin_lock_init(&ipa_ctrl->lock);
+	ipa_ctrl->num_active_sessions = 0;
+	for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
+		ipa_ctrl->sessions[i].active = false;
+	}
+
+	listener_data = kmalloc(sizeof(struct kbase_ipa_control_listener_data),
+				GFP_KERNEL);
+	if (listener_data) {
+		listener_data->listener.notify =
+			kbase_ipa_control_rate_change_notify;
+		listener_data->kbdev = kbdev;
+		ipa_ctrl->rtm_listener_data = listener_data;
+	}
+
+	spin_lock(&clk_rtm->lock);
+	if (clk_rtm->clks[KBASE_CLOCK_DOMAIN_TOP])
+		ipa_ctrl->cur_gpu_rate =
+			clk_rtm->clks[KBASE_CLOCK_DOMAIN_TOP]->clock_val;
+	if (listener_data)
+		kbase_clk_rate_trace_manager_subscribe_no_lock(
+			clk_rtm, &listener_data->listener);
+	spin_unlock(&clk_rtm->lock);
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_init);
+
+void kbase_ipa_control_term(struct kbase_device *kbdev)
+{
+	unsigned long flags;
+	struct kbase_clk_rate_trace_manager *clk_rtm = &kbdev->pm.clk_rtm;
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+	struct kbase_ipa_control_listener_data *listener_data =
+		ipa_ctrl->rtm_listener_data;
+
+	WARN_ON(ipa_ctrl->num_active_sessions);
+
+	if (listener_data)
+		kbase_clk_rate_trace_manager_unsubscribe(clk_rtm, &listener_data->listener);
+	kfree(ipa_ctrl->rtm_listener_data);
+
+	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+	if (kbdev->pm.backend.gpu_powered)
+		kbase_reg_write(kbdev, IPA_CONTROL_REG(TIMER), 0);
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_term);
+
+int kbase_ipa_control_register(
+	struct kbase_device *kbdev,
+	const struct kbase_ipa_control_perf_counter *perf_counters,
+	size_t num_counters, void **client)
+{
+	int ret = 0;
+	size_t i, session_idx, req_counters[KBASE_IPA_CORE_TYPE_NUM];
+	bool already_configured[KBASE_IPA_CONTROL_MAX_COUNTERS];
+	bool new_config = false;
+	struct kbase_ipa_control *ipa_ctrl;
+	struct kbase_ipa_control_session *session = NULL;
+	unsigned long flags;
+
+	if (WARN_ON(kbdev == NULL) || WARN_ON(perf_counters == NULL) ||
+	    WARN_ON(client == NULL) ||
+	    WARN_ON(num_counters > KBASE_IPA_CONTROL_MAX_COUNTERS)) {
+		dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
+		return -EINVAL;
+	}
+
+	kbase_pm_context_active(kbdev);
+
+	ipa_ctrl = &kbdev->csf.ipa_control;
+	spin_lock_irqsave(&ipa_ctrl->lock, flags);
+
+	if (ipa_ctrl->num_active_sessions == KBASE_IPA_CONTROL_MAX_SESSIONS) {
+		dev_err(kbdev->dev, "%s: too many sessions", __func__);
+		ret = -EBUSY;
+		goto exit;
+	}
+
+	for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++)
+		req_counters[i] = 0;
+
+	/*
+	 * Count how many counters would need to be configured in order to
+	 * satisfy the request. Requested counters which happen to be already
+	 * configured can be skipped.
+	 */
+	for (i = 0; i < num_counters; i++) {
+		size_t j;
+		enum kbase_ipa_core_type type = perf_counters[i].type;
+		u8 idx = perf_counters[i].idx;
+
+		if ((type >= KBASE_IPA_CORE_TYPE_NUM) ||
+		    (idx >= KBASE_IPA_CONTROL_CNT_MAX_IDX)) {
+			dev_err(kbdev->dev,
+				"%s: invalid requested type %u and/or index %u",
+				__func__, type, idx);
+			ret = -EINVAL;
+			goto exit;
+		}
+
+		for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
+			struct kbase_ipa_control_prfcnt_config *prfcnt_config =
+				&ipa_ctrl->blocks[type].select[j];
+
+			if (prfcnt_config->refcount > 0) {
+				if (prfcnt_config->idx == idx) {
+					already_configured[i] = true;
+					break;
+				}
+			}
+		}
+
+		if (j == KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS) {
+			already_configured[i] = false;
+			req_counters[type]++;
+			new_config = true;
+		}
+	}
+
+	for (i = 0; i < KBASE_IPA_CORE_TYPE_NUM; i++)
+		if (req_counters[i] >
+		    ipa_ctrl->blocks[i].num_available_counters) {
+			dev_err(kbdev->dev,
+				"%s: more counters (%zu) than available (%zu) have been requested for type %zu",
+				__func__, req_counters[i],
+				ipa_ctrl->blocks[i].num_available_counters, i);
+			ret = -EINVAL;
+			goto exit;
+		}
+
+	/*
+	 * The request has been validated.
+	 * Firstly, find an available session and then set up the initial state
+	 * of the session and update the configuration of performance counters
+	 * in the internal state of kbase_ipa_control.
+	 */
+	for (session_idx = 0; session_idx < KBASE_IPA_CONTROL_MAX_SESSIONS;
+	     session_idx++) {
+		session = &ipa_ctrl->sessions[session_idx];
+		if (!session->active)
+			break;
+	}
+
+	if (!session) {
+		dev_err(kbdev->dev, "%s: wrong or corrupt session state",
+			__func__);
+		ret = -EBUSY;
+		goto exit;
+	}
+
+	for (i = 0; i < num_counters; i++) {
+		struct kbase_ipa_control_prfcnt_config *prfcnt_config;
+		size_t j;
+		u8 type = perf_counters[i].type;
+		u8 idx = perf_counters[i].idx;
+
+		for (j = 0; j < KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS; j++) {
+			prfcnt_config = &ipa_ctrl->blocks[type].select[j];
+
+			if (already_configured[i]) {
+				if ((prfcnt_config->refcount > 0) &&
+				    (prfcnt_config->idx == idx)) {
+					break;
+				}
+			} else {
+				if (prfcnt_config->refcount == 0)
+					break;
+			}
+		}
+
+		if (WARN_ON((prfcnt_config->refcount > 0 &&
+			     prfcnt_config->idx != idx) ||
+			    (j == KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS))) {
+			dev_err(kbdev->dev,
+				"%s: invalid internal state: counter already configured or no counter available to configure",
+				__func__);
+			ret = -EBUSY;
+			goto exit;
+		}
+
+		if (prfcnt_config->refcount == 0) {
+			prfcnt_config->idx = idx;
+			ipa_ctrl->blocks[type].num_available_counters--;
+		}
+
+		session->prfcnts[i].accumulated_diff = 0;
+		session->prfcnts[i].type = type;
+		session->prfcnts[i].select_idx = j;
+		session->prfcnts[i].scaling_factor =
+			perf_counters[i].scaling_factor;
+		session->prfcnts[i].gpu_norm = perf_counters[i].gpu_norm;
+
+		/* Reports to this client for GPU time spent in protected mode
+		 * should begin from the point of registration.
+		 */
+		session->last_query_time = ktime_get_ns();
+
+		/* Initially, no time has been spent in protected mode */
+		session->protm_time = 0;
+
+		prfcnt_config->refcount++;
+	}
+
+	/*
+	 * Apply new configuration, if necessary.
+	 * As a temporary solution, make sure that the GPU is on
+	 * before applying the new configuration.
+	 */
+	if (new_config) {
+		u64 select_config[KBASE_IPA_CORE_TYPE_NUM];
+
+		build_select_config(ipa_ctrl, select_config);
+		ret = apply_select_config(kbdev, select_config);
+		if (ret)
+			dev_err(kbdev->dev,
+				"%s: failed to apply SELECT configuration",
+				__func__);
+	}
+
+	if (!ret) {
+		/* Accumulator registers don't contain any sample if the timer
+		 * has not been enabled first. Take a sample manually before
+		 * enabling the timer.
+		 */
+		if (ipa_ctrl->num_active_sessions == 0) {
+			kbase_reg_write(kbdev, IPA_CONTROL_REG(COMMAND),
+					COMMAND_SAMPLE);
+			ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
+			if (!ret) {
+				kbase_reg_write(
+					kbdev, IPA_CONTROL_REG(TIMER),
+					timer_value(ipa_ctrl->cur_gpu_rate));
+			} else {
+				dev_err(kbdev->dev,
+					"%s: failed to sample new counters",
+					__func__);
+			}
+		}
+	}
+
+	if (!ret) {
+		session->num_prfcnts = num_counters;
+		session->active = true;
+		ipa_ctrl->num_active_sessions++;
+		*client = session;
+
+		/*
+		 * Read current raw value to initialize the session.
+		 * This is necessary to put the first query in condition
+		 * to generate a correct value by calculating the difference
+		 * from the beginning of the session.
+		 */
+		for (i = 0; i < session->num_prfcnts; i++) {
+			struct kbase_ipa_control_prfcnt *prfcnt =
+				&session->prfcnts[i];
+			u64 raw_value = read_value_cnt(kbdev, (u8)prfcnt->type,
+						       prfcnt->select_idx);
+			prfcnt->latest_raw_value = raw_value;
+		}
+	}
+
+exit:
+	spin_unlock_irqrestore(&ipa_ctrl->lock, flags);
+	kbase_pm_context_idle(kbdev);
+	return ret;
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_register);
+
+int kbase_ipa_control_unregister(struct kbase_device *kbdev, const void *client)
+{
+	struct kbase_ipa_control *ipa_ctrl;
+	struct kbase_ipa_control_session *session;
+	int ret = 0;
+	size_t i;
+	unsigned long flags;
+	bool new_config = false, valid_session = false;
+
+	if (WARN_ON(kbdev == NULL) || WARN_ON(client == NULL)) {
+		dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
+		return -EINVAL;
+	}
+
+	kbase_pm_context_active(kbdev);
+
+	ipa_ctrl = &kbdev->csf.ipa_control;
+	session = (struct kbase_ipa_control_session *)client;
+
+	spin_lock_irqsave(&ipa_ctrl->lock, flags);
+
+	for (i = 0; i < KBASE_IPA_CONTROL_MAX_SESSIONS; i++) {
+		if (session == &ipa_ctrl->sessions[i]) {
+			valid_session = true;
+			break;
+		}
+	}
+
+	if (!valid_session) {
+		dev_err(kbdev->dev, "%s: invalid session handle", __func__);
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (ipa_ctrl->num_active_sessions == 0) {
+		dev_err(kbdev->dev, "%s: no active sessions found", __func__);
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (!session->active) {
+		dev_err(kbdev->dev, "%s: session is already inactive",
+			__func__);
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	for (i = 0; i < session->num_prfcnts; i++) {
+		struct kbase_ipa_control_prfcnt_config *prfcnt_config;
+		u8 type = session->prfcnts[i].type;
+		u8 idx = session->prfcnts[i].select_idx;
+
+		prfcnt_config = &ipa_ctrl->blocks[type].select[idx];
+
+		if (!WARN_ON(prfcnt_config->refcount == 0)) {
+			prfcnt_config->refcount--;
+			if (prfcnt_config->refcount == 0) {
+				new_config = true;
+				ipa_ctrl->blocks[type].num_available_counters++;
+			}
+		}
+	}
+
+	if (new_config) {
+		u64 select_config[KBASE_IPA_CORE_TYPE_NUM];
+
+		build_select_config(ipa_ctrl, select_config);
+		ret = apply_select_config(kbdev, select_config);
+		if (ret)
+			dev_err(kbdev->dev,
+				"%s: failed to apply SELECT configuration",
+				__func__);
+	}
+
+	session->num_prfcnts = 0;
+	session->active = false;
+	ipa_ctrl->num_active_sessions--;
+
+exit:
+	spin_unlock_irqrestore(&ipa_ctrl->lock, flags);
+	kbase_pm_context_idle(kbdev);
+	return ret;
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_unregister);
+
+int kbase_ipa_control_query(struct kbase_device *kbdev, const void *client,
+			    u64 *values, size_t num_values, u64 *protected_time)
+{
+	struct kbase_ipa_control *ipa_ctrl;
+	struct kbase_ipa_control_session *session;
+	size_t i;
+	unsigned long flags;
+	bool gpu_ready;
+
+	if (WARN_ON(kbdev == NULL) || WARN_ON(client == NULL) ||
+	    WARN_ON(values == NULL)) {
+		dev_err(kbdev->dev, "%s: wrong input arguments", __func__);
+		return -EINVAL;
+	}
+
+	ipa_ctrl = &kbdev->csf.ipa_control;
+	session = (struct kbase_ipa_control_session *)client;
+
+	if (WARN_ON(num_values < session->num_prfcnts)) {
+		dev_err(kbdev->dev,
+			"%s: not enough space (%zu) to return all counter values (%zu)",
+			__func__, num_values, session->num_prfcnts);
+		return -EINVAL;
+	}
+
+	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+	gpu_ready = kbdev->pm.backend.gpu_ready;
+
+	for (i = 0; i < session->num_prfcnts; i++) {
+		struct kbase_ipa_control_prfcnt *prfcnt = &session->prfcnts[i];
+
+		calc_prfcnt_delta(kbdev, prfcnt, gpu_ready);
+		/* Return all the accumulated difference */
+		values[i] = prfcnt->accumulated_diff;
+		prfcnt->accumulated_diff = 0;
+	}
+
+	if (protected_time) {
+		u64 time_now = ktime_get_ns();
+
+		/* This is the amount of protected-mode time spent prior to
+		 * the current protm period.
+		 */
+		*protected_time = session->protm_time;
+
+		if (kbdev->protected_mode) {
+			*protected_time +=
+				time_now - MAX(session->last_query_time,
+					       ipa_ctrl->protm_start);
+		}
+		session->last_query_time = time_now;
+		session->protm_time = 0;
+	}
+
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+
+	for (i = session->num_prfcnts; i < num_values; i++)
+		values[i] = 0;
+
+	return 0;
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_query);
+
+void kbase_ipa_control_handle_gpu_power_off(struct kbase_device *kbdev)
+{
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+	size_t session_idx;
+	int ret;
+
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	/* GPU should still be ready for use when this function gets called */
+	WARN_ON(!kbdev->pm.backend.gpu_ready);
+
+	/* Interrupts are already disabled and interrupt state is also saved */
+	spin_lock(&ipa_ctrl->lock);
+
+	/* First disable the automatic sampling through TIMER  */
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(TIMER), 0);
+	ret = wait_status(kbdev, STATUS_TIMER_ENABLED);
+	if (ret) {
+		dev_err(kbdev->dev,
+			"Wait for disabling of IPA control timer failed: %d",
+			ret);
+	}
+
+	/* Now issue the manual SAMPLE command */
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(COMMAND), COMMAND_SAMPLE);
+	ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
+	if (ret) {
+		dev_err(kbdev->dev,
+			"Wait for the completion of manual sample failed: %d",
+			ret);
+	}
+
+	for (session_idx = 0; session_idx < ipa_ctrl->num_active_sessions;
+	     session_idx++) {
+		struct kbase_ipa_control_session *session =
+			&ipa_ctrl->sessions[session_idx];
+		size_t i;
+
+		for (i = 0; i < session->num_prfcnts; i++) {
+			struct kbase_ipa_control_prfcnt *prfcnt =
+				&session->prfcnts[i];
+
+			calc_prfcnt_delta(kbdev, prfcnt, true);
+		}
+	}
+
+	spin_unlock(&ipa_ctrl->lock);
+}
+
+void kbase_ipa_control_handle_gpu_power_on(struct kbase_device *kbdev)
+{
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+	int ret;
+
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	/* GPU should have become ready for use when this function gets called */
+	WARN_ON(!kbdev->pm.backend.gpu_ready);
+
+	/* Interrupts are already disabled and interrupt state is also saved */
+	spin_lock(&ipa_ctrl->lock);
+
+	/* Re-issue the APPLY command, this is actually needed only for CSHW */
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(COMMAND), COMMAND_APPLY);
+	ret = wait_status(kbdev, STATUS_COMMAND_ACTIVE);
+	if (ret) {
+		dev_err(kbdev->dev,
+			"Wait for the completion of apply command failed: %d",
+			ret);
+	}
+
+	/* Re-enable the timer for periodic sampling */
+	kbase_reg_write(kbdev, IPA_CONTROL_REG(TIMER),
+			timer_value(ipa_ctrl->cur_gpu_rate));
+
+	spin_unlock(&ipa_ctrl->lock);
+}
+
+void kbase_ipa_control_handle_gpu_reset_pre(struct kbase_device *kbdev)
+{
+	/* A soft reset is treated as a power down */
+	kbase_ipa_control_handle_gpu_power_off(kbdev);
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_reset_pre);
+
+void kbase_ipa_control_handle_gpu_reset_post(struct kbase_device *kbdev)
+{
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+	int ret;
+	u32 status;
+
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	/* GPU should have become ready for use when this function gets called */
+	WARN_ON(!kbdev->pm.backend.gpu_ready);
+
+	/* Interrupts are already disabled and interrupt state is also saved */
+	spin_lock(&ipa_ctrl->lock);
+
+	/* Check the status reset bit is set before acknowledging it */
+	status = kbase_reg_read(kbdev, IPA_CONTROL_REG(STATUS));
+	if (status & STATUS_RESET) {
+		/* Acknowledge the reset command */
+		kbase_reg_write(kbdev, IPA_CONTROL_REG(COMMAND), COMMAND_RESET_ACK);
+		ret = wait_status(kbdev, STATUS_RESET);
+		if (ret) {
+			dev_err(kbdev->dev,
+				"Wait for the reset ack command failed: %d",
+				ret);
+		}
+	}
+
+	spin_unlock(&ipa_ctrl->lock);
+
+	kbase_ipa_control_handle_gpu_power_on(kbdev);
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_handle_gpu_reset_post);
+
+#if MALI_UNIT_TEST
+void kbase_ipa_control_rate_change_notify_test(struct kbase_device *kbdev,
+					       u32 clk_index, u32 clk_rate_hz)
+{
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+	struct kbase_ipa_control_listener_data *listener_data =
+		ipa_ctrl->rtm_listener_data;
+
+	kbase_ipa_control_rate_change_notify(&listener_data->listener,
+					     clk_index, clk_rate_hz);
+}
+KBASE_EXPORT_TEST_API(kbase_ipa_control_rate_change_notify_test);
+#endif
+
+void kbase_ipa_control_protm_entered(struct kbase_device *kbdev)
+{
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+	ipa_ctrl->protm_start = ktime_get_ns();
+}
+
+void kbase_ipa_control_protm_exited(struct kbase_device *kbdev)
+{
+	struct kbase_ipa_control *ipa_ctrl = &kbdev->csf.ipa_control;
+	size_t i;
+	u64 time_now = ktime_get_ns();
+	u32 status;
+
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	for (i = 0; i < ipa_ctrl->num_active_sessions; i++) {
+		struct kbase_ipa_control_session *session =
+			&ipa_ctrl->sessions[i];
+		u64 protm_time = time_now - MAX(session->last_query_time,
+						ipa_ctrl->protm_start);
+
+		session->protm_time += protm_time;
+	}
+
+	/* Acknowledge the protected_mode bit in the IPA_CONTROL STATUS
+	 * register
+	 */
+	status = kbase_reg_read(kbdev, IPA_CONTROL_REG(STATUS));
+	if (status & STATUS_PROTECTED_MODE) {
+		int ret;
+
+		/* Acknowledge the protm command */
+		kbase_reg_write(kbdev, IPA_CONTROL_REG(COMMAND),
+				COMMAND_PROTECTED_ACK);
+		ret = wait_status(kbdev, STATUS_PROTECTED_MODE);
+		if (ret) {
+			dev_err(kbdev->dev,
+				"Wait for the protm ack command failed: %d",
+				ret);
+		}
+	}
+}
+
--- a/drivers/gpu/arm/bifrost/csf/ipa_control/mali_kbase_csf_ipa_control.h
+++ b/drivers/gpu/arm/bifrost/csf/ipa_control/mali_kbase_csf_ipa_control.h
@@ -0,0 +1,244 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the
+ * GNU General Public License version 2 as published by the Free Software
+ * Foundation, and any use by you of this program is subject to the terms
+ * of such GNU license.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ */
+
+#ifndef _KBASE_CSF_IPA_CONTROL_H_
+#define _KBASE_CSF_IPA_CONTROL_H_
+
+#include <mali_kbase.h>
+
+/**
+ * Maximum index accepted to configure an IPA Control performance counter.
+ */
+#define KBASE_IPA_CONTROL_CNT_MAX_IDX ((u8)64 * 3)
+
+/**
+ * struct kbase_ipa_control_perf_counter - Performance counter description
+ *
+ * @scaling_factor: Scaling factor by which the counter's value shall be
+ *                  multiplied. A scaling factor of 1 corresponds to units
+ *                  of 1 second if values are normalised by GPU frequency.
+ * @gpu_norm:       Indicating whether counter values shall be normalized by
+ *                  GPU frequency. If true, returned values represent
+ *                  an interval of time expressed in seconds (when the scaling
+ *                  factor is set to 1).
+ * @type:           Type of counter block for performance counter.
+ * @idx:            Index of the performance counter inside the block.
+ *                  It may be dependent on GPU architecture.
+ *                  It cannot be greater than KBASE_IPA_CONTROL_CNT_MAX_IDX.
+ *
+ * This structure is used by clients of the IPA Control component to describe
+ * a performance counter that they intend to read. The counter is identified
+ * by block and index. In addition to that, the client also specifies how
+ * values shall be represented. Raw values are a number of GPU cycles;
+ * if normalized, they are divided by GPU frequency and become an interval
+ * of time expressed in seconds, since the GPU frequency is given in Hz.
+ * The client may specify a scaling factor to multiply counter values before
+ * they are divided by frequency, in case the unit of time of 1 second is
+ * too low in resolution. For instance: a scaling factor of 1000 implies
+ * that the returned value is a time expressed in milliseconds; a scaling
+ * factor of 1000 * 1000 implies that the returned value is a time expressed
+ * in microseconds.
+ */
+struct kbase_ipa_control_perf_counter {
+	u64 scaling_factor;
+	bool gpu_norm;
+	enum kbase_ipa_core_type type;
+	u8 idx;
+};
+
+/**
+ * kbase_ipa_control_init - Initialize the IPA Control component
+ *
+ * @kbdev: Pointer to Kbase device.
+ */
+void kbase_ipa_control_init(struct kbase_device *kbdev);
+
+/**
+ * kbase_ipa_control_term - Terminate the IPA Control component
+ *
+ * @kbdev: Pointer to Kbase device.
+ */
+void kbase_ipa_control_term(struct kbase_device *kbdev);
+
+/**
+ * kbase_ipa_control_register - Register a client to the IPA Control component
+ *
+ * @kbdev:         Pointer to Kbase device.
+ * @perf_counters: Array of performance counters the client intends to read.
+ *                 For each counter the client specifies block, index,
+ *                 scaling factor and whether it must be normalized by GPU
+ *                 frequency.
+ * @num_counters:  Number of performance counters. It cannot exceed the total
+ *                 number of counters that exist on the IPA Control interface.
+ * @client:        Handle to an opaque structure set by IPA Control if
+ *                 the registration is successful. This handle identifies
+ *                 a client's session and shall be provided in its future
+ *                 queries.
+ *
+ * A client needs to subscribe to the IPA Control component by declaring which
+ * performance counters it intends to read, and specifying a scaling factor
+ * and whether normalization is requested for each performance counter.
+ * The function shall configure the IPA Control interface accordingly and start
+ * a session for the client that made the request. A unique handle is returned
+ * if registration is successful in order to identify the client's session
+ * and be used for future queries.
+ *
+ * Return: 0 on success, negative -errno on error
+ */
+int kbase_ipa_control_register(
+	struct kbase_device *kbdev,
+	const struct kbase_ipa_control_perf_counter *perf_counters,
+	size_t num_counters, void **client);
+
+/**
+ * kbase_ipa_control_unregister - Unregister a client from IPA Control
+ *
+ * @kbdev:  Pointer to kbase device.
+ * @client: Handle to an opaque structure that identifies the client session
+ *          to terminate, as returned by kbase_ipa_control_register.
+ *
+ * Return: 0 on success, negative -errno on error
+ */
+int kbase_ipa_control_unregister(struct kbase_device *kbdev,
+				 const void *client);
+
+/**
+ * kbase_ipa_control_query - Query performance counters
+ *
+ * @kbdev:          Pointer to kbase device.
+ * @client:         Handle to an opaque structure that identifies the client
+ *                  session, as returned by kbase_ipa_control_register.
+ * @values:         Array of values queried from performance counters, whose
+ *                  length depends on the number of counters requested at
+ *                  the time of registration. Values are scaled and normalized
+ *                  and represent the difference since the last query.
+ * @num_values:     Number of entries in the array of values that has been
+ *                  passed by the caller. It must be at least equal to the
+ *                  number of performance counters the client registered itself
+ *                  to read.
+ * @protected_time: Time spent in protected mode since last query,
+ *                  expressed in nanoseconds. This pointer may be NULL if the
+ *                  client doesn't want to know about this.
+ *
+ * A client that has already opened a session by registering itself to read
+ * some performance counters may use this function to query the values of
+ * those counters. The values returned are normalized by GPU frequency if
+ * requested and then multiplied by the scaling factor provided at the time
+ * of registration. Values always represent a difference since the last query.
+ *
+ * Performance counters are not updated while the GPU operates in protected
+ * mode. For this reason, returned values may be unreliable if the GPU has
+ * been in protected mode since the last query. The function returns success
+ * in that case, but it also gives a measure of how much time has been spent
+ * in protected mode.
+ *
+ * Return: 0 on success, negative -errno on error
+ */
+int kbase_ipa_control_query(struct kbase_device *kbdev, const void *client,
+			    u64 *values, size_t num_values,
+			    u64 *protected_time);
+
+/**
+ * kbase_ipa_control_handle_gpu_power_on - Handle the GPU power on event
+ *
+ * @kbdev:          Pointer to kbase device.
+ *
+ * This function is called after GPU has been powered and is ready for use.
+ * After the GPU power on, IPA Control component needs to ensure that the
+ * counters start incrementing again.
+ */
+void kbase_ipa_control_handle_gpu_power_on(struct kbase_device *kbdev);
+
+/**
+ * kbase_ipa_control_handle_gpu_power_off - Handle the GPU power off event
+ *
+ * @kbdev:          Pointer to kbase device.
+ *
+ * This function is called just before the GPU is powered off when it is still
+ * ready for use.
+ * IPA Control component needs to be aware of the GPU power off so that it can
+ * handle the query from Clients appropriately and return meaningful values
+ * to them.
+ */
+void kbase_ipa_control_handle_gpu_power_off(struct kbase_device *kbdev);
+
+/**
+ * kbase_ipa_control_handle_gpu_reset_pre - Handle the pre GPU reset event
+ *
+ * @kbdev:          Pointer to kbase device.
+ *
+ * This function is called when the GPU is about to be reset.
+ */
+void kbase_ipa_control_handle_gpu_reset_pre(struct kbase_device *kbdev);
+
+/**
+ * kbase_ipa_control_handle_gpu_reset_post - Handle the post GPU reset event
+ *
+ * @kbdev:          Pointer to kbase device.
+ *
+ * This function is called after the GPU has been reset.
+ */
+void kbase_ipa_control_handle_gpu_reset_post(struct kbase_device *kbdev);
+
+#if MALI_UNIT_TEST
+/**
+ * kbase_ipa_control_rate_change_notify_test - Notify GPU rate change
+ *                                             (only for testing)
+ *
+ * @kbdev:       Pointer to kbase device.
+ * @clk_index:   Index of the clock for which the change has occurred.
+ * @clk_rate_hz: Clock frequency(Hz).
+ *
+ * Notify the IPA Control component about a GPU rate change.
+ */
+void kbase_ipa_control_rate_change_notify_test(struct kbase_device *kbdev,
+					       u32 clk_index, u32 clk_rate_hz);
+#endif /* MALI_UNIT_TEST */
+
+/**
+ * kbase_ipa_control_protm_entered - Tell IPA_CONTROL that protected mode
+ * has been entered.
+ *
+ * @kbdev:		Pointer to kbase device.
+ *
+ * This function provides a means through which IPA_CONTROL can be informed
+ * that the GPU has entered protected mode. Since the GPU cannot access
+ * performance counters while in this mode, this information is useful as
+ * it implies (a) the values of these registers cannot change, so theres no
+ * point trying to read them, and (b) IPA_CONTROL has a means through which
+ * to record the duration of time the GPU is in protected mode, which can
+ * then be forwarded on to clients, who may wish, for example, to assume
+ * that the GPU was busy 100% of the time while in this mode.
+ */
+void kbase_ipa_control_protm_entered(struct kbase_device *kbdev);
+
+/**
+ * kbase_ipa_control_protm_exited - Tell IPA_CONTROL that protected mode
+ * has been exited.
+ *
+ * @kbdev:		Pointer to kbase device
+ *
+ * This function provides a means through which IPA_CONTROL can be informed
+ * that the GPU has exited from protected mode.
+ */
+void kbase_ipa_control_protm_exited(struct kbase_device *kbdev);
+
+#endif /* _KBASE_CSF_IPA_CONTROL_H_ */
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf.c
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf.c
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf.h
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2018-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #ifndef _KBASE_CSF_H_
@@ -28,11 +27,11 @@
 #include "mali_kbase_csf_firmware.h"
 #include "mali_kbase_csf_protected_memory.h"

-/* Indicate invalid command stream h/w interface
+/* Indicate invalid CS h/w interface
 */
 #define KBASEP_IF_NR_INVALID ((s8)-1)

-/* Indicate invalid command stream group number for a GPU command queue group
+/* Indicate invalid CSG number for a GPU command queue group
 */
 #define KBASEP_CSG_NR_INVALID ((s8)-1)

@@ -40,13 +39,10 @@
 */
 #define KBASEP_USER_DB_NR_INVALID ((s8)-1)

-/* Waiting timeout for global request completion acknowledgment */
-#define GLB_REQ_WAIT_TIMEOUT_MS (300) /* 300 milliseconds */
-
-#define CSG_REQ_EP_CFG (0x1 << CSG_REQ_EP_CFG_SHIFT)
-#define CSG_REQ_SYNC_UPDATE (0x1 << CSG_REQ_SYNC_UPDATE_SHIFT)
 #define FIRMWARE_PING_INTERVAL_MS (2000) /* 2 seconds */

+#define FIRMWARE_IDLE_HYSTERESIS_TIME_MS (10) /* Default 10 milliseconds */
+
 /**
 * enum kbase_csf_event_callback_action - return type for CSF event callbacks.
 *
@@ -124,9 +120,9 @@ void kbase_csf_event_wait_remove(struct kbase_context *kctx,
 void kbase_csf_event_wait_remove_all(struct kbase_context *kctx);

 /**
- * kbase_csf_read_error - Read command stream fatal error
+ * kbase_csf_read_error - Read CS fatal error
 *
- * This function takes the command stream fatal error from context's ordered
+ * This function takes the CS fatal error from context's ordered
 * error_list, copies its contents to @event_data.
 *
 * @kctx:       The kbase context to read fatal error from
@@ -150,8 +146,8 @@ bool kbase_csf_error_pending(struct kbase_context *kctx);
 * kbase_csf_event_signal - Signal a CSF event
 *
 * This function triggers all the CSF event callbacks that are registered to
- * a given Kbase context, and also signals the thread of userspace driver
- * (front-end), waiting for the CSF event.
+ * a given Kbase context, and also signals the event handling thread of
+ * userspace driver waiting for the CSF event.
 *
 * @kctx:  The kbase context whose CSF event callbacks shall be triggered.
 * @notify_gpu: Flag to indicate if CSF firmware should be notified of the
@@ -171,8 +167,7 @@ static inline void kbase_csf_event_signal_cpu_only(struct kbase_context *kctx)
 }

 /**
- * kbase_csf_ctx_init - Initialize the command-stream front-end for a GPU
- *                      address space.
+ * kbase_csf_ctx_init - Initialize the CSF interface for a GPU address space.
 *
 * @kctx:	Pointer to the kbase context which is being initialized.
 *
@@ -194,8 +189,7 @@ void kbase_csf_ctx_handle_fault(struct kbase_context *kctx,
 		struct kbase_fault *fault);

 /**
- * kbase_csf_ctx_term - Terminate the command-stream front-end for a GPU
- *                      address space.
+ * kbase_csf_ctx_term - Terminate the CSF interface for a GPU address space.
 *
 * This function terminates any remaining CSGs and CSs which weren't destroyed
 * before context termination.
@@ -268,6 +262,16 @@ int kbase_csf_queue_bind(struct kbase_context *kctx,
 */
 void kbase_csf_queue_unbind(struct kbase_queue *queue);

+/**
+ * kbase_csf_queue_unbind_stopped - Unbind a GPU command queue in the case
+ *                                  where it was never started.
+ * @queue:      Pointer to queue to be unbound.
+ *
+ * Variant of kbase_csf_queue_unbind() for use on error paths for cleaning up
+ * queues that failed to fully bind.
+ */
+void kbase_csf_queue_unbind_stopped(struct kbase_queue *queue);
+
 /**
 * kbase_csf_queue_kick - Schedule a GPU command queue on the firmware
 *
@@ -280,7 +284,9 @@ void kbase_csf_queue_unbind(struct kbase_queue *queue);
 int kbase_csf_queue_kick(struct kbase_context *kctx,
 			 struct kbase_ioctl_cs_queue_kick *kick);

-/** Find if given the queue group handle is valid.
+/**
+ * kbase_csf_queue_group_handle_is_valid - Find if the given queue group handle
+ *                                         is valid.
 *
 * This function is used to determine if the queue group handle is valid.
 *
@@ -340,7 +346,6 @@ void kbase_csf_term_descheduled_queue_group(struct kbase_queue_group *group);
 *			suspended.
 * @sus_buf:		Pointer to the structure which contains details of the
 *			user buffer and its kernel pinned pages.
- * @size:		The size in bytes for the user provided buffer.
 * @group_handle:	Handle for the group which uniquely identifies it within
 *			the context within which it was created.
 *
@@ -350,6 +355,16 @@ void kbase_csf_term_descheduled_queue_group(struct kbase_queue_group *group);
 int kbase_csf_queue_group_suspend(struct kbase_context *kctx,
 	struct kbase_suspend_copy_buffer *sus_buf, u8 group_handle);

+/**
+ * kbase_csf_add_group_fatal_error - Report a fatal group error to userspace
+ *
+ * @group:       GPU command queue group.
+ * @err_payload: Error payload to report.
+ */
+void kbase_csf_add_group_fatal_error(
+	struct kbase_queue_group *const group,
+	struct base_gpu_queue_group_error const *const err_payload);
+
 /**
 * kbase_csf_interrupt - Handle interrupts issued by CSF firmware.
 *
@@ -359,55 +374,96 @@ int kbase_csf_queue_group_suspend(struct kbase_context *kctx,
 void kbase_csf_interrupt(struct kbase_device *kbdev, u32 val);

 /**
- * kbase_csf_doorbell_mapping_init - Initialize the bitmap of Hw doorbell pages
- *                           used to track their availability.
+ * kbase_csf_doorbell_mapping_init - Initialize the fields that facilitates
+ *                                   the update of userspace mapping of HW
+ *                                   doorbell page.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * The function creates a file and allocates a dummy page to facilitate the
+ * update of userspace mapping to point to the dummy page instead of the real
+ * HW doorbell page after the suspend of queue group.
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ *
+ * Return: 0 on success, or negative on failure.
 */
 int kbase_csf_doorbell_mapping_init(struct kbase_device *kbdev);

+/**
+ * kbase_csf_doorbell_mapping_term - Free the dummy page & close the file used
+ *                         to update the userspace mapping of HW doorbell page
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ */
 void kbase_csf_doorbell_mapping_term(struct kbase_device *kbdev);

 /**
- * kbase_csf_ring_csg_doorbell - ring the doorbell for a command stream group
- *                               interface.
+ * kbase_csf_setup_dummy_user_reg_page - Setup the dummy page that is accessed
+ *                                       instead of the User register page after
+ *                                       the GPU power down.
 *
- * The function kicks a notification on the command stream group interface to
- * firmware.
+ * The function allocates a dummy page which is used to replace the User
+ * register page in the userspace mapping after the power down of GPU.
+ * On the power up of GPU, the mapping is updated to point to the real
+ * User register page. The mapping is used to allow access to LATEST_FLUSH
+ * register from userspace.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
- * @slot: Index of command stream group interface for ringing the door-bell.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ *
+ * Return: 0 on success, or negative on failure.
+ */
+int kbase_csf_setup_dummy_user_reg_page(struct kbase_device *kbdev);
+
+/**
+ * kbase_csf_free_dummy_user_reg_page - Free the dummy page that was used
+ *                                 used to replace the User register page
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ */
+void kbase_csf_free_dummy_user_reg_page(struct kbase_device *kbdev);
+
+/**
+ * kbase_csf_ring_csg_doorbell - ring the doorbell for a CSG interface.
+ *
+ * The function kicks a notification on the CSG interface to firmware.
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ * @slot: Index of CSG interface for ringing the door-bell.
 */
 void kbase_csf_ring_csg_doorbell(struct kbase_device *kbdev, int slot);

 /**
- * kbase_csf_ring_csg_slots_doorbell - ring the doorbell for a set of command
- *                                     stream group interfaces.
+ * kbase_csf_ring_csg_slots_doorbell - ring the doorbell for a set of CSG
+ *                                     interfaces.
 *
- * The function kicks a notification on a set of command stream group
- * interfaces to firmware.
+ * The function kicks a notification on a set of CSG interfaces to firmware.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 * @slot_bitmap: bitmap for the given slots, slot-0 on bit-0, etc.
 */
 void kbase_csf_ring_csg_slots_doorbell(struct kbase_device *kbdev,
 				       u32 slot_bitmap);

 /**
- * kbase_csf_ring_cs_kernel_doorbell - ring the kernel doorbell for a queue
+ * kbase_csf_ring_cs_kernel_doorbell - ring the kernel doorbell for a CSI
+ *                                     assigned to a GPU queue
 *
- * The function kicks a notification to the firmware for the command stream
- * interface to which the queue is bound.
+ * The function sends a doorbell interrupt notification to the firmware for
+ * a CSI assigned to a GPU queue.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
- * @queue: Pointer to the queue for ringing the door-bell.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ * @csi_index: ID of the CSI assigned to the GPU queue.
+ * @csg_nr:    Index of the CSG slot assigned to the queue
+ *             group to which the GPU queue is bound.
+ * @ring_csg_doorbell: Flag to indicate if the CSG doorbell needs to be rung
+ *                     after updating the CSG_DB_REQ. So if this flag is false
+ *                     the doorbell interrupt will not be sent to FW.
+ *                     The flag is supposed be false only when the input page
+ *                     for bound GPU queues is programmed at the time of
+ *                     starting/resuming the group on a CSG slot.
 */
 void kbase_csf_ring_cs_kernel_doorbell(struct kbase_device *kbdev,
-			struct kbase_queue *queue);
+				       int csi_index, int csg_nr,
+				       bool ring_csg_doorbell);

 /**
 * kbase_csf_ring_cs_user_doorbell - ring the user doorbell allocated for a
@@ -416,8 +472,7 @@ void kbase_csf_ring_cs_kernel_doorbell(struct kbase_device *kbdev,
 * The function kicks a notification to the firmware on the doorbell assigned
 * to the queue.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 * @queue: Pointer to the queue for ringing the door-bell.
 */
 void kbase_csf_ring_cs_user_doorbell(struct kbase_device *kbdev,
@@ -427,9 +482,8 @@ void kbase_csf_ring_cs_user_doorbell(struct kbase_device *kbdev,
 * kbase_csf_active_queue_groups_reset - Reset the state of all active GPU
 *                            command queue groups associated with the context.
 *
- * @kbdev:     Instance of a GPU platform device that implements a command
- *             stream front-end interface.
- * @kctx:      The kbase context.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ * @kctx:  The kbase context.
 *
 * This function will iterate through all the active/scheduled GPU command
 * queue groups associated with the context, deschedule and mark them as
@@ -441,4 +495,54 @@ void kbase_csf_ring_cs_user_doorbell(struct kbase_device *kbdev,
 void kbase_csf_active_queue_groups_reset(struct kbase_device *kbdev,
 			struct kbase_context *kctx);

+/**
+ * kbase_csf_priority_check - Check the priority requested
+ *
+ * @kbdev:        Device pointer
+ * @req_priority: Requested priority
+ *
+ * This will determine whether the requested priority can be satisfied.
+ *
+ * Return: The same or lower priority than requested.
+ */
+u8 kbase_csf_priority_check(struct kbase_device *kbdev, u8 req_priority);
+
+extern const u8 kbasep_csf_queue_group_priority_to_relative[BASE_QUEUE_GROUP_PRIORITY_COUNT];
+extern const u8 kbasep_csf_relative_to_queue_group_priority[KBASE_QUEUE_GROUP_PRIORITY_COUNT];
+
+/**
+ * kbase_csf_priority_relative_to_queue_group_priority - Convert relative to base priority
+ *
+ * @priority: kbase relative priority
+ *
+ * This will convert the monotonically increasing realtive priority to the
+ * fixed base priority list.
+ *
+ * Return: base_queue_group_priority priority.
+ */
+static inline u8 kbase_csf_priority_relative_to_queue_group_priority(u8 priority)
+{
+	if (priority >= KBASE_QUEUE_GROUP_PRIORITY_COUNT)
+		priority = KBASE_QUEUE_GROUP_PRIORITY_LOW;
+	return kbasep_csf_relative_to_queue_group_priority[priority];
+}
+
+/**
+ * kbase_csf_priority_queue_group_priority_to_relative - Convert base priority to relative
+ *
+ * @priority: base_queue_group_priority priority
+ *
+ * This will convert the fixed base priority list to monotonically increasing realtive priority.
+ *
+ * Return: kbase relative priority.
+ */
+static inline u8 kbase_csf_priority_queue_group_priority_to_relative(u8 priority)
+{
+	/* Apply low priority in case of invalid priority */
+	if (priority >= BASE_QUEUE_GROUP_PRIORITY_COUNT)
+		priority = BASE_QUEUE_GROUP_PRIORITY_LOW;
+	return kbasep_csf_queue_group_priority_to_relative[priority];
+}
+
+
 #endif /* _KBASE_CSF_H_ */
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_cpu_queue_debugfs.c
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_cpu_queue_debugfs.c
@@ -0,0 +1,191 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the
+ * GNU General Public License version 2 as published by the Free Software
+ * Foundation, and any use by you of this program is subject to the terms
+ * of such GNU license.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ */
+
+#include "mali_kbase_csf_cpu_queue_debugfs.h"
+#include <mali_kbase.h>
+#include <linux/seq_file.h>
+
+#ifdef CONFIG_DEBUG_FS
+
+bool kbase_csf_cpu_queue_read_dump_req(struct kbase_context *kctx,
+					struct base_csf_notification *req)
+{
+	if (atomic_cmpxchg(&kctx->csf.cpu_queue.dump_req_status,
+			   BASE_CSF_CPU_QUEUE_DUMP_ISSUED,
+			   BASE_CSF_CPU_QUEUE_DUMP_PENDING) !=
+		BASE_CSF_CPU_QUEUE_DUMP_ISSUED) {
+		return false;
+	}
+
+	req->type = BASE_CSF_NOTIFICATION_CPU_QUEUE_DUMP;
+	return true;
+}
+
+/**
+ * kbasep_csf_cpu_queue_debugfs_show() - Print cpu queue information for per context
+ *
+ * @file: The seq_file for printing to
+ * @data: The debugfs dentry private data, a pointer to kbase_context
+ *
+ * Return: Negative error code or 0 on success.
+ */
+static int kbasep_csf_cpu_queue_debugfs_show(struct seq_file *file, void *data)
+{
+	struct kbase_context *kctx = file->private;
+
+	mutex_lock(&kctx->csf.lock);
+	if (atomic_read(&kctx->csf.cpu_queue.dump_req_status) !=
+				BASE_CSF_CPU_QUEUE_DUMP_COMPLETE) {
+		seq_printf(file, "Dump request already started! (try again)\n");
+		mutex_unlock(&kctx->csf.lock);
+		return -EBUSY;
+	}
+
+	atomic_set(&kctx->csf.cpu_queue.dump_req_status, BASE_CSF_CPU_QUEUE_DUMP_ISSUED);
+	init_completion(&kctx->csf.cpu_queue.dump_cmp);
+	kbase_event_wakeup(kctx);
+	mutex_unlock(&kctx->csf.lock);
+
+	seq_printf(file, "CPU Queues table (version:v%u):\n", MALI_CSF_CPU_QUEUE_DEBUGFS_VERSION);
+
+	wait_for_completion_timeout(&kctx->csf.cpu_queue.dump_cmp,
+			msecs_to_jiffies(3000));
+
+	mutex_lock(&kctx->csf.lock);
+	if (kctx->csf.cpu_queue.buffer) {
+		WARN_ON(atomic_read(&kctx->csf.cpu_queue.dump_req_status) !=
+				    BASE_CSF_CPU_QUEUE_DUMP_PENDING);
+
+		seq_printf(file, "%s\n", kctx->csf.cpu_queue.buffer);
+
+		kfree(kctx->csf.cpu_queue.buffer);
+		kctx->csf.cpu_queue.buffer = NULL;
+		kctx->csf.cpu_queue.buffer_size = 0;
+	}
+	else
+		seq_printf(file, "Dump error! (time out)\n");
+
+	atomic_set(&kctx->csf.cpu_queue.dump_req_status,
+			BASE_CSF_CPU_QUEUE_DUMP_COMPLETE);
+
+	mutex_unlock(&kctx->csf.lock);
+	return 0;
+}
+
+static int kbasep_csf_cpu_queue_debugfs_open(struct inode *in, struct file *file)
+{
+	return single_open(file, kbasep_csf_cpu_queue_debugfs_show, in->i_private);
+}
+
+static const struct file_operations kbasep_csf_cpu_queue_debugfs_fops = {
+	.open = kbasep_csf_cpu_queue_debugfs_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+void kbase_csf_cpu_queue_debugfs_init(struct kbase_context *kctx)
+{
+	struct dentry *file;
+
+	if (WARN_ON(!kctx || IS_ERR_OR_NULL(kctx->kctx_dentry)))
+		return;
+
+	file = debugfs_create_file("cpu_queue", 0444, kctx->kctx_dentry,
+			kctx, &kbasep_csf_cpu_queue_debugfs_fops);
+
+	if (IS_ERR_OR_NULL(file)) {
+		dev_warn(kctx->kbdev->dev,
+				"Unable to create cpu queue debugfs entry");
+	}
+
+	kctx->csf.cpu_queue.buffer = NULL;
+	kctx->csf.cpu_queue.buffer_size = 0;
+	atomic_set(&kctx->csf.cpu_queue.dump_req_status,
+		   BASE_CSF_CPU_QUEUE_DUMP_COMPLETE);
+}
+
+int kbase_csf_cpu_queue_dump(struct kbase_context *kctx,
+		u64 buffer, size_t buf_size)
+{
+	int err = 0;
+
+	size_t alloc_size = buf_size;
+	char *dump_buffer;
+
+	if (!buffer || !alloc_size)
+		goto done;
+
+	alloc_size = (alloc_size + PAGE_SIZE) & ~(PAGE_SIZE - 1);
+	dump_buffer = kzalloc(alloc_size, GFP_KERNEL);
+	if (ZERO_OR_NULL_PTR(dump_buffer)) {
+		err = -ENOMEM;
+		goto done;
+	}
+
+	WARN_ON(kctx->csf.cpu_queue.buffer != NULL);
+
+	err = copy_from_user(dump_buffer,
+			u64_to_user_ptr(buffer),
+			buf_size);
+	if (err) {
+		kfree(dump_buffer);
+		err = -EFAULT;
+		goto done;
+	}
+
+	mutex_lock(&kctx->csf.lock);
+
+	kfree(kctx->csf.cpu_queue.buffer);
+
+	if (atomic_read(&kctx->csf.cpu_queue.dump_req_status) ==
+			BASE_CSF_CPU_QUEUE_DUMP_PENDING) {
+		kctx->csf.cpu_queue.buffer = dump_buffer;
+		kctx->csf.cpu_queue.buffer_size = buf_size;
+		complete_all(&kctx->csf.cpu_queue.dump_cmp);
+	} else {
+		kfree(dump_buffer);
+	}
+
+	mutex_unlock(&kctx->csf.lock);
+done:
+	return err;
+}
+#else
+/*
+ * Stub functions for when debugfs is disabled
+ */
+void kbase_csf_cpu_queue_debugfs_init(struct kbase_context *kctx)
+{
+}
+
+bool kbase_csf_cpu_queue_read_dump_req(struct kbase_context *kctx,
+					struct base_csf_notification *req)
+{
+	return false;
+}
+
+int kbase_csf_cpu_queue_dump(struct kbase_context *kctx,
+			u64 buffer, size_t buf_size)
+{
+	return 0;
+}
+#endif /* CONFIG_DEBUG_FS */
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_cpu_queue_debugfs.h
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_cpu_queue_debugfs.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the
+ * GNU General Public License version 2 as published by the Free Software
+ * Foundation, and any use by you of this program is subject to the terms
+ * of such GNU license.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ */
+
+#ifndef _KBASE_CSF_CPU_QUEUE_DEBUGFS_H_
+#define _KBASE_CSF_CPU_QUEUE_DEBUGFS_H_
+
+#include <asm/atomic.h>
+#include <linux/types.h>
+
+#include "mali_kbase.h"
+
+/* Forward declaration */
+struct base_csf_notification;
+
+#define MALI_CSF_CPU_QUEUE_DEBUGFS_VERSION 0
+
+/* CPU queue dump status */
+/* Dumping is done or no dumping is in progress. */
+#define BASE_CSF_CPU_QUEUE_DUMP_COMPLETE	0
+/* Dumping request is pending. */
+#define BASE_CSF_CPU_QUEUE_DUMP_PENDING		1
+/* Dumping request is issued to Userspace */
+#define BASE_CSF_CPU_QUEUE_DUMP_ISSUED		2
+
+
+/**
+ * kbase_csf_cpu_queue_debugfs_init() - Create a debugfs entry for per context cpu queue(s)
+ *
+ * @kctx: The kbase_context for which to create the debugfs entry
+ */
+void kbase_csf_cpu_queue_debugfs_init(struct kbase_context *kctx);
+
+/**
+ * kbase_csf_cpu_queue_read_dump_req - Read cpu queue dump request event
+ *
+ * @kctx: The kbase_context which cpu queue dumpped belongs to
+ * @req:  Notification with cpu queue dump request.
+ *
+ * Return: true if needs CPU queue dump, or false otherwise.
+ */
+bool kbase_csf_cpu_queue_read_dump_req(struct kbase_context *kctx,
+					struct base_csf_notification *req);
+
+/**
+ * kbase_csf_cpu_queue_dump_needed - Check the requirement for cpu queue dump
+ *
+ * @kctx: The kbase_context which cpu queue dumpped belongs to
+ *
+ * Return: true if it needs cpu queue dump, or false otherwise.
+ */
+static inline bool kbase_csf_cpu_queue_dump_needed(struct kbase_context *kctx)
+{
+#ifdef CONFIG_DEBUG_FS
+	return (atomic_read(&kctx->csf.cpu_queue.dump_req_status) ==
+		BASE_CSF_CPU_QUEUE_DUMP_ISSUED);
+#else
+	return false;
+#endif
+}
+
+/**
+ * kbase_csf_cpu_queue_dump - dump buffer containing cpu queue information to debugfs
+ *
+ * @kctx: The kbase_context which cpu queue dumpped belongs to
+ * @buffer: Buffer containing the cpu queue information.
+ * @buf_size: Buffer size.
+ *
+ * Return: Return 0 for dump successfully, or error code.
+ */
+int kbase_csf_cpu_queue_dump(struct kbase_context *kctx,
+		u64 buffer, size_t buf_size);
+#endif /* _KBASE_CSF_CPU_QUEUE_DEBUGFS_H_ */
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_csg_debugfs.c
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_csg_debugfs.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
 * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
@@ -5,7 +6,7 @@
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include "mali_kbase_csf_csg_debugfs.h"
@@ -29,12 +28,37 @@
 #ifdef CONFIG_DEBUG_FS
 #include "mali_kbase_csf_tl_reader.h"

+/**
+ * blocked_reason_to_string() - Convert blocking reason id to a string
+ *
+ * @reason_id: blocked_reason
+ *
+ * Return: Suitable string
+ */
+static const char *blocked_reason_to_string(u32 reason_id)
+{
+	/* possible blocking reasons of a cs */
+	static const char *const cs_blocked_reason[] = {
+		[CS_STATUS_BLOCKED_REASON_REASON_UNBLOCKED] = "UNBLOCKED",
+		[CS_STATUS_BLOCKED_REASON_REASON_WAIT] = "WAIT",
+		[CS_STATUS_BLOCKED_REASON_REASON_PROGRESS_WAIT] =
+			"PROGRESS_WAIT",
+		[CS_STATUS_BLOCKED_REASON_REASON_SYNC_WAIT] = "SYNC_WAIT",
+		[CS_STATUS_BLOCKED_REASON_REASON_DEFERRED] = "DEFERRED",
+		[CS_STATUS_BLOCKED_REASON_REASON_RESOURCE] = "RESOURCE",
+		[CS_STATUS_BLOCKED_REASON_REASON_FLUSH] = "FLUSH"
+	};
+
+	if (WARN_ON(reason_id >= ARRAY_SIZE(cs_blocked_reason)))
+		return "UNKNOWN_BLOCKED_REASON_ID";
+
+	return cs_blocked_reason[reason_id];
+}
+
 static void kbasep_csf_scheduler_dump_active_queue_cs_status_wait(
-		struct seq_file *file,
-		u32 wait_status,
-		u32 wait_sync_value,
-		u64 wait_sync_live_value,
-		u64 wait_sync_pointer)
+	struct seq_file *file, u32 wait_status, u32 wait_sync_value,
+	u64 wait_sync_live_value, u64 wait_sync_pointer, u32 sb_status,
+	u32 blocked_reason)
 {
 #define WAITING "Waiting"
 #define NOT_WAITING "Not waiting"
@@ -56,6 +80,11 @@ static void kbasep_csf_scheduler_dump_active_queue_cs_status_wait(
 	seq_printf(file, "SYNC_POINTER: 0x%llx\n", wait_sync_pointer);
 	seq_printf(file, "SYNC_VALUE: %d\n", wait_sync_value);
 	seq_printf(file, "SYNC_LIVE_VALUE: 0x%016llx\n", wait_sync_live_value);
+	seq_printf(file, "SB_STATUS: %u\n",
+		   CS_STATUS_SCOREBOARDS_NONZERO_GET(sb_status));
+	seq_printf(file, "BLOCKED_REASON: %s\n",
+		   blocked_reason_to_string(CS_STATUS_BLOCKED_REASON_REASON_GET(
+			   blocked_reason)));
 }

 /**
@@ -74,6 +103,8 @@ static void kbasep_csf_scheduler_dump_active_queue(struct seq_file *file,
 	u32 cs_active;
 	u64 wait_sync_pointer;
 	u32 wait_status, wait_sync_value;
+	u32 sb_status;
+	u32 blocked_reason;
 	struct kbase_vmap_struct *mapping;
 	u64 *evt;
 	u64 wait_sync_live_value;
@@ -109,6 +140,8 @@ static void kbasep_csf_scheduler_dump_active_queue(struct seq_file *file,
 			wait_status = queue->status_wait;
 			wait_sync_value = queue->sync_value;
 			wait_sync_pointer = queue->sync_ptr;
+			sb_status = queue->sb_status;
+			blocked_reason = queue->blocked_reason;

 			evt = (u64 *)kbase_phy_alloc_mapping_get(queue->kctx, wait_sync_pointer, &mapping);
 			if (evt) {
@@ -120,7 +153,8 @@ static void kbasep_csf_scheduler_dump_active_queue(struct seq_file *file,

 			kbasep_csf_scheduler_dump_active_queue_cs_status_wait(
 				file, wait_status, wait_sync_value,
-				wait_sync_live_value, wait_sync_pointer);
+				wait_sync_live_value, wait_sync_pointer,
+				sb_status, blocked_reason);
 		}
 	} else {
 		struct kbase_device const *const kbdev =
@@ -161,6 +195,11 @@ static void kbasep_csf_scheduler_dump_active_queue(struct seq_file *file,
 		wait_sync_pointer |= (u64)kbase_csf_firmware_cs_output(stream,
 					CS_STATUS_WAIT_SYNC_POINTER_HI) << 32;

+		sb_status = kbase_csf_firmware_cs_output(stream,
+							 CS_STATUS_SCOREBOARDS);
+		blocked_reason = kbase_csf_firmware_cs_output(
+			stream, CS_STATUS_BLOCKED_REASON);
+
 		evt = (u64 *)kbase_phy_alloc_mapping_get(queue->kctx, wait_sync_pointer, &mapping);
 		if (evt) {
 			wait_sync_live_value = evt[0];
@@ -171,7 +210,8 @@ static void kbasep_csf_scheduler_dump_active_queue(struct seq_file *file,

 		kbasep_csf_scheduler_dump_active_queue_cs_status_wait(
 			file, wait_status, wait_sync_value,
-			wait_sync_live_value, wait_sync_pointer);
+			wait_sync_live_value, wait_sync_pointer, sb_status,
+			blocked_reason);
 	}

 	seq_puts(file, "\n");
@@ -428,6 +468,61 @@ DEFINE_SIMPLE_ATTRIBUTE(kbasep_csf_debugfs_scheduling_timer_kick_fops,
 		&kbasep_csf_debugfs_scheduling_timer_kick_set,
 		"%llu\n");

+/**
+ * kbase_csf_debugfs_scheduler_suspend_get() - get if the scheduler is suspended.
+ *
+ * @data: The debugfs dentry private data, a pointer to kbase_device
+ * @val: The debugfs output value, boolean: 1 suspended, 0 otherwise
+ *
+ * Return: 0
+ */
+static int kbase_csf_debugfs_scheduler_suspend_get(
+		void *data, u64 *val)
+{
+	struct kbase_device *kbdev = data;
+	struct kbase_csf_scheduler *scheduler = &kbdev->csf.scheduler;
+
+	kbase_csf_scheduler_lock(kbdev);
+	*val = (scheduler->state == SCHED_SUSPENDED);
+	kbase_csf_scheduler_unlock(kbdev);
+
+	return 0;
+}
+
+/**
+ * kbase_csf_debugfs_scheduler_suspend_set() - set the scheduler to suspended.
+ *
+ * @data: The debugfs dentry private data, a pointer to kbase_device
+ * @val: The debugfs input value, boolean: 1 suspend, 0 otherwise
+ *
+ * Return: Negative value if already in requested state, 0 otherwise.
+ */
+static int kbase_csf_debugfs_scheduler_suspend_set(
+		void *data, u64 val)
+{
+	struct kbase_device *kbdev = data;
+	struct kbase_csf_scheduler *scheduler = &kbdev->csf.scheduler;
+	enum kbase_csf_scheduler_state state;
+
+	kbase_csf_scheduler_lock(kbdev);
+	state = scheduler->state;
+	kbase_csf_scheduler_unlock(kbdev);
+
+	if (val && (state != SCHED_SUSPENDED))
+		kbase_csf_scheduler_pm_suspend(kbdev);
+	else if (!val && (state == SCHED_SUSPENDED))
+		kbase_csf_scheduler_pm_resume(kbdev);
+	else
+		return -1;
+
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(kbasep_csf_debugfs_scheduler_suspend_fops,
+		&kbase_csf_debugfs_scheduler_suspend_get,
+		&kbase_csf_debugfs_scheduler_suspend_set,
+		"%llu\n");
+
 void kbase_csf_debugfs_init(struct kbase_device *kbdev)
 {
 	debugfs_create_file("active_groups", 0444,
@@ -440,6 +535,9 @@ void kbase_csf_debugfs_init(struct kbase_device *kbdev)
 	debugfs_create_file("scheduling_timer_kick", 0200,
 			kbdev->mali_debugfs_directory, kbdev,
 			&kbasep_csf_debugfs_scheduling_timer_kick_fops);
+	debugfs_create_file("scheduler_suspend", 0644,
+			kbdev->mali_debugfs_directory, kbdev,
+			&kbasep_csf_debugfs_scheduler_suspend_fops);

 	kbase_csf_tl_reader_debugfs_init(kbdev);
 	kbase_csf_firmware_trace_buffer_debugfs_init(kbdev);
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_csg_debugfs.h
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_csg_debugfs.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #ifndef _KBASE_CSF_CSG_DEBUGFS_H_
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_defs.h
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_defs.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2018-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,11 +17,9 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

-/* Definitions (types, defines, etcs) common to the command stream frontend.
+/* Definitions (types, defines, etcs) common to the CSF.
 * They are placed here to allow the hierarchy of header files to work.
 */

@@ -46,6 +45,14 @@
 */
 #define MAX_TILER_HEAPS (128)

+#define CSF_FIRMWARE_ENTRY_READ       (1ul << 0)
+#define CSF_FIRMWARE_ENTRY_WRITE      (1ul << 1)
+#define CSF_FIRMWARE_ENTRY_EXECUTE    (1ul << 2)
+#define CSF_FIRMWARE_ENTRY_CACHE_MODE (3ul << 3)
+#define CSF_FIRMWARE_ENTRY_PROTECTED  (1ul << 5)
+#define CSF_FIRMWARE_ENTRY_SHARED     (1ul << 30)
+#define CSF_FIRMWARE_ENTRY_ZERO       (1ul << 31)
+
 /**
 * enum kbase_csf_bind_state - bind state of the queue
 *
@@ -66,18 +73,36 @@ enum kbase_csf_queue_bind_state {
 * enum kbase_csf_reset_gpu_state - state of the gpu reset
 *
 * @KBASE_CSF_RESET_GPU_NOT_PENDING: Set when the GPU reset isn't pending
+ *
+ * @KBASE_CSF_RESET_GPU_PREPARED: Set when kbase_prepare_to_reset_gpu() has
+ * been called. This is just for debugging checks to encourage callers to call
+ * kbase_prepare_to_reset_gpu() before kbase_reset_gpu().
+ *
+ * @KBASE_CSF_RESET_GPU_COMMITTED: Set when the GPU reset process has been
+ * committed and so will definitely happen, but the procedure to reset the GPU
+ * has not yet begun. Other threads must finish accessing the HW before we
+ * reach %KBASE_CSF_RESET_GPU_HAPPENING.
+ *
 * @KBASE_CSF_RESET_GPU_HAPPENING: Set when the GPU reset process is occurring
- * @KBASE_CSF_RESET_GPU_SILENT: Set when the GPU reset process is occurring,
- * used when resetting the GPU as part of normal behavior (e.g. when exiting
- * protected mode).
+ * (silent or otherwise), and is actively accessing the HW. Any changes to the
+ * HW in other threads might get lost, overridden, or corrupted.
+ *
+ * @KBASE_CSF_RESET_GPU_COMMITTED_SILENT: Set when the GPU reset process has
+ * been committed but has not started happening. This is used when resetting
+ * the GPU as part of normal behavior (e.g. when exiting protected mode).
+ * Other threads must finish accessing the HW before we reach
+ * %KBASE_CSF_RESET_GPU_HAPPENING.
+ *
 * @KBASE_CSF_RESET_GPU_FAILED: Set when an error is encountered during the
 * GPU reset process. No more work could then be executed on GPU, unloading
 * the Driver module is the only option.
 */
 enum kbase_csf_reset_gpu_state {
 	KBASE_CSF_RESET_GPU_NOT_PENDING,
+	KBASE_CSF_RESET_GPU_PREPARED,
+	KBASE_CSF_RESET_GPU_COMMITTED,
 	KBASE_CSF_RESET_GPU_HAPPENING,
-	KBASE_CSF_RESET_GPU_SILENT,
+	KBASE_CSF_RESET_GPU_COMMITTED_SILENT,
 	KBASE_CSF_RESET_GPU_FAILED,
 };

@@ -86,17 +111,17 @@ enum kbase_csf_reset_gpu_state {
 *
 * @KBASE_CSF_GROUP_INACTIVE:          Group is inactive and won't be
 *                                     considered by scheduler for running on
- *                                     command stream group slot.
+ *                                     CSG slot.
 * @KBASE_CSF_GROUP_RUNNABLE:          Group is in the list of runnable groups
 *                                     and is subjected to time-slice based
 *                                     scheduling. A start request would be
 *                                     sent (or already has been sent) if the
- *                                     group is assigned the command stream
+ *                                     group is assigned the CS
 *                                     group slot for the fist time.
- * @KBASE_CSF_GROUP_IDLE:              Group is currently on a command stream
- *                                     group slot but all the command streams
- *                                     bound to the group have become either
- *                                     idle or waiting on sync object.
+ * @KBASE_CSF_GROUP_IDLE:              Group is currently on a CSG slot
+ *                                     but all the CSs bound to the group have
+ *                                     become either idle or waiting on sync
+ *                                     object.
 *                                     Group could be evicted from the slot on
 *                                     the next tick if there are no spare
 *                                     slots left after scheduling non-idle
@@ -110,12 +135,11 @@ enum kbase_csf_reset_gpu_state {
 *                                     KBASE_CSF_GROUP_SUSPENDED_ON_IDLE or
 *                                     KBASE_CSF_GROUP_SUSPENDED_ON_WAIT_SYNC
 *                                     state.
- * @KBASE_CSF_GROUP_SUSPENDED:         Group was evicted from the command
- *                                     stream group slot and is not running but
- *                                     is still in the list of runnable groups
- *                                     and subjected to time-slice based
- *                                     scheduling. A resume request would be
- *                                     sent when a command stream group slot is
+ * @KBASE_CSF_GROUP_SUSPENDED:         Group was evicted from the CSG slot
+ *                                     and is not running but is still in the
+ *                                     list of runnable groups and subjected
+ *                                     to time-slice based scheduling. A resume
+ *                                     request would be sent when a CSG slot is
 *                                     re-assigned to the group and once the
 *                                     resume is complete group would be moved
 *                                     back to the RUNNABLE state.
@@ -128,8 +152,8 @@ enum kbase_csf_reset_gpu_state {
 *                                     bound to the group is kicked it would be
 *                                     moved to the SUSPENDED state.
 * @KBASE_CSF_GROUP_SUSPENDED_ON_WAIT_SYNC: Same as GROUP_SUSPENDED_ON_IDLE
- *                                          except that at least one command
- *                                          stream bound to this group was
+ *                                          except that at least one CS
+ *                                          bound to this group was
 *                                          waiting for synchronization object
 *                                          before the suspension.
 * @KBASE_CSF_GROUP_FAULT_EVICTED:     Group is evicted from the scheduler due
@@ -185,10 +209,10 @@ enum kbase_csf_csg_slot_state {
 * enum kbase_csf_scheduler_state - state of the scheduler operational phases.
 *
 * @SCHED_BUSY:         The scheduler is busy performing on tick schedule
- *                      operations, the state of command stream group slots
+ *                      operations, the state of CSG slots
 *                      can't be changed.
 * @SCHED_INACTIVE:     The scheduler is inactive, it is allowed to modify the
- *                      state of command stream group slots by in-cycle
+ *                      state of CSG slots by in-cycle
 *                      priority scheduling.
 * @SCHED_SUSPENDED:    The scheduler is in low-power mode with scheduling
 *                      operations suspended and is not holding the power
@@ -202,6 +226,24 @@ enum kbase_csf_scheduler_state {
 	SCHED_SUSPENDED,
 };

+/**
+ * enum kbase_queue_group_priority - Kbase internal relative priority list.
+ *
+ * @KBASE_QUEUE_GROUP_PRIORITY_REALTIME:  The realtime queue group priority.
+ * @KBASE_QUEUE_GROUP_PRIORITY_HIGH:      The high queue group priority.
+ * @KBASE_QUEUE_GROUP_PRIORITY_MEDIUM:    The medium queue group priority.
+ * @KBASE_QUEUE_GROUP_PRIORITY_LOW:       The low queue group priority.
+ * @KBASE_QUEUE_GROUP_PRIORITY_COUNT:     The number of priority levels.
+ */
+enum kbase_queue_group_priority {
+	KBASE_QUEUE_GROUP_PRIORITY_REALTIME = 0,
+	KBASE_QUEUE_GROUP_PRIORITY_HIGH,
+	KBASE_QUEUE_GROUP_PRIORITY_MEDIUM,
+	KBASE_QUEUE_GROUP_PRIORITY_LOW,
+	KBASE_QUEUE_GROUP_PRIORITY_COUNT
+};
+
+
 /**
 * struct kbase_csf_notification - Event or error generated as part of command
 *                                 queue execution
@@ -240,37 +282,43 @@ struct kbase_csf_notification {
 * @refcount:    Reference count, stands for the number of times the queue
 *               has been referenced. The reference is taken when it is
 *               created, when it is bound to the group and also when the
- *               @oom_event_work or @fault_event_work work item is queued
+ *               @oom_event_work work item is queued
 *               for it.
 * @group:       Pointer to the group to which this queue is bound.
- * @queue_reg:   Pointer to the VA region allocated for command
- *               stream buffer.
+ * @queue_reg:   Pointer to the VA region allocated for CS buffer.
 * @oom_event_work: Work item corresponding to the out of memory event for
 *                  chunked tiler heap being used for this queue.
- * @fault_event_work: Work item corresponding to the firmware fault event.
- * @base_addr:      Base address of the command stream buffer.
- * @size:           Size of the command stream buffer.
+ * @base_addr:      Base address of the CS buffer.
+ * @size:           Size of the CS buffer.
 * @priority:       Priority of this queue within the group.
- * @bind_state:     Bind state of the queue.
- * @csi_index:      The ID of the assigned command stream hardware interface.
- * @enabled:        Indicating whether the command stream is running, or not.
- * @status_wait:    Value of CS_STATUS_WAIT register of the command stream will
- *                  be kept when the command stream gets blocked by sync wait.
+ * @bind_state:     Bind state of the queue as enum @kbase_csf_queue_bind_state
+ * @csi_index:      The ID of the assigned CS hardware interface.
+ * @enabled:        Indicating whether the CS is running, or not.
+ * @status_wait:    Value of CS_STATUS_WAIT register of the CS will
+ *                  be kept when the CS gets blocked by sync wait.
 *                  CS_STATUS_WAIT provides information on conditions queue is
 *                  blocking on. This is set when the group, to which queue is
 *                  bound, is suspended after getting blocked, i.e. in
 *                  KBASE_CSF_GROUP_SUSPENDED_ON_WAIT_SYNC state.
- * @sync_ptr:       Value of CS_STATUS_WAIT_SYNC_POINTER register of the command
- *                  stream will be kept when the command stream gets blocked by
+ * @sync_ptr:       Value of CS_STATUS_WAIT_SYNC_POINTER register of the CS
+ *                  will be kept when the CS gets blocked by
 *                  sync wait. CS_STATUS_WAIT_SYNC_POINTER contains the address
 *                  of synchronization object being waited on.
 *                  Valid only when @status_wait is set.
- * @sync_value:     Value of CS_STATUS_WAIT_SYNC_VALUE register of the command
- *                  stream will be kept when the command stream gets blocked by
+ * @sync_value:     Value of CS_STATUS_WAIT_SYNC_VALUE register of the CS
+ *                  will be kept when the CS gets blocked by
 *                  sync wait. CS_STATUS_WAIT_SYNC_VALUE contains the value
 *                  tested against the synchronization object.
 *                  Valid only when @status_wait is set.
+ * @sb_status:      Value indicates which of the scoreboard entries in the queue
+ *                  are non-zero
+ * @blocked_reason: Value shows if the queue is blocked, and if so,
+ *                  the reason why it is blocked
 * @error:          GPU command queue fatal information to pass to user space.
+ * @fatal_event_work: Work item to handle the CS fatal event reported for this
+ *                    queue.
+ * @cs_fatal_info:    Records additional information about the CS fatal event.
+ * @cs_fatal:         Records information about the CS fatal event.
 */
 struct kbase_queue {
 	struct kbase_context *kctx;
@@ -285,17 +333,21 @@ struct kbase_queue {
 	struct kbase_queue_group *group;
 	struct kbase_va_region *queue_reg;
 	struct work_struct oom_event_work;
-	struct work_struct fault_event_work;
 	u64 base_addr;
 	u32 size;
 	u8 priority;
-	u8 bind_state;
 	s8 csi_index;
+	enum kbase_csf_queue_bind_state bind_state;
 	bool enabled;
 	u32 status_wait;
 	u64 sync_ptr;
 	u32 sync_value;
+	u32 sb_status;
+	u32 blocked_reason;
 	struct kbase_csf_notification error;
+	struct work_struct fatal_event_work;
+	u64 cs_fatal_info;
+	u32 cs_fatal;
 };

 /**
@@ -335,9 +387,9 @@ struct kbase_protected_suspend_buffer {
 *				buffer. Protected-mode suspend buffer that is
 *				used for group context switch.
 * @handle:         Handle which identifies this queue group.
- * @csg_nr:         Number/index of the command stream group to
- *                  which this queue group is mapped; KBASEP_CSG_NR_INVALID
- *                  indicates that the queue group is not scheduled.
+ * @csg_nr:         Number/index of the CSG to which this queue group is
+ *                  mapped; KBASEP_CSG_NR_INVALID indicates that the queue
+ *                  group is not scheduled.
 * @priority:       Priority of the queue group, 0 being the highest,
 *                  BASE_QUEUE_GROUP_PRIORITY_COUNT - 1 being the lowest.
 * @tiler_max:      Maximum number of tiler endpoints the group is allowed
@@ -349,18 +401,21 @@ struct kbase_protected_suspend_buffer {
 * @tiler_mask:     Mask of tiler endpoints the group is allowed to use.
 * @fragment_mask:  Mask of fragment endpoints the group is allowed to use.
 * @compute_mask:   Mask of compute endpoints the group is allowed to use.
+ * @group_uid:      32-bit wide unsigned identifier for the group, unique
+ *                  across all kbase devices and contexts.
 * @link:           Link to this queue group in the 'runnable_groups' list of
 *                  the corresponding kctx.
 * @link_to_schedule: Link to this queue group in the list of prepared groups
 *                    to be scheduled, if the group is runnable/suspended.
 *                    If the group is idle or waiting for CQS, it would be a
 *                    link to the list of idle/blocked groups list.
- * @timer_event_work: Work item corresponding to the event generated when a task
- *                    started by a queue in this group takes too long to execute
- *                    on an endpoint.
 * @run_state:      Current state of the queue group.
 * @prepared_seq_num: Indicates the position of queue group in the list of
 *                    prepared groups to be scheduled.
+ * @scan_seq_num:     Scan out sequence number before adjusting for dynamic
+ *                    idle conditions. It is used for setting a group's
+ *                    onslot priority. It could differ from prepared_seq_number
+ *                    when there are idle groups.
 * @faulted:          Indicates that a GPU fault occurred for the queue group.
 *                    This flag persists until the fault has been queued to be
 *                    reported to userspace.
@@ -369,7 +424,7 @@ struct kbase_protected_suspend_buffer {
 *                  group.
 * @protm_event_work:   Work item corresponding to the protected mode entry
 *                      event for this queue.
- * @protm_pending_bitmap:  Bit array to keep a track of command streams that
+ * @protm_pending_bitmap:  Bit array to keep a track of CSs that
 *                         have pending protected mode entry requests.
 * @error_fatal: An error of type BASE_GPU_QUEUE_GROUP_ERROR_FATAL to be
 *               returned to userspace if such an error has occurred.
@@ -377,6 +432,8 @@ struct kbase_protected_suspend_buffer {
 *                 to be returned to userspace if such an error has occurred.
 * @error_tiler_oom: An error of type BASE_GPU_QUEUE_GROUP_ERROR_TILER_HEAP_OOM
 *                   to be returned to userspace if such an error has occurred.
+ * @timer_event_work: Work item to handle the progress timeout fatal event
+ *                    for the group.
 */
 struct kbase_queue_group {
 	struct kbase_context *kctx;
@@ -394,11 +451,13 @@ struct kbase_queue_group {
 	u64 fragment_mask;
 	u64 compute_mask;

+	u32 group_uid;
+
 	struct list_head link;
 	struct list_head link_to_schedule;
-	struct work_struct timer_event_work;
 	enum kbase_csf_group_state run_state;
 	u32 prepared_seq_num;
+	u32 scan_seq_num;
 	bool faulted;

 	struct kbase_queue *bound_queues[MAX_SUPPORTED_STREAMS_PER_GROUP];
@@ -410,6 +469,8 @@ struct kbase_queue_group {
 	struct kbase_csf_notification error_fatal;
 	struct kbase_csf_notification error_timeout;
 	struct kbase_csf_notification error_tiler_oom;
+
+	struct work_struct timer_event_work;
 };

 /**
@@ -442,6 +503,22 @@ struct kbase_csf_kcpu_queue_context {
 	struct list_head jit_blocked_queues;
 };

+/**
+ * struct kbase_csf_cpu_queue_context - Object representing the cpu queue
+ *                                      information.
+ *
+ * @buffer:     Buffer containing CPU queue information provided by Userspace.
+ * @buffer_size: The size of @buffer.
+ * @dump_req_status:  Indicates the current status for CPU queues dump request.
+ * @dump_cmp:         Dumping cpu queue completion event.
+ */
+struct kbase_csf_cpu_queue_context {
+	char *buffer;
+	size_t buffer_size;
+	atomic_t dump_req_status;
+	struct completion dump_cmp;
+};
+
 /**
 * struct kbase_csf_heap_context_allocator - Allocator of heap contexts
 *
@@ -472,18 +549,21 @@ struct kbase_csf_heap_context_allocator {
 * struct kbase_csf_tiler_heap_context - Object representing the tiler heaps
 *                                       context for a GPU address space.
 *
- * This contains all of the command-stream front-end state relating to chunked
- * tiler heaps for one @kbase_context. It is not the same as a heap context
- * structure allocated by the kernel for use by the firmware.
+ * This contains all of the CSF state relating to chunked tiler heaps for one
+ * @kbase_context. It is not the same as a heap context structure allocated by
+ * the kernel for use by the firmware.
 *
- * @lock:      Lock preventing concurrent access to the tiler heaps.
- * @list:      List of tiler heaps.
- * @ctx_alloc: Allocator for heap context structures.
+ * @lock:        Lock preventing concurrent access to the tiler heaps.
+ * @list:        List of tiler heaps.
+ * @ctx_alloc:   Allocator for heap context structures.
+ * @nr_of_heaps: Total number of tiler heaps that were added during the
+ *               life time of the context.
 */
 struct kbase_csf_tiler_heap_context {
 	struct mutex lock;
 	struct list_head list;
 	struct kbase_csf_heap_context_allocator ctx_alloc;
+	u64 nr_of_heaps;
 };

 /**
@@ -491,7 +571,7 @@ struct kbase_csf_tiler_heap_context {
 *                                      context for a GPU address space.
 *
 * @runnable_groups:    Lists of runnable GPU command queue groups in the kctx,
- *                      one per queue group priority level.
+ *                      one per queue group  relative-priority level.
 * @num_runnable_grps:  Total number of runnable groups across all priority
 *                      levels in @runnable_groups.
 * @idle_wait_groups:   A list of GPU command queue groups in which all enabled
@@ -500,7 +580,7 @@ struct kbase_csf_tiler_heap_context {
 * @num_idle_wait_grps: Length of the @idle_wait_groups list.
 * @sync_update_wq:     Dedicated workqueue to process work items corresponding
 *                      to the sync_update events by sync_set/sync_add
- *                      instruction execution on command streams bound to groups
+ *                      instruction execution on CSs bound to groups
 *                      of @idle_wait_groups list.
 * @sync_update_work:   work item to process the sync_update events by
 *                      sync_set / sync_add instruction execution on command
@@ -509,7 +589,7 @@ struct kbase_csf_tiler_heap_context {
 *                      'groups_to_schedule' list of scheduler instance.
 */
 struct kbase_csf_scheduler_context {
-	struct list_head runnable_groups[BASE_QUEUE_GROUP_PRIORITY_COUNT];
+	struct list_head runnable_groups[KBASE_QUEUE_GROUP_PRIORITY_COUNT];
 	u32 num_runnable_grps;
 	struct list_head idle_wait_groups;
 	u32 num_idle_wait_grps;
@@ -519,8 +599,7 @@ struct kbase_csf_scheduler_context {
 };

 /**
- * struct kbase_csf_context - Object representing command-stream front-end
- *                            for a GPU address space.
+ * struct kbase_csf_context - Object representing CSF for a GPU address space.
 *
 * @event_pages_head: A list of pages allocated for the event memory used by
 *                    the synchronization objects. A separate list would help
@@ -534,7 +613,7 @@ struct kbase_csf_scheduler_context {
 *                    deferred manner of a pair of User mode input/output pages
 *                    & a hardware doorbell page.
 *                    The pages are allocated when a GPU command queue is
- *                    bound to a command stream group in kbase_csf_queue_bind.
+ *                    bound to a CSG in kbase_csf_queue_bind.
 *                    This helps returning unique handles to Userspace from
 *                    kbase_csf_queue_bind and later retrieving the pointer to
 *                    queue in the mmap handler.
@@ -550,7 +629,8 @@ struct kbase_csf_scheduler_context {
 *                    userspace mapping created for them on bind operation
 *                    hasn't been removed.
 * @kcpu_queues:      Kernel CPU command queues.
- * @event_lock:       Lock protecting access to @event_callback_list
+ * @event_lock:       Lock protecting access to @event_callback_list and
+ *                    @error_list.
 * @event_callback_list: List of callbacks which are registered to serve CSF
 *                       events.
 * @tiler_heaps:      Chunked tiler memory heaps.
@@ -563,10 +643,12 @@ struct kbase_csf_scheduler_context {
 *                    of the USER register page. Currently used only for sanity
 *                    checking.
 * @sched:            Object representing the scheduler's context
- * @error_list:       List for command stream fatal errors in this context.
+ * @error_list:       List for CS fatal errors in this context.
 *                    Link of fatal error is
 *                    &struct_kbase_csf_notification.link.
- *                    @lock needs to be held to access to this list.
+ *                    @event_lock needs to be held to access this list.
+ * @cpu_queue:        CPU queue information. Only be available when DEBUG_FS
+ *                    is enabled.
 */
 struct kbase_csf_context {
 	struct list_head event_pages_head;
@@ -585,6 +667,9 @@ struct kbase_csf_context {
 	struct vm_area_struct *user_reg_vma;
 	struct kbase_csf_scheduler_context sched;
 	struct list_head error_list;
+#ifdef CONFIG_DEBUG_FS
+	struct kbase_csf_cpu_queue_context cpu_queue;
+#endif
 };

 /**
@@ -593,23 +678,28 @@ struct kbase_csf_context {
 * @workq:         Workqueue to execute the GPU reset work item @work.
 * @work:          Work item for performing the GPU reset.
 * @wait:          Wait queue used to wait for the GPU reset completion.
+ * @sem:           RW Semaphore to ensure no other thread attempts to use the
+ *                 GPU whilst a reset is in process. Unlike traditional
+ *                 semaphores and wait queues, this allows Linux's lockdep
+ *                 mechanism to check for deadlocks involving reset waits.
 * @state:         Tracks if the GPU reset is in progress or not.
+ *                 The state is represented by enum @kbase_csf_reset_gpu_state.
 */
 struct kbase_csf_reset_gpu {
 	struct workqueue_struct *workq;
 	struct work_struct work;
 	wait_queue_head_t wait;
+	struct rw_semaphore sem;
 	atomic_t state;
 };

 /**
 * struct kbase_csf_csg_slot - Object containing members for tracking the state
- *                             of command stream group slots.
- * @resident_group:   pointer to the queue group that is resident on the
- *                    command stream group slot.
- * @state:            state of the slot as per enum kbase_csf_csg_slot_state.
+ *                             of CSG slots.
+ * @resident_group:   pointer to the queue group that is resident on the CSG slot.
+ * @state:            state of the slot as per enum @kbase_csf_csg_slot_state.
 * @trigger_jiffies:  value of jiffies when change in slot state is recorded.
- * @priority:         dynamic priority assigned to command stream group slot.
+ * @priority:         dynamic priority assigned to CSG slot.
 */
 struct kbase_csf_csg_slot {
 	struct kbase_queue_group *resident_group;
@@ -620,8 +710,7 @@ struct kbase_csf_csg_slot {

 /**
 * struct kbase_csf_scheduler - Object representing the scheduler used for
- *                              command-stream front-end for an instance of
- *                              GPU platform device.
+ *                              CSF for an instance of GPU platform device.
 * @lock:                  Lock to serialize the scheduler operations and
 *                         access to the data members.
 * @interrupt_lock:        Lock to protect members accessed by interrupt
@@ -632,26 +721,29 @@ struct kbase_csf_csg_slot {
 * @doorbell_inuse_bitmap: Bitmap of hardware doorbell pages keeping track of
 *                         which pages are currently available for assignment
 *                         to clients.
- * @csg_inuse_bitmap:      Bitmap to keep a track of command stream group slots
+ * @csg_inuse_bitmap:      Bitmap to keep a track of CSG slots
 *                         that are currently in use.
- * @csg_slots:             The array for tracking the state of command stream
+ * @csg_slots:             The array for tracking the state of CS
 *                         group slots.
 * @runnable_kctxs:        List of Kbase contexts that have runnable command
 *                         queue groups.
 * @groups_to_schedule:    List of runnable queue groups prepared on every
- *                         scheduler tick. The dynamic priority of the command
- *                         stream group slot assigned to a group will depend
- *                         upon the position of group in the list.
+ *                         scheduler tick. The dynamic priority of the CSG
+ *                         slot assigned to a group will depend upon the
+ *                         position of group in the list.
 * @ngrp_to_schedule:      Number of groups in the @groups_to_schedule list,
 *                         incremented when a group is added to the list, used
 *                         to record the position of group in the list.
 * @num_active_address_spaces: Number of GPU address space slots that would get
 *                             used to program the groups in @groups_to_schedule
- *                             list on all the available command stream group
+ *                             list on all the available CSG
 *                             slots.
- * @num_csg_slots_for_tick:  Number of command stream group slots that can be
+ * @num_csg_slots_for_tick:  Number of CSG slots that can be
 *                           active in the given tick/tock. This depends on the
 *                           value of @num_active_address_spaces.
+ * @remaining_tick_slots:    Tracking the number of remaining available slots
+ *                           for @num_csg_slots_for_tick during the scheduling
+ *                           operation in a tick/tock.
 * @idle_groups_to_schedule: List of runnable queue groups, in which all GPU
 *                           command queues became idle or are waiting for
 *                           synchronization object, prepared on every
@@ -659,11 +751,14 @@ struct kbase_csf_csg_slot {
 *                           appended to the tail of @groups_to_schedule list
 *                           after the scan out so that the idle groups aren't
 *                           preferred for scheduling over the non-idle ones.
+ * @csg_scan_count_for_tick: CSG scanout count for assign the scan_seq_num for
+ *                           each scanned out group during scheduling operation
+ *                           in a tick/tock.
 * @total_runnable_grps:     Total number of runnable groups across all KCTXs.
 * @csgs_events_enable_mask: Use for temporary masking off asynchronous events
 *                           from firmware (such as OoM events) before a group
 *                           is suspended.
- * @csg_slots_idle_mask:     Bit array for storing the mask of command stream
+ * @csg_slots_idle_mask:     Bit array for storing the mask of CS
 *                           group slots for which idle notification was
 *                           received.
 * @csg_slots_prio_update:  Bit array for tracking slots that have an on-slot
@@ -677,39 +772,53 @@ struct kbase_csf_csg_slot {
 *                          then it will only perform scheduling under the
 *                          influence of external factors e.g., IRQs, IOCTLs.
 * @wq:                     Dedicated workqueue to execute the @tick_work.
- * @tick_work:              Work item that would perform the schedule on tick
- *                          operation to implement the time slice based
- *                          scheduling.
+ * @tick_timer:             High-resolution timer employed to schedule tick
+ *                          workqueue items (kernel-provided delayed_work
+ *                          items do not use hrtimer and for some reason do
+ *                          not provide sufficiently reliable periodicity).
+ * @tick_work:              Work item that performs the "schedule on tick"
+ *                          operation to implement timeslice-based scheduling.
 * @tock_work:              Work item that would perform the schedule on tock
 *                          operation to implement the asynchronous scheduling.
 * @ping_work:              Work item that would ping the firmware at regular
- *                          intervals, only if there is a single active command
- *                          stream group slot, to check if firmware is alive
- *                          and would initiate a reset if the ping request
- *                          isn't acknowledged.
+ *                          intervals, only if there is a single active CSG
+ *                          slot, to check if firmware is alive and would
+ *                          initiate a reset if the ping request isn't
+ *                          acknowledged.
 * @top_ctx:                Pointer to the Kbase context corresponding to the
 *                          @top_grp.
 * @top_grp:                Pointer to queue group inside @groups_to_schedule
 *                          list that was assigned the highest slot priority.
- * @head_slot_priority:     The dynamic slot priority to be used for the
- *                          queue group at the head of @groups_to_schedule
- *                          list. Once the queue group is assigned a command
- *                          stream group slot, it is removed from the list and
- *                          priority is decremented.
 * @tock_pending_request:   A "tock" request is pending: a group that is not
 *                          currently on the GPU demands to be scheduled.
 * @active_protm_grp:       Indicates if firmware has been permitted to let GPU
 *                          enter protected mode with the given group. On exit
 *                          from protected mode the pointer is reset to NULL.
+ * @gpu_idle_fw_timer_enabled: Whether the CSF scheduler has activiated the
+ *                            firmware idle hysteresis timer for preparing a
+ *                            GPU suspend on idle.
 * @gpu_idle_work:          Work item for facilitating the scheduler to bring
 *                          the GPU to a low-power mode on becoming idle.
- * @non_idle_suspended_grps: Count of suspended queue groups not idle.
+ * @non_idle_offslot_grps:  Count of off-slot non-idle groups. Reset during
+ *                          the scheduler active phase in a tick. It then
+ *                          tracks the count of non-idle groups across all the
+ *                          other phases.
+ * @non_idle_scanout_grps:  Count on the non-idle groups in the scan-out
+ *                          list at the scheduling prepare stage.
 * @pm_active_count:        Count indicating if the scheduler is owning a power
 *                          management reference count. Reference is taken when
 *                          the count becomes 1 and is dropped when the count
 *                          becomes 0. It is used to enable the power up of MCU
 *                          after GPU and L2 cache have been powered up. So when
 *                          this count is zero, MCU will not be powered up.
+ * @csg_scheduling_period_ms: Duration of Scheduling tick in milliseconds.
+ * @tick_timer_active:      Indicates whether the @tick_timer is effectively
+ *                          active or not, as the callback function of
+ *                          @tick_timer will enqueue @tick_work only if this
+ *                          flag is true. This is mainly useful for the case
+ *                          when scheduling tick needs to be advanced from
+ *                          interrupt context, without actually deactivating
+ *                          the @tick_timer first and then enqueing @tick_work.
 */
 struct kbase_csf_scheduler {
 	struct mutex lock;
@@ -723,7 +832,9 @@ struct kbase_csf_scheduler {
 	u32 ngrp_to_schedule;
 	u32 num_active_address_spaces;
 	u32 num_csg_slots_for_tick;
+	u32 remaining_tick_slots;
 	struct list_head idle_groups_to_schedule;
+	u32 csg_scan_count_for_tick;
 	u32 total_runnable_grps;
 	DECLARE_BITMAP(csgs_events_enable_mask, MAX_SUPPORTED_CSGS);
 	DECLARE_BITMAP(csg_slots_idle_mask, MAX_SUPPORTED_CSGS);
@@ -731,17 +842,21 @@ struct kbase_csf_scheduler {
 	unsigned long last_schedule;
 	bool timer_enabled;
 	struct workqueue_struct *wq;
-	struct delayed_work tick_work;
+	struct hrtimer tick_timer;
+	struct work_struct tick_work;
 	struct delayed_work tock_work;
 	struct delayed_work ping_work;
 	struct kbase_context *top_ctx;
 	struct kbase_queue_group *top_grp;
-	u8 head_slot_priority;
 	bool tock_pending_request;
 	struct kbase_queue_group *active_protm_grp;
-	struct delayed_work gpu_idle_work;
-	atomic_t non_idle_suspended_grps;
+	bool gpu_idle_fw_timer_enabled;
+	struct work_struct gpu_idle_work;
+	atomic_t non_idle_offslot_grps;
+	u32 non_idle_scanout_grps;
 	u32 pm_active_count;
+	unsigned int csg_scheduling_period_ms;
+	bool tick_timer_active;
 };

 /**
@@ -758,8 +873,205 @@ struct kbase_csf_scheduler {
 	GLB_PROGRESS_TIMER_TIMEOUT_SCALE)

 /**
- * struct kbase_csf      -  Object representing command-stream front-end for an
- *                          instance of GPU platform device.
+ * Default GLB_PWROFF_TIMER_TIMEOUT value in unit of micro-seconds.
+ */
+#define DEFAULT_GLB_PWROFF_TIMEOUT_US (800)
+
+/**
+ * In typical operations, the management of the shader core power transitions
+ * is delegated to the MCU/firmware. However, if the host driver is configured
+ * to take direct control, one needs to disable the MCU firmware GLB_PWROFF
+ * timer.
+ */
+#define DISABLE_GLB_PWROFF_TIMER (0)
+
+/* Index of the GPU_ACTIVE counter within the CSHW counter block */
+#define GPU_ACTIVE_CNT_IDX (4)
+
+/**
+ * Maximum number of sessions that can be managed by the IPA Control component.
+ */
+#if MALI_UNIT_TEST
+#define KBASE_IPA_CONTROL_MAX_SESSIONS ((size_t)8)
+#else
+#define KBASE_IPA_CONTROL_MAX_SESSIONS ((size_t)2)
+#endif
+
+/**
+ * enum kbase_ipa_core_type - Type of counter block for performance counters
+ *
+ * @KBASE_IPA_CORE_TYPE_CSHW:   CS Hardware counters.
+ * @KBASE_IPA_CORE_TYPE_MEMSYS: Memory System counters.
+ * @KBASE_IPA_CORE_TYPE_TILER:  Tiler counters.
+ * @KBASE_IPA_CORE_TYPE_SHADER: Shader Core counters.
+ * @KBASE_IPA_CORE_TYPE_NUM:    Number of core types.
+ */
+enum kbase_ipa_core_type {
+	KBASE_IPA_CORE_TYPE_CSHW = 0,
+	KBASE_IPA_CORE_TYPE_MEMSYS,
+	KBASE_IPA_CORE_TYPE_TILER,
+	KBASE_IPA_CORE_TYPE_SHADER,
+	KBASE_IPA_CORE_TYPE_NUM
+};
+
+/**
+ * Number of configurable counters per type of block on the IPA Control
+ * interface.
+ */
+#define KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS ((size_t)8)
+
+/**
+ * Total number of configurable counters existing on the IPA Control interface.
+ */
+#define KBASE_IPA_CONTROL_MAX_COUNTERS                                         \
+	((size_t)KBASE_IPA_CORE_TYPE_NUM * KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS)
+
+/**
+ * struct kbase_ipa_control_prfcnt - Session for a single performance counter
+ *
+ * @latest_raw_value: Latest raw value read from the counter.
+ * @scaling_factor:   Factor raw value shall be multiplied by.
+ * @accumulated_diff: Partial sum of scaled and normalized values from
+ *                    previous samples. This represent all the values
+ *                    that were read before the latest raw value.
+ * @type:             Type of counter block for performance counter.
+ * @select_idx:       Index of the performance counter as configured on
+ *                    the IPA Control interface.
+ * @gpu_norm:         Indicating whether values shall be normalized by
+ *                    GPU frequency. If true, returned values represent
+ *                    an interval of time expressed in seconds (when the
+ *                    scaling factor is set to 1).
+ */
+struct kbase_ipa_control_prfcnt {
+	u64 latest_raw_value;
+	u64 scaling_factor;
+	u64 accumulated_diff;
+	enum kbase_ipa_core_type type;
+	u8 select_idx;
+	bool gpu_norm;
+};
+
+/**
+ * struct kbase_ipa_control_session - Session for an IPA Control client
+ *
+ * @prfcnts:        Sessions for individual performance counters.
+ * @num_prfcnts:    Number of performance counters.
+ * @active:         Indicates whether this slot is in use or not
+ * @last_query_time:     Time of last query, in ns
+ * @protm_time:     Amount of time (in ns) that GPU has been in protected
+ */
+struct kbase_ipa_control_session {
+	struct kbase_ipa_control_prfcnt prfcnts[KBASE_IPA_CONTROL_MAX_COUNTERS];
+	size_t num_prfcnts;
+	bool active;
+	u64 last_query_time;
+	u64 protm_time;
+};
+
+/**
+ * struct kbase_ipa_control_prfcnt_config - Performance counter configuration
+ *
+ * @idx:      Index of the performance counter inside the block, as specified
+ *            in the GPU architecture.
+ * @refcount: Number of client sessions bound to this counter.
+ *
+ * This structure represents one configurable performance counter of
+ * the IPA Control interface. The entry may be mapped to a specific counter
+ * by one or more client sessions. The counter is considered to be unused
+ * if it isn't part of any client session.
+ */
+struct kbase_ipa_control_prfcnt_config {
+	u8 idx;
+	u8 refcount;
+};
+
+/**
+ * struct kbase_ipa_control_prfcnt_block - Block of performance counters
+ *
+ * @select:                 Current performance counter configuration.
+ * @num_available_counters: Number of counters that are not already configured.
+ *
+ */
+struct kbase_ipa_control_prfcnt_block {
+	struct kbase_ipa_control_prfcnt_config
+		select[KBASE_IPA_CONTROL_NUM_BLOCK_COUNTERS];
+	size_t num_available_counters;
+};
+
+/**
+ * struct kbase_ipa_control - Manager of the IPA Control interface.
+ *
+ * @blocks:              Current configuration of performance counters
+ *                       for the IPA Control interface.
+ * @sessions:            State of client sessions, storing information
+ *                       like performance counters the client subscribed to
+ *                       and latest value read from each counter.
+ * @lock:                Spinlock to serialize access by concurrent clients.
+ * @rtm_listener_data:   Private data for allocating a GPU frequency change
+ *                       listener.
+ * @num_active_sessions: Number of sessions opened by clients.
+ * @cur_gpu_rate:        Current GPU top-level operating frequency, in Hz.
+ * @rtm_listener_data:   Private data for allocating a GPU frequency change
+ *                       listener.
+ * @protm_start:         Time (in ns) at which the GPU entered protected mode
+ */
+struct kbase_ipa_control {
+	struct kbase_ipa_control_prfcnt_block blocks[KBASE_IPA_CORE_TYPE_NUM];
+	struct kbase_ipa_control_session
+		sessions[KBASE_IPA_CONTROL_MAX_SESSIONS];
+	spinlock_t lock;
+	void *rtm_listener_data;
+	size_t num_active_sessions;
+	u32 cur_gpu_rate;
+	u64 protm_start;
+};
+
+/**
+ * struct kbase_csf_firmware_interface - Interface in the MCU firmware
+ *
+ * @node:  Interface objects are on the kbase_device:csf.firmware_interfaces
+ *         list using this list_head to link them
+ * @phys:  Array of the physical (tagged) addresses making up this interface
+ * @name:  NULL-terminated string naming the interface
+ * @num_pages: Number of entries in @phys and @pma (and length of the interface)
+ * @virtual: Starting GPU virtual address this interface is mapped at
+ * @flags: bitmask of CSF_FIRMWARE_ENTRY_* conveying the interface attributes
+ * @data_start: Offset into firmware image at which the interface data starts
+ * @data_end: Offset into firmware image at which the interface data ends
+ * @kernel_map: A kernel mapping of the memory or NULL if not required to be
+ *              mapped in the kernel
+ * @pma: Array of pointers to protected memory allocations.
+ */
+struct kbase_csf_firmware_interface {
+	struct list_head node;
+	struct tagged_addr *phys;
+	char *name;
+	u32 num_pages;
+	u32 virtual;
+	u32 flags;
+	u32 data_start;
+	u32 data_end;
+	void *kernel_map;
+	struct protected_memory_allocation **pma;
+};
+
+/*
+ * struct kbase_csf_hwcnt - Object containing members for handling the dump of
+ *                          HW counters.
+ *
+ * @request_pending:        Flag set when HWC requested and used for HWC sample
+ *                          done interrupt.
+ * @enable_pending:         Flag set when HWC enable status change and used for
+ *                          enable done interrupt.
+ */
+struct kbase_csf_hwcnt {
+	bool request_pending;
+	bool enable_pending;
+};
+
+/**
+ * struct kbase_csf_device - Object representing CSF for an instance of GPU
+ *                           platform device.
 *
 * @mcu_mmu:                MMU page tables for the MCU firmware
 * @firmware_interfaces:    List of interfaces defined in the firmware image
@@ -794,6 +1106,17 @@ struct kbase_csf_scheduler {
 *                          of the real Hw doorbell page for the active GPU
 *                          command queues after they are stopped or after the
 *                          GPU is powered down.
+ * @dummy_user_reg_page:    Address of the dummy page that is mapped in place
+ *                          of the real User register page just before the GPU
+ *                          is powered down. The User register page is mapped
+ *                          in the address space of every process, that created
+ *                          a Base context, to enable the access to LATEST_FLUSH
+ *                          register from userspace.
+ * @mali_file_inode:        Pointer to the inode corresponding to mali device
+ *                          file. This is needed in order to switch to the
+ *                          @dummy_user_reg_page on GPU power down.
+ *                          All instances of the mali device file will point to
+ *                          the same inode.
 * @reg_lock:               Lock to serialize the MCU firmware related actions
 *                          that affect all contexts such as allocation of
 *                          regions from shared interface area, assignment of
@@ -806,7 +1129,7 @@ struct kbase_csf_scheduler {
 * @global_iface:           The result of parsing the global interface
 *                          structure set up by the firmware, including the
 *                          CSGs, CSs, and their properties
- * @scheduler:              The command stream scheduler instance.
+ * @scheduler:              The CS scheduler instance.
 * @reset:                  Contain members required for GPU reset handling.
 * @progress_timeout:       Maximum number of GPU clock cycles without forward
 *                          progress to allow, for all tasks running on
@@ -820,11 +1143,39 @@ struct kbase_csf_scheduler {
 *                          in GPU reset has completed.
 * @firmware_reload_needed: Flag for indicating that the firmware needs to be
 *                          reloaded as part of the GPU reset action.
+ * @firmware_hctl_core_pwr: Flag for indicating that the host diver is in
+ *                          charge of the shader core's power transitions, and
+ *                          the mcu_core_pwroff timeout feature is disabled
+ *                          (i.e. configured 0 in the register field). If
+ *                          false, the control is delegated to the MCU.
 * @firmware_reload_work:   Work item for facilitating the procedural actions
 *                          on reloading the firmware.
 * @glb_init_request_pending: Flag to indicate that Global requests have been
 *                            sent to the FW after MCU was re-enabled and their
 *                            acknowledgement is pending.
+ * @fw_error_work:          Work item for handling the firmware internal error
+ *                          fatal event.
+ * @ipa_control:            IPA Control component manager.
+ * @mcu_core_pwroff_dur_us: Sysfs attribute for the glb_pwroff timeout input
+ *                          in unit of micro-seconds. The firmware does not use
+ *                          it directly.
+ * @mcu_core_pwroff_dur_count: The counterpart of the glb_pwroff timeout input
+ *                             in interface required format, ready to be used
+ *                             directly in the firmware.
+ * @mcu_core_pwroff_reg_shadow: The actual value that has been programed into
+ *                              the glb_pwoff register. This is separated from
+ *                              the @p mcu_core_pwroff_dur_count as an update
+ *                              to the latter is asynchronous.
+ * @gpu_idle_hysteresis_ms: Sysfs attribute for the idle hysteresis time
+ *                          window in unit of ms. The firmware does not use it
+ *                          directly.
+ * @gpu_idle_dur_count:     The counterpart of the hysteresis time window in
+ *                          interface required format, ready to be used
+ *                          directly in the firmware.
+ * @fw_timeout_ms:          Timeout value (in milliseconds) used when waiting
+ *                          for any request sent to the firmware.
+ * @hwcnt:                  Contain members required for handling the dump of
+ *                          HW counters.
 */
 struct kbase_csf_device {
 	struct kbase_mmu_table mcu_mmu;
@@ -838,6 +1189,8 @@ struct kbase_csf_device {
 	struct file *db_filp;
 	u32 db_file_offsets;
 	struct tagged_addr dummy_db_page;
+	struct tagged_addr dummy_user_reg_page;
+	struct inode *mali_file_inode;
 	struct mutex reg_lock;
 	wait_queue_head_t event_wait;
 	bool interrupt_received;
@@ -849,8 +1202,18 @@ struct kbase_csf_device {
 	bool firmware_inited;
 	bool firmware_reloaded;
 	bool firmware_reload_needed;
+	bool firmware_hctl_core_pwr;
 	struct work_struct firmware_reload_work;
 	bool glb_init_request_pending;
+	struct work_struct fw_error_work;
+	struct kbase_ipa_control ipa_control;
+	u32 mcu_core_pwroff_dur_us;
+	u32 mcu_core_pwroff_dur_count;
+	u32 mcu_core_pwroff_reg_shadow;
+	u32 gpu_idle_hysteresis_ms;
+	u32 gpu_idle_dur_count;
+	unsigned int fw_timeout_ms;
+	struct kbase_csf_hwcnt hwcnt;
 };

 /**
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware.c
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware.c
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware.h
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2018-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,15 +17,13 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #ifndef _KBASE_CSF_FIRMWARE_H_
 #define _KBASE_CSF_FIRMWARE_H_

 #include "device/mali_kbase_device.h"
-#include "mali_gpu_csf_registers.h"
+#include <uapi/gpu/arm/bifrost/csf/mali_gpu_csf_registers.h>

 /*
 * PAGE_KERNEL_RO was only defined on 32bit ARM in 4.19 in:
@@ -71,14 +70,17 @@
 /* All implementations of the host interface with major version 0 must comply
 * with these restrictions:
 */
-/* GLB_GROUP_NUM: At least 3 command stream groups, but no more than 31 */
+/* GLB_GROUP_NUM: At least 3 CSGs, but no more than 31 */
 #define MIN_SUPPORTED_CSGS 3
 #define MAX_SUPPORTED_CSGS 31
-/* GROUP_STREAM_NUM: At least 8 command streams per CSG, but no more than 32 */
+/* GROUP_STREAM_NUM: At least 8 CSs per CSG, but no more than 32 */
 #define MIN_SUPPORTED_STREAMS_PER_GROUP 8
-/* Maximum command streams per csg. */
+/* Maximum CSs per csg. */
 #define MAX_SUPPORTED_STREAMS_PER_GROUP 32

+/* Waiting timeout for status change acknowledgment, in milliseconds */
+#define CSF_FIRMWARE_TIMEOUT_MS (800) /* Relaxed to 800ms from 100ms */
+
 struct kbase_device;


@@ -111,16 +113,15 @@ struct kbase_csf_trace_buffers {
 };

 /**
- * struct kbase_csf_cmd_stream_info - Command stream interface provided by the
- *                                    firmware.
+ * struct kbase_csf_cmd_stream_info - CSI provided by the firmware.
 *
 * @kbdev: Address of the instance of a GPU platform device that implements
 *         this interface.
- * @features: Bit field of command stream features (e.g. which types of jobs
+ * @features: Bit field of CS features (e.g. which types of jobs
 *            are supported). Bits 7:0 specify the number of work registers(-1).
 *            Bits 11:8 specify the number of scoreboard entries(-1).
- * @input: Address of command stream interface input page.
- * @output: Address of command stream interface output page.
+ * @input: Address of CSI input page.
+ * @output: Address of CSI output page.
 */
 struct kbase_csf_cmd_stream_info {
 	struct kbase_device *kbdev;
@@ -130,9 +131,9 @@ struct kbase_csf_cmd_stream_info {
 };

 /**
- * kbase_csf_firmware_cs_input() - Set a word in a command stream's input page
+ * kbase_csf_firmware_cs_input() - Set a word in a CS's input page
 *
- * @info: Command stream interface provided by the firmware.
+ * @info: CSI provided by the firmware.
 * @offset: Offset of the word to be written, in bytes.
 * @value: Value to be written.
 */
@@ -140,22 +141,20 @@ void kbase_csf_firmware_cs_input(
 	const struct kbase_csf_cmd_stream_info *info, u32 offset, u32 value);

 /**
- * kbase_csf_firmware_cs_input_read() - Read a word in a command stream's input
- *                                      page
+ * kbase_csf_firmware_cs_input_read() - Read a word in a CS's input page
 *
- * Return: Value of the word read from the command stream's input page.
+ * Return: Value of the word read from the CS's input page.
 *
- * @info: Command stream interface provided by the firmware.
+ * @info: CSI provided by the firmware.
 * @offset: Offset of the word to be read, in bytes.
 */
 u32 kbase_csf_firmware_cs_input_read(
 	const struct kbase_csf_cmd_stream_info *const info, const u32 offset);

 /**
- * kbase_csf_firmware_cs_input_mask() - Set part of a word in a command stream's
- *                                      input page
+ * kbase_csf_firmware_cs_input_mask() - Set part of a word in a CS's input page
 *
- * @info: Command stream interface provided by the firmware.
+ * @info: CSI provided by the firmware.
 * @offset: Offset of the word to be modified, in bytes.
 * @value: Value to be written.
 * @mask: Bitmask with the bits to be modified set.
@@ -165,19 +164,18 @@ void kbase_csf_firmware_cs_input_mask(
 	u32 value, u32 mask);

 /**
- * kbase_csf_firmware_cs_output() - Read a word in a command stream's output
- *                                  page
+ * kbase_csf_firmware_cs_output() - Read a word in a CS's output page
 *
- * Return: Value of the word read from the command stream's output page.
+ * Return: Value of the word read from the CS's output page.
 *
- * @info: Command stream interface provided by the firmware.
+ * @info: CSI provided by the firmware.
 * @offset: Offset of the word to be read, in bytes.
 */
 u32 kbase_csf_firmware_cs_output(
 	const struct kbase_csf_cmd_stream_info *info, u32 offset);
 /**
- * struct kbase_csf_cmd_stream_group_info - Command stream group interface
- *                                          provided by the firmware.
+ * struct kbase_csf_cmd_stream_group_info - CSG interface provided by the
+ *                                          firmware.
 *
 * @kbdev: Address of the instance of a GPU platform device that implements
 *         this interface.
@@ -185,14 +183,13 @@ u32 kbase_csf_firmware_cs_output(
 *            be ignored.
 * @input: Address of global interface input page.
 * @output: Address of global interface output page.
- * @suspend_size: Size in bytes for normal suspend buffer for the command
- *                stream group.
+ * @suspend_size: Size in bytes for normal suspend buffer for the CSG
 * @protm_suspend_size: Size in bytes for protected mode suspend buffer
- *                      for the command stream group.
- * @stream_num: Number of command streams in the command stream group.
+ *                      for the CSG.
+ * @stream_num: Number of CSs in the CSG.
 * @stream_stride: Stride in bytes in JASID0 virtual address between
- *                 command stream capability structures.
- * @streams: Address of an array of command stream capability structures.
+ *                 CS capability structures.
+ * @streams: Address of an array of CS capability structures.
 */
 struct kbase_csf_cmd_stream_group_info {
 	struct kbase_device *kbdev;
@@ -207,10 +204,9 @@ struct kbase_csf_cmd_stream_group_info {
 };

 /**
- * kbase_csf_firmware_csg_input() - Set a word in a command stream group's
- *                                  input page
+ * kbase_csf_firmware_csg_input() - Set a word in a CSG's input page
 *
- * @info: Command stream group interface provided by the firmware.
+ * @info: CSG interface provided by the firmware.
 * @offset: Offset of the word to be written, in bytes.
 * @value: Value to be written.
 */
@@ -219,22 +215,21 @@ void kbase_csf_firmware_csg_input(
 	u32 value);

 /**
- * kbase_csf_firmware_csg_input_read() - Read a word in a command stream group's
- *                                       input page
+ * kbase_csf_firmware_csg_input_read() - Read a word in a CSG's input page
 *
- * Return: Value of the word read from the command stream group's input page.
+ * Return: Value of the word read from the CSG's input page.
 *
- * @info: Command stream group interface provided by the firmware.
+ * @info: CSG interface provided by the firmware.
 * @offset: Offset of the word to be read, in bytes.
 */
 u32 kbase_csf_firmware_csg_input_read(
 	const struct kbase_csf_cmd_stream_group_info *info, u32 offset);

 /**
- * kbase_csf_firmware_csg_input_mask() - Set part of a word in a command stream
- *                                       group's input page
+ * kbase_csf_firmware_csg_input_mask() - Set part of a word in a CSG's
+ *                                       input page
 *
- * @info: Command stream group interface provided by the firmware.
+ * @info: CSG interface provided by the firmware.
 * @offset: Offset of the word to be modified, in bytes.
 * @value: Value to be written.
 * @mask: Bitmask with the bits to be modified set.
@@ -244,19 +239,18 @@ void kbase_csf_firmware_csg_input_mask(
 	u32 value, u32 mask);

 /**
- * kbase_csf_firmware_csg_output()- Read a word in a command stream group's
- *                                  output page
+ * kbase_csf_firmware_csg_output()- Read a word in a CSG's output page
 *
- * Return: Value of the word read from the command stream group's output page.
+ * Return: Value of the word read from the CSG's output page.
 *
- * @info: Command stream group interface provided by the firmware.
+ * @info: CSG interface provided by the firmware.
 * @offset: Offset of the word to be read, in bytes.
 */
 u32 kbase_csf_firmware_csg_output(
 	const struct kbase_csf_cmd_stream_group_info *info, u32 offset);

 /**
- * struct kbase_csf_global_iface - Global command stream front-end interface
+ * struct kbase_csf_global_iface - Global CSF interface
 *                                 provided by the firmware.
 *
 * @kbdev: Address of the instance of a GPU platform device that implements
@@ -268,11 +262,12 @@ u32 kbase_csf_firmware_csg_output(
 *            be suspended). Reserved bits should be 0, and should be ignored.
 * @input: Address of global interface input page.
 * @output: Address of global interface output page.
- * @group_num: Number of command stream groups supported.
+ * @group_num: Number of CSGs supported.
 * @group_stride: Stride in bytes in JASID0 virtual address between
- *                command stream group capability structures.
+ *                CSG capability structures.
 * @prfcnt_size: Performance counters size.
- * @groups: Address of an array of command stream group capability structures.
+ * @instr_features: Instrumentation features.
+ * @groups: Address of an array of CSG capability structures.
 */
 struct kbase_csf_global_iface {
 	struct kbase_device *kbdev;
@@ -283,13 +278,14 @@ struct kbase_csf_global_iface {
 	u32 group_num;
 	u32 group_stride;
 	u32 prfcnt_size;
+	u32 instr_features;
 	struct kbase_csf_cmd_stream_group_info *groups;
 };

 /**
 * kbase_csf_firmware_global_input() - Set a word in the global input page
 *
- * @iface: Command stream front-end interface provided by the firmware.
+ * @iface: CSF interface provided by the firmware.
 * @offset: Offset of the word to be written, in bytes.
 * @value: Value to be written.
 */
@@ -300,7 +296,7 @@ void kbase_csf_firmware_global_input(
 * kbase_csf_firmware_global_input_mask() - Set part of a word in the global
 *                                          input page
 *
- * @iface: Command stream front-end interface provided by the firmware.
+ * @iface: CSF interface provided by the firmware.
 * @offset: Offset of the word to be modified, in bytes.
 * @value: Value to be written.
 * @mask: Bitmask with the bits to be modified set.
@@ -314,7 +310,7 @@ void kbase_csf_firmware_global_input_mask(
 *
 * Return: Value of the word read from the global input page.
 *
- * @info: Command stream group interface provided by the firmware.
+ * @info: CSG interface provided by the firmware.
 * @offset: Offset of the word to be read, in bytes.
 */
 u32 kbase_csf_firmware_global_input_read(
@@ -325,7 +321,7 @@ u32 kbase_csf_firmware_global_input_read(
 *
 * Return: Value of the word read from the global output page.
 *
- * @iface: Command stream front-end interface provided by the firmware.
+ * @iface: CSF interface provided by the firmware.
 * @offset: Offset of the word to be read, in bytes.
 */
 u32 kbase_csf_firmware_global_output(
@@ -403,20 +399,28 @@ void kbase_csf_firmware_term(struct kbase_device *kbdev);
 /**
 * kbase_csf_firmware_ping - Send the ping request to firmware.
 *
- * The function sends the ping request to firmware to confirm it is alive.
+ * The function sends the ping request to firmware.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ */
+void kbase_csf_firmware_ping(struct kbase_device *kbdev);
+
+/**
+ * kbase_csf_firmware_ping_wait - Send the ping request to firmware and waits.
+ *
+ * The function sends the ping request to firmware and waits to confirm it is
+ * alive.
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 *
 * Return: 0 on success, or negative on failure.
 */
-int kbase_csf_firmware_ping(struct kbase_device *kbdev);
+int kbase_csf_firmware_ping_wait(struct kbase_device *kbdev);

 /**
 * kbase_csf_firmware_set_timeout - Set a hardware endpoint progress timeout.
 *
- * @kbdev:   Instance of a GPU platform device that implements a command
- *           stream front-end interface.
+ * @kbdev:   Instance of a GPU platform device that implements a CSF interface.
 * @timeout: The maximum number of GPU cycles that is allowed to elapse
 *           without forward progress before the driver terminates a GPU
 *           command queue group.
@@ -433,8 +437,7 @@ int kbase_csf_firmware_set_timeout(struct kbase_device *kbdev, u64 timeout);
 *                                  enter protected mode and wait for its
 *                                  completion.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 void kbase_csf_enter_protected_mode(struct kbase_device *kbdev);

@@ -454,16 +457,14 @@ static inline bool kbase_csf_firmware_mcu_halted(struct kbase_device *kbdev)
 *                                       into a known internal state for warm
 *                                       boot later.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 void kbase_csf_firmware_trigger_mcu_halt(struct kbase_device *kbdev);

 /**
 * kbase_csf_firmware_enable_mcu - Send the command to enable MCU
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 static inline void kbase_csf_firmware_enable_mcu(struct kbase_device *kbdev)
 {
@@ -477,8 +478,7 @@ static inline void kbase_csf_firmware_enable_mcu(struct kbase_device *kbdev)
 /**
 * kbase_csf_firmware_disable_mcu - Send the command to disable MCU
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 static inline void kbase_csf_firmware_disable_mcu(struct kbase_device *kbdev)
 {
@@ -489,8 +489,7 @@ static inline void kbase_csf_firmware_disable_mcu(struct kbase_device *kbdev)
 * kbase_csf_firmware_disable_mcu_wait - Wait for the MCU to reach disabled
 *                                       status.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 void kbase_csf_firmware_disable_mcu_wait(struct kbase_device *kbdev);

@@ -499,8 +498,7 @@ void kbase_csf_firmware_disable_mcu_wait(struct kbase_device *kbdev);
 *                                 cold boot case firmware image would be
 *                                 reloaded from filesystem into memory.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 void kbase_csf_firmware_trigger_reload(struct kbase_device *kbdev);

@@ -508,8 +506,7 @@ void kbase_csf_firmware_trigger_reload(struct kbase_device *kbdev);
 * kbase_csf_firmware_reload_completed - The reboot of MCU firmware has
 *                                       completed.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 void kbase_csf_firmware_reload_completed(struct kbase_device *kbdev);

@@ -517,10 +514,11 @@ void kbase_csf_firmware_reload_completed(struct kbase_device *kbdev);
 * kbase_csf_firmware_global_reinit - Send the Global configuration requests
 *                                    after the reboot of MCU firmware.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ * @core_mask: Mask of the enabled shader cores.
 */
-void kbase_csf_firmware_global_reinit(struct kbase_device *kbdev);
+void kbase_csf_firmware_global_reinit(struct kbase_device *kbdev,
+				      u64 core_mask);

 /**
 * kbase_csf_firmware_global_reinit_complete - Check the Global configuration
@@ -529,45 +527,69 @@ void kbase_csf_firmware_global_reinit(struct kbase_device *kbdev);
 *
 * Return: true if the Global configuration requests completed otherwise false.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 */
 bool kbase_csf_firmware_global_reinit_complete(struct kbase_device *kbdev);

+/**
+ * kbase_csf_firmware_update_core_attr - Send the Global configuration request
+ *                                       to update the requested core attribute
+ *                                       changes.
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ * @update_core_pwroff_timer: If true, signal the firmware needs to update
+ *                            the MCU power-off timer value.
+ * @update_core_mask:         If true, need to do the core_mask update with
+ *                            the supplied core_mask value.
+ * @core_mask:                New core mask value if update_core_mask is true,
+ *                            otherwise unused.
+ */
+void kbase_csf_firmware_update_core_attr(struct kbase_device *kbdev,
+		bool update_core_pwroff_timer, bool update_core_mask, u64 core_mask);
+
+/**
+ * kbase_csf_firmware_core_attr_updated - Check the Global configuration
+ *                  request has completed or not, that was sent to update
+ *                  the core attributes.
+ *
+ * Return: true if the Global configuration request to update the core
+ *         attributes has completed, otherwise false.
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ */
+bool kbase_csf_firmware_core_attr_updated(struct kbase_device *kbdev);
+
 /**
 * Request the global control block of CSF interface capabilities
 *
- * Return: Total number of command streams, summed across all groups.
+ * Return: Total number of CSs, summed across all groups.
 *
 * @kbdev:                 Kbase device.
 * @group_data:            Pointer where to store all the group data
 *                         (sequentially).
 * @max_group_num:         The maximum number of groups to be read.
 *                         Can be 0, in which case group_data is unused.
- * @stream_data:           Pointer where to store all the stream data
+ * @stream_data:           Pointer where to store all the CS data
 *                         (sequentially).
- * @max_total_stream_num:  The maximum number of streams to be read.
+ * @max_total_stream_num:  The maximum number of CSs to be read.
 *                         Can be 0, in which case stream_data is unused.
 * @glb_version:           Where to store the global interface version.
- *                         Bits 31:16 hold the major version number and
- *                         15:0 hold the minor version number.
- *                         A higher minor version is backwards-compatible
- *                         with a lower minor version for the same major
- *                         version.
 * @features:              Where to store a bit mask of features (e.g.
 *                         whether certain types of job can be suspended).
- * @group_num:             Where to store the number of command stream groups
+ * @group_num:             Where to store the number of CSGs
 *                         supported.
 * @prfcnt_size:           Where to store the size of CSF performance counters,
 *                         in bytes. Bits 31:16 hold the size of firmware
 *                         performance counter data and 15:0 hold the size of
 *                         hardware performance counter data.
+ * @instr_features:        Instrumentation features. Bits 7:4 hold the max size
+ *                         of events. Bits 3:0 hold the offset update rate.
 */
-u32 kbase_csf_firmware_get_glb_iface(struct kbase_device *kbdev,
-	struct basep_cs_group_control *group_data, u32 max_group_num,
-	struct basep_cs_stream_control *stream_data, u32 max_total_stream_num,
-	u32 *glb_version, u32 *features, u32 *group_num, u32 *prfcnt_size);
-
+u32 kbase_csf_firmware_get_glb_iface(
+	struct kbase_device *kbdev, struct basep_cs_group_control *group_data,
+	u32 max_group_num, struct basep_cs_stream_control *stream_data,
+	u32 max_total_stream_num, u32 *glb_version, u32 *features,
+	u32 *group_num, u32 *prfcnt_size, u32 *instr_features);

 /**
 * Get CSF firmware header timeline metadata content
@@ -660,4 +682,125 @@ static inline long kbase_csf_timeout_in_jiffies(const unsigned int msecs)
 #endif
 }

+/**
+ * kbase_csf_firmware_enable_gpu_idle_timer() - Activate the idle hysteresis
+ *                                              monitoring operation
+ *
+ * Program the firmware interface with its configured hysteresis count value
+ * and enable the firmware to act on it. The Caller is
+ * assumed to hold the kbdev->csf.scheduler.interrupt_lock.
+ *
+ * @kbdev: Kbase device structure
+ */
+void kbase_csf_firmware_enable_gpu_idle_timer(struct kbase_device *kbdev);
+
+/**
+ * kbase_csf_firmware_disable_gpu_idle_timer() - Disable the idle time
+ *                                             hysteresis monitoring operation
+ *
+ * Program the firmware interface to disable the idle hysteresis timer. The
+ * Caller is assumed to hold the kbdev->csf.scheduler.interrupt_lock.
+ *
+ * @kbdev: Kbase device structure
+ */
+void kbase_csf_firmware_disable_gpu_idle_timer(struct kbase_device *kbdev);
+
+/**
+ * kbase_csf_firmware_get_gpu_idle_hysteresis_time - Get the firmware GPU idle
+ *                                               detection hysteresis duration
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ *
+ * Return: the internally recorded hysteresis (nominal) value.
+ */
+u32 kbase_csf_firmware_get_gpu_idle_hysteresis_time(struct kbase_device *kbdev);
+
+/**
+ * kbase_csf_firmware_set_gpu_idle_hysteresis_time - Set the firmware GPU idle
+ *                                               detection hysteresis duration
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ * @dur:     The duration value (unit: milliseconds) for the configuring
+ *           hysteresis field for GPU idle detection
+ *
+ * The supplied value will be recorded internally without any change. But the
+ * actual field value will be subject to hysteresis source frequency scaling
+ * and maximum value limiting. The default source will be SYSTEM_TIMESTAMP
+ * counter. But in case the platform is not able to supply it, the GPU
+ * CYCLE_COUNTER source will be used as an alternative. Bit-31 on the
+ * returned value is the source configuration flag, and it is set to '1'
+ * when CYCLE_COUNTER alternative source is used.
+ *
+ * Return: the actual internally configured hysteresis field value.
+ */
+u32 kbase_csf_firmware_set_gpu_idle_hysteresis_time(struct kbase_device *kbdev, u32 dur);
+
+/**
+ * kbase_csf_firmware_get_mcu_core_pwroff_time - Get the MCU core power-off
+ *                                               time value
+ *
+ * @kbdev:   Instance of a GPU platform device that implements a CSF interface.
+ *
+ * Return: the internally recorded MCU core power-off (nominal) value. The unit
+ *         of the value is in micro-seconds.
+ */
+u32 kbase_csf_firmware_get_mcu_core_pwroff_time(struct kbase_device *kbdev);
+
+/**
+ * kbase_csf_firmware_set_mcu_core_pwroff_time - Set the MCU core power-off
+ *                                               time value
+ *
+ * @kbdev:   Instance of a GPU platform device that implements a CSF interface.
+ * @dur:     The duration value (unit: micro-seconds) for configuring MCU
+ *           core power-off timer, when the shader cores' power
+ *           transitions are delegated to the MCU (normal operational
+ *           mode)
+ *
+ * The supplied value will be recorded internally without any change. But the
+ * actual field value will be subject to core power-off timer source frequency
+ * scaling and maximum value limiting. The default source will be
+ * SYSTEM_TIMESTAMP counter. But in case the platform is not able to supply it,
+ * the GPU CYCLE_COUNTER source will be used as an alternative. Bit-31 on the
+ * returned value is the source configuration flag, and it is set to '1'
+ * when CYCLE_COUNTER alternative source is used.
+ *
+ * The configured MCU core power-off timer will only have effect when the host
+ * driver has delegated the shader cores' power management to MCU.
+ *
+ * Return: the actual internal core power-off timer value in register defined
+ *         format.
+ */
+u32 kbase_csf_firmware_set_mcu_core_pwroff_time(struct kbase_device *kbdev, u32 dur);
+
+/**
+ * kbase_csf_interface_version - Helper function to build the full firmware
+ *                               interface version in a format compatible with
+ *                               with GLB_VERSION register
+ *
+ * @major:     major version of csf interface
+ * @minor:     minor version of csf interface
+ * @patch:     patch version of csf interface
+ *
+ * Return: firmware interface version
+ */
+static inline u32 kbase_csf_interface_version(u32 major, u32 minor, u32 patch)
+{
+	return ((major << GLB_VERSION_MAJOR_SHIFT) |
+		(minor << GLB_VERSION_MINOR_SHIFT) |
+		(patch << GLB_VERSION_PATCH_SHIFT));
+}
+
+/**
+ * kbase_csf_trigger_firmware_config_update - Send a firmware config update.
+ *
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
+ *
+ * Any changes done to firmware configuration entry or tracebuffer entry
+ * requires a GPU silent reset to reflect the configuration changes
+ * requested, but if Firmware.header.entry.bit(30) is set then we can request a
+ * FIRMWARE_CONFIG_UPDATE rather than doing a silent reset.
+ *
+ * Return: 0 if success, or negative error code on failure.
+ */
+int kbase_csf_trigger_firmware_config_update(struct kbase_device *kbdev);
 #endif
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware_cfg.c
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware_cfg.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <mali_kbase.h>
@@ -41,6 +40,8 @@
 *               inside CSF_FIRMWARE_CFG_SYSFS_DIR_NAME directory,
 *               representing the configuration option @name.
 * @kobj_inited: kobject initialization state
+ * @updatable:   Indicates whether config items can be updated with
+ *               FIRMWARE_CONFIG_UPDATE
 * @name:        NUL-terminated string naming the option
 * @address:     The address in the firmware image of the configuration option
 * @min:         The lowest legal value of the configuration option
@@ -52,6 +53,7 @@ struct firmware_config {
 	struct kbase_device *kbdev;
 	struct kobject kobj;
 	bool kobj_inited;
+	bool updatable;
 	char *name;
 	u32 address;
 	u32 min;
@@ -142,14 +144,20 @@ static ssize_t store_fw_cfg(struct kobject *kobj,
 			return count;
 		}

-		/*
-		 * If there is already a GPU reset pending then inform
-		 * the User to retry the write.
+		/* If configuration update cannot be performed with
+		 * FIRMWARE_CONFIG_UPDATE then we need to do a
+		 * silent reset before we update the memory.
 		 */
-		if (kbase_reset_gpu_silent(kbdev)) {
-			spin_unlock_irqrestore(
-				&kbdev->hwaccess_lock, flags);
-			return -EAGAIN;
+		if (!config->updatable) {
+			/*
+			 * If there is already a GPU reset pending then inform
+			 * the User to retry the write.
+			 */
+			if (kbase_reset_gpu_silent(kbdev)) {
+				spin_unlock_irqrestore(&kbdev->hwaccess_lock,
+						       flags);
+				return -EAGAIN;
+			}
 		}

 		/*
@@ -165,10 +173,21 @@ static ssize_t store_fw_cfg(struct kobject *kobj,
 			kbdev, config->address, val);

 		config->cur_val = val;
+
 		spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);

+		/* If we can update the config without firmware reset then
+		 * we need to just trigger FIRMWARE_CONFIG_UPDATE.
+		 */
+		if (config->updatable) {
+			ret = kbase_csf_trigger_firmware_config_update(kbdev);
+			if (ret)
+				return ret;
+		}
+
 		/* Wait for the config update to take effect */
-		kbase_reset_gpu_wait(kbdev);
+		if (!config->updatable)
+			kbase_reset_gpu_wait(kbdev);
 	} else {
 		dev_warn(kbdev->dev,
 			"Unexpected write to entry %s/%s",
@@ -254,8 +273,9 @@ void kbase_csf_firmware_cfg_term(struct kbase_device *kbdev)
 }

 int kbase_csf_firmware_cfg_option_entry_parse(struct kbase_device *kbdev,
-		const struct firmware *fw,
-		const u32 *entry, unsigned int size)
+					      const struct firmware *fw,
+					      const u32 *entry,
+					      unsigned int size, bool updatable)
 {
 	const char *name = (char *)&entry[3];
 	struct firmware_config *config;
@@ -270,6 +290,7 @@ int kbase_csf_firmware_cfg_option_entry_parse(struct kbase_device *kbdev,
 		return -ENOMEM;

 	config->kbdev = kbdev;
+	config->updatable = updatable;
 	config->name = (char *)(config+1);
 	config->address = entry[0];
 	config->min = entry[1];
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware_cfg.h
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware_cfg.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2020-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #ifndef _KBASE_CSF_FIRMWARE_CFG_H_
@@ -61,12 +60,15 @@ void kbase_csf_firmware_cfg_term(struct kbase_device *kbdev);
 *
 * Return: 0 if successful, negative error code on failure
 *
- * @kbdev: Kbase device structure
- * @fw:    Firmware image containing the section
- * @entry: Pointer to the section
- * @size:  Size (in bytes) of the section
+ * @kbdev:     Kbase device structure
+ * @fw:        Firmware image containing the section
+ * @entry:     Pointer to the section
+ * @size:      Size (in bytes) of the section
+ * @updatable: Indicates if entry can be updated with FIRMWARE_CONFIG_UPDATE
 */
 int kbase_csf_firmware_cfg_option_entry_parse(struct kbase_device *kbdev,
-		const struct firmware *fw,
-		const u32 *entry, unsigned int size);
+					      const struct firmware *fw,
+					      const u32 *entry,
+					      unsigned int size,
+					      bool updatable);
 #endif /* _KBASE_CSF_FIRMWARE_CFG_H_ */
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware_no_mali.c
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_firmware_no_mali.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2018-2020 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2018-2021 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include "mali_kbase.h"
@@ -26,19 +25,23 @@
 #include "mali_kbase_csf_timeout.h"
 #include "mali_kbase_mem.h"
 #include "mali_kbase_reset_gpu.h"
+#include "mali_kbase_ctx_sched.h"
 #include "device/mali_kbase_device.h"
 #include "backend/gpu/mali_kbase_pm_internal.h"
 #include "mali_kbase_csf_scheduler.h"
 #include "mmu/mali_kbase_mmu.h"
+#include "backend/gpu/mali_kbase_clk_rate_trace_mgr.h"

 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/firmware.h>
 #include <linux/mman.h>
 #include <linux/string.h>
+#include <linux/mutex.h>
 #if (KERNEL_VERSION(4, 13, 0) <= LINUX_VERSION_CODE)
 #include <linux/set_memory.h>
 #endif
+#include <asm/arch_timer.h>

 #ifdef CONFIG_MALI_BIFROST_DEBUG
 /* Makes Driver wait indefinitely for an acknowledgment for the different
@@ -56,7 +59,7 @@ MODULE_PARM_DESC(fw_debug,
 #define DUMMY_FW_PAGE_SIZE SZ_4K

 /**
- * struct dummy_firmware_csi - Represents a dummy interface for MCU firmware streams
+ * struct dummy_firmware_csi - Represents a dummy interface for MCU firmware CSs
 *
 * @cs_kernel_input:  CS kernel input memory region
 * @cs_kernel_output: CS kernel output memory region
@@ -67,7 +70,7 @@ struct dummy_firmware_csi {
 };

 /**
- * struct dummy_firmware_csg - Represents a dummy interface for MCU firmware stream groups
+ * struct dummy_firmware_csg - Represents a dummy interface for MCU firmware CSGs
 *
 * @csg_input:  CSG kernel input memory region
 * @csg_output: CSG kernel output memory region
@@ -95,8 +98,9 @@ struct dummy_firmware_interface {
 	struct list_head node;
 } dummy_firmware_interface;

-#define CSF_GLB_REQ_CFG_MASK \
-	(GLB_REQ_CFG_ALLOC_EN_MASK | GLB_REQ_CFG_PROGRESS_TIMER_MASK)
+#define CSF_GLB_REQ_CFG_MASK                                                   \
+	(GLB_REQ_CFG_ALLOC_EN_MASK | GLB_REQ_CFG_PROGRESS_TIMER_MASK |         \
+	 GLB_REQ_CFG_PWROFF_TIMER_MASK)

 static inline u32 input_page_read(const u32 *const input, const u32 offset)
 {
@@ -233,6 +237,9 @@ static int invent_capabilities(struct kbase_device *kbdev)
 	iface->kbdev = kbdev;
 	iface->features = 0;
 	iface->prfcnt_size = 64;
+	iface->instr_features =
+		0x81; /* update rate=1, max event size = 1<<8 = 256 */
+
 	iface->group_num = ARRAY_SIZE(interface->csg);
 	iface->group_stride = 0;

@@ -416,6 +423,69 @@ u32 kbase_csf_firmware_global_output(
 	return val;
 }

+/**
+ * handle_internal_firmware_fatal - Handler for CS internal firmware fault.
+ *
+ * @kbdev:  Pointer to kbase device
+ *
+ * Report group fatal error to user space for all GPU command queue groups
+ * in the device, terminate them and reset GPU.
+ */
+static void handle_internal_firmware_fatal(struct kbase_device *const kbdev)
+{
+	int as;
+
+	for (as = 0; as < kbdev->nr_hw_address_spaces; as++) {
+		unsigned long flags;
+		struct kbase_context *kctx;
+		struct kbase_fault fault;
+
+		if (as == MCU_AS_NR)
+			continue;
+
+		/* Only handle the fault for an active address space. Lock is
+		 * taken here to atomically get reference to context in an
+		 * active address space and retain its refcount.
+		 */
+		spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+		kctx = kbase_ctx_sched_as_to_ctx_nolock(kbdev, as);
+
+		if (kctx) {
+			kbase_ctx_sched_retain_ctx_refcount(kctx);
+			spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+		} else {
+			spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+			continue;
+		}
+
+		fault = (struct kbase_fault) {
+			.status = GPU_EXCEPTION_TYPE_SW_FAULT_1,
+		};
+
+		kbase_csf_ctx_handle_fault(kctx, &fault);
+		kbase_ctx_sched_release_ctx_lock(kctx);
+	}
+
+	if (kbase_prepare_to_reset_gpu(kbdev,
+				       RESET_FLAGS_HWC_UNRECOVERABLE_ERROR))
+		kbase_reset_gpu(kbdev);
+}
+
+/**
+ * firmware_error_worker - Worker function for handling firmware internal error
+ *
+ * @data: Pointer to a work_struct embedded in kbase device.
+ *
+ * Handle the CS internal firmware error
+ */
+static void firmware_error_worker(struct work_struct *const data)
+{
+	struct kbase_device *const kbdev =
+		container_of(data, struct kbase_device, csf.fw_error_work);
+
+	handle_internal_firmware_fatal(kbdev);
+}
+
 static bool global_request_complete(struct kbase_device *const kbdev,
 				    u32 const req_mask)
 {
@@ -441,7 +511,7 @@ static int wait_for_global_request(struct kbase_device *const kbdev,
 				   u32 const req_mask)
 {
 	const long wait_timeout =
-		kbase_csf_timeout_in_jiffies(GLB_REQ_WAIT_TIMEOUT_MS);
+		kbase_csf_timeout_in_jiffies(kbdev->csf.fw_timeout_ms);
 	long remaining;
 	int err = 0;

@@ -464,7 +534,7 @@ static void set_global_request(
 {
 	u32 glb_req;

-	lockdep_assert_held(&global_iface->kbdev->csf.reg_lock);
+	kbase_csf_scheduler_spin_lock_assert_held(global_iface->kbdev);

 	glb_req = kbase_csf_firmware_global_output(global_iface, GLB_ACK);
 	glb_req ^= req_mask;
@@ -484,6 +554,26 @@ static void enable_endpoints_global(
 	set_global_request(global_iface, GLB_REQ_CFG_ALLOC_EN_MASK);
 }

+static void enable_shader_poweroff_timer(struct kbase_device *const kbdev,
+		const struct kbase_csf_global_iface *const global_iface)
+{
+	u32 pwroff_reg;
+
+	if (kbdev->csf.firmware_hctl_core_pwr)
+		pwroff_reg =
+		    GLB_PWROFF_TIMER_TIMER_SOURCE_SET(DISABLE_GLB_PWROFF_TIMER,
+			GLB_PWROFF_TIMER_TIMER_SOURCE_SYSTEM_TIMESTAMP);
+	else
+		pwroff_reg = kbdev->csf.mcu_core_pwroff_dur_count;
+
+	kbase_csf_firmware_global_input(global_iface, GLB_PWROFF_TIMER,
+					pwroff_reg);
+	set_global_request(global_iface, GLB_REQ_CFG_PWROFF_TIMER_MASK);
+
+	/* Save the programed reg value in its shadow field */
+	kbdev->csf.mcu_core_pwroff_reg_shadow = pwroff_reg;
+}
+
 static void set_timeout_global(
 	const struct kbase_csf_global_iface *const global_iface,
 	u64 const timeout)
@@ -494,13 +584,16 @@ static void set_timeout_global(
 	set_global_request(global_iface, GLB_REQ_CFG_PROGRESS_TIMER_MASK);
 }

-static void global_init(struct kbase_device *const kbdev, u32 req_mask)
+static void global_init(struct kbase_device *const kbdev, u64 core_mask)
 {
-	u32 const ack_irq_mask = GLB_ACK_IRQ_MASK_CFG_ALLOC_EN_MASK  |
-			GLB_ACK_IRQ_MASK_PING_MASK |
-			GLB_ACK_IRQ_MASK_CFG_PROGRESS_TIMER_MASK |
-			GLB_ACK_IRQ_MASK_PROTM_ENTER_MASK |
-			GLB_ACK_IRQ_MASK_PROTM_EXIT_MASK;
+	u32 const ack_irq_mask = GLB_ACK_IRQ_MASK_CFG_ALLOC_EN_MASK |
+				 GLB_ACK_IRQ_MASK_PING_MASK |
+				 GLB_ACK_IRQ_MASK_CFG_PROGRESS_TIMER_MASK |
+				 GLB_ACK_IRQ_MASK_PROTM_ENTER_MASK |
+				 GLB_ACK_IRQ_MASK_FIRMWARE_CONFIG_UPDATE_MASK |
+				 GLB_ACK_IRQ_MASK_PROTM_EXIT_MASK |
+				 GLB_ACK_IRQ_MASK_CFG_PWROFF_TIMER_MASK |
+				 GLB_ACK_IRQ_MASK_IDLE_EVENT_MASK;

 	const struct kbase_csf_global_iface *const global_iface =
 		&kbdev->csf.global_iface;
@@ -508,9 +601,9 @@ static void global_init(struct kbase_device *const kbdev, u32 req_mask)

 	kbase_csf_scheduler_spin_lock(kbdev, &flags);

-	/* Enable endpoints on all present shader cores */
-	enable_endpoints_global(global_iface,
-		kbase_pm_get_present_cores(kbdev, KBASE_PM_CORE_SHADER));
+	/* Update shader core allocation enable mask */
+	enable_endpoints_global(global_iface, core_mask);
+	enable_shader_poweroff_timer(kbdev, global_iface);

 	set_timeout_global(global_iface, kbase_csf_timeout_get(kbdev));

@@ -526,8 +619,7 @@ static void global_init(struct kbase_device *const kbdev, u32 req_mask)
 /**
 * global_init_on_boot - Sends a global request to control various features.
 *
- * @kbdev: Instance of a GPU platform device that implements a command
- *         stream front-end interface.
+ * @kbdev: Instance of a GPU platform device that implements a CSF interface.
 *
 * Currently only the request to enable endpoints and cycle counter is sent.
 *
@@ -535,19 +627,29 @@ static void global_init(struct kbase_device *const kbdev, u32 req_mask)
 */
 static int global_init_on_boot(struct kbase_device *const kbdev)
 {
-	u32 const req_mask = CSF_GLB_REQ_CFG_MASK;
+	unsigned long flags;
+	u64 core_mask;

-	global_init(kbdev, req_mask);
+	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+	core_mask = kbase_pm_ca_get_core_mask(kbdev);
+	kbdev->csf.firmware_hctl_core_pwr =
+				kbase_pm_no_mcu_core_pwroff(kbdev);
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);

-	return wait_for_global_request(kbdev, req_mask);
+	global_init(kbdev, core_mask);
+
+	return wait_for_global_request(kbdev, CSF_GLB_REQ_CFG_MASK);
 }

-void kbase_csf_firmware_global_reinit(struct kbase_device *kbdev)
+void kbase_csf_firmware_global_reinit(struct kbase_device *kbdev,
+				      u64 core_mask)
 {
 	lockdep_assert_held(&kbdev->hwaccess_lock);

 	kbdev->csf.glb_init_request_pending = true;
-	global_init(kbdev, CSF_GLB_REQ_CFG_MASK);
+	kbdev->csf.firmware_hctl_core_pwr =
+				kbase_pm_no_mcu_core_pwroff(kbdev);
+	global_init(kbdev, core_mask);
 }

 bool kbase_csf_firmware_global_reinit_complete(struct kbase_device *kbdev)
@@ -561,6 +663,31 @@ bool kbase_csf_firmware_global_reinit_complete(struct kbase_device *kbdev)
 	return !kbdev->csf.glb_init_request_pending;
 }

+void kbase_csf_firmware_update_core_attr(struct kbase_device *kbdev,
+		bool update_core_pwroff_timer, bool update_core_mask, u64 core_mask)
+{
+	unsigned long flags;
+
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	kbase_csf_scheduler_spin_lock(kbdev, &flags);
+	if (update_core_mask)
+		enable_endpoints_global(&kbdev->csf.global_iface, core_mask);
+	if (update_core_pwroff_timer)
+		enable_shader_poweroff_timer(kbdev, &kbdev->csf.global_iface);
+
+	kbase_csf_ring_doorbell(kbdev, CSF_KERNEL_DOORBELL_NR);
+	kbase_csf_scheduler_spin_unlock(kbdev, flags);
+}
+
+bool kbase_csf_firmware_core_attr_updated(struct kbase_device *kbdev)
+{
+	lockdep_assert_held(&kbdev->hwaccess_lock);
+
+	return global_request_complete(kbdev, GLB_REQ_CFG_ALLOC_EN_MASK |
+					      GLB_REQ_CFG_PWROFF_TIMER_MASK);
+}
+
 static void kbase_csf_firmware_reload_worker(struct work_struct *work)
 {
 	struct kbase_device *kbdev = container_of(work, struct kbase_device,
@@ -604,6 +731,129 @@ void kbase_csf_firmware_reload_completed(struct kbase_device *kbdev)
 	kbase_pm_update_state(kbdev);
 }

+static u32 convert_dur_to_idle_count(struct kbase_device *kbdev, const u32 dur_ms)
+{
+#define HYSTERESIS_VAL_UNIT_SHIFT (10)
+	/* Get the cntfreq_el0 value, which drives the SYSTEM_TIMESTAMP */
+	u64 freq = arch_timer_get_cntfrq();
+	u64 dur_val = dur_ms;
+	u32 cnt_val_u32, reg_val_u32;
+	bool src_system_timestamp = freq > 0;
+
+	if (!src_system_timestamp) {
+		/* Get the cycle_counter source alternative */
+		spin_lock(&kbdev->pm.clk_rtm.lock);
+		if (kbdev->pm.clk_rtm.clks[0])
+			freq = kbdev->pm.clk_rtm.clks[0]->clock_val;
+		else
+			dev_warn(kbdev->dev, "No GPU clock, unexpected intregration issue!");
+		spin_unlock(&kbdev->pm.clk_rtm.lock);
+
+		dev_info(kbdev->dev, "Can't get the timestamp frequency, "
+			 "use cycle counter format with firmware idle hysteresis!");
+	}
+
+	/* Formula for dur_val = ((dur_ms/1000) * freq_HZ) >> 10) */
+	dur_val = (dur_val * freq) >> HYSTERESIS_VAL_UNIT_SHIFT;
+	dur_val = div_u64(dur_val, 1000);
+
+	/* Interface limits the value field to S32_MAX */
+	cnt_val_u32 = (dur_val > S32_MAX) ? S32_MAX : (u32)dur_val;
+
+	reg_val_u32 = GLB_IDLE_TIMER_TIMEOUT_SET(0, cnt_val_u32);
+	/* add the source flag */
+	if (src_system_timestamp)
+		reg_val_u32 = GLB_IDLE_TIMER_TIMER_SOURCE_SET(reg_val_u32,
+				GLB_IDLE_TIMER_TIMER_SOURCE_SYSTEM_TIMESTAMP);
+	else
+		reg_val_u32 = GLB_IDLE_TIMER_TIMER_SOURCE_SET(reg_val_u32,
+				GLB_IDLE_TIMER_TIMER_SOURCE_GPU_COUNTER);
+
+	return reg_val_u32;
+}
+
+u32 kbase_csf_firmware_get_gpu_idle_hysteresis_time(struct kbase_device *kbdev)
+{
+	return kbdev->csf.gpu_idle_hysteresis_ms;
+}
+
+u32 kbase_csf_firmware_set_gpu_idle_hysteresis_time(struct kbase_device *kbdev, u32 dur)
+{
+	unsigned long flags;
+	const u32 hysteresis_val = convert_dur_to_idle_count(kbdev, dur);
+
+	kbase_csf_scheduler_spin_lock(kbdev, &flags);
+	kbdev->csf.gpu_idle_hysteresis_ms = dur;
+	kbdev->csf.gpu_idle_dur_count = hysteresis_val;
+	kbase_csf_scheduler_spin_unlock(kbdev, flags);
+
+	dev_dbg(kbdev->dev, "CSF set firmware idle hysteresis count-value: 0x%.8x",
+		hysteresis_val);
+
+	return hysteresis_val;
+}
+
+static u32 convert_dur_to_core_pwroff_count(struct kbase_device *kbdev, const u32 dur_us)
+{
+#define PWROFF_VAL_UNIT_SHIFT (10)
+	/* Get the cntfreq_el0 value, which drives the SYSTEM_TIMESTAMP */
+	u64 freq = arch_timer_get_cntfrq();
+	u64 dur_val = dur_us;
+	u32 cnt_val_u32, reg_val_u32;
+	bool src_system_timestamp = freq > 0;
+
+	if (!src_system_timestamp) {
+		/* Get the cycle_counter source alternative */
+		spin_lock(&kbdev->pm.clk_rtm.lock);
+		if (kbdev->pm.clk_rtm.clks[0])
+			freq = kbdev->pm.clk_rtm.clks[0]->clock_val;
+		else
+			dev_warn(kbdev->dev, "No GPU clock, unexpected integration issue!");
+		spin_unlock(&kbdev->pm.clk_rtm.lock);
+
+		dev_info(kbdev->dev, "Can't get the timestamp frequency, "
+			 "use cycle counter with MCU Core Poweroff timer!");
+	}
+
+	/* Formula for dur_val = ((dur_us/1e6) * freq_HZ) >> 10) */
+	dur_val = (dur_val * freq) >> HYSTERESIS_VAL_UNIT_SHIFT;
+	dur_val = div_u64(dur_val, 1000000);
+
+	/* Interface limits the value field to S32_MAX */
+	cnt_val_u32 = (dur_val > S32_MAX) ? S32_MAX : (u32)dur_val;
+
+	reg_val_u32 = GLB_PWROFF_TIMER_TIMEOUT_SET(0, cnt_val_u32);
+	/* add the source flag */
+	if (src_system_timestamp)
+		reg_val_u32 = GLB_PWROFF_TIMER_TIMER_SOURCE_SET(reg_val_u32,
+				GLB_PWROFF_TIMER_TIMER_SOURCE_SYSTEM_TIMESTAMP);
+	else
+		reg_val_u32 = GLB_PWROFF_TIMER_TIMER_SOURCE_SET(reg_val_u32,
+				GLB_PWROFF_TIMER_TIMER_SOURCE_GPU_COUNTER);
+
+	return reg_val_u32;
+}
+
+u32 kbase_csf_firmware_get_mcu_core_pwroff_time(struct kbase_device *kbdev)
+{
+	return kbdev->csf.mcu_core_pwroff_dur_us;
+}
+
+u32 kbase_csf_firmware_set_mcu_core_pwroff_time(struct kbase_device *kbdev, u32 dur)
+{
+	unsigned long flags;
+	const u32 pwroff = convert_dur_to_core_pwroff_count(kbdev, dur);
+
+	spin_lock_irqsave(&kbdev->hwaccess_lock, flags);
+	kbdev->csf.mcu_core_pwroff_dur_us = dur;
+	kbdev->csf.mcu_core_pwroff_dur_count = pwroff;
+	spin_unlock_irqrestore(&kbdev->hwaccess_lock, flags);
+
+	dev_dbg(kbdev->dev, "MCU Core Poweroff input update: 0x%.8x", pwroff);
+
+	return pwroff;
+}
+
 int kbase_csf_firmware_init(struct kbase_device *kbdev)
 {
 	int ret;
@@ -623,15 +873,21 @@ int kbase_csf_firmware_init(struct kbase_device *kbdev)

 	init_waitqueue_head(&kbdev->csf.event_wait);
 	kbdev->csf.interrupt_received = false;
+	kbdev->csf.fw_timeout_ms = CSF_FIRMWARE_TIMEOUT_MS;

 	INIT_LIST_HEAD(&kbdev->csf.firmware_interfaces);
 	INIT_LIST_HEAD(&kbdev->csf.firmware_config);
 	INIT_LIST_HEAD(&kbdev->csf.firmware_trace_buffers.list);
 	INIT_WORK(&kbdev->csf.firmware_reload_work,
 		  kbase_csf_firmware_reload_worker);
+	INIT_WORK(&kbdev->csf.fw_error_work, firmware_error_worker);

 	mutex_init(&kbdev->csf.reg_lock);

+	kbdev->csf.gpu_idle_hysteresis_ms = FIRMWARE_IDLE_HYSTERESIS_TIME_MS;
+	kbdev->csf.gpu_idle_dur_count = convert_dur_to_idle_count(kbdev,
+						FIRMWARE_IDLE_HYSTERESIS_TIME_MS);
+
 	ret = kbase_mcu_shared_interface_region_tracker_init(kbdev);
 	if (ret != 0) {
 		dev_err(kbdev->dev, "Failed to setup the rb tree for managing shared interface segment\n");
@@ -659,6 +915,10 @@ int kbase_csf_firmware_init(struct kbase_device *kbdev)
 	if (ret != 0)
 		goto error;

+	ret = kbase_csf_setup_dummy_user_reg_page(kbdev);
+	if (ret != 0)
+		goto error;
+
 	ret = kbase_csf_scheduler_init(kbdev);
 	if (ret != 0)
 		goto error;
@@ -680,6 +940,8 @@ error:

 void kbase_csf_firmware_term(struct kbase_device *kbdev)
 {
+	cancel_work_sync(&kbdev->csf.fw_error_work);
+
 	kbase_csf_timeout_term(kbdev);

 	/* NO_MALI: Don't stop firmware or unload MMU tables */
@@ -688,6 +950,8 @@ void kbase_csf_firmware_term(struct kbase_device *kbdev)

 	kbase_csf_scheduler_term(kbdev);

+	kbase_csf_free_dummy_user_reg_page(kbdev);
+
 	kbase_csf_doorbell_mapping_term(kbdev);

 	free_global_iface(kbdev);
@@ -721,7 +985,51 @@ void kbase_csf_firmware_term(struct kbase_device *kbdev)
 	kbase_mcu_shared_interface_region_tracker_term(kbdev);
 }

-int kbase_csf_firmware_ping(struct kbase_device *const kbdev)
+void kbase_csf_firmware_enable_gpu_idle_timer(struct kbase_device *kbdev)
+{
+	struct kbase_csf_global_iface *global_iface = &kbdev->csf.global_iface;
+	u32 glb_req;
+
+	kbase_csf_scheduler_spin_lock_assert_held(kbdev);
+
+	/* The scheduler is assumed to only call the enable when its internal
+	 * state indicates that the idle timer has previously been disabled. So
+	 * on entry the expected field values are:
+	 *   1. GLOBAL_INPUT_BLOCK.GLB_REQ.IDLE_ENABLE: 0
+	 *   2. GLOBAL_OUTPUT_BLOCK.GLB_ACK.IDLE_ENABLE: 0, or, on 1 -> 0
+	 */
+
+	glb_req = kbase_csf_firmware_global_input_read(global_iface, GLB_REQ);
+	if (glb_req & GLB_REQ_IDLE_ENABLE_MASK)
+		dev_err(kbdev->dev, "Incoherent scheduler state on REQ_IDLE_ENABLE!");
+
+	kbase_csf_firmware_global_input(global_iface, GLB_IDLE_TIMER,
+					kbdev->csf.gpu_idle_dur_count);
+
+	kbase_csf_firmware_global_input_mask(global_iface, GLB_REQ,
+				GLB_REQ_REQ_IDLE_ENABLE, GLB_REQ_IDLE_ENABLE_MASK);
+
+	dev_dbg(kbdev->dev, "Enabling GPU idle timer with count-value: 0x%.8x",
+		kbdev->csf.gpu_idle_dur_count);
+	kbase_csf_ring_doorbell(kbdev, CSF_KERNEL_DOORBELL_NR);
+}
+
+void kbase_csf_firmware_disable_gpu_idle_timer(struct kbase_device *kbdev)
+{
+	struct kbase_csf_global_iface *global_iface = &kbdev->csf.global_iface;
+
+	kbase_csf_scheduler_spin_lock_assert_held(kbdev);
+
+	kbase_csf_firmware_global_input_mask(global_iface, GLB_REQ,
+					GLB_REQ_REQ_IDLE_DISABLE,
+					GLB_REQ_IDLE_DISABLE_MASK);
+
+	dev_dbg(kbdev->dev, "Sending request to disable gpu idle timer");
+
+	kbase_csf_ring_doorbell(kbdev, CSF_KERNEL_DOORBELL_NR);
+}
+
+void kbase_csf_firmware_ping(struct kbase_device *const kbdev)
 {
 	const struct kbase_csf_global_iface *const global_iface =
 		&kbdev->csf.global_iface;
@@ -731,7 +1039,11 @@ int kbase_csf_firmware_ping(struct kbase_device *const kbdev)
 	set_global_request(global_iface, GLB_REQ_PING_MASK);
 	kbase_csf_ring_doorbell(kbdev, CSF_KERNEL_DOORBELL_NR);
 	kbase_csf_scheduler_spin_unlock(kbdev, flags);
+}

+int kbase_csf_firmware_ping_wait(struct kbase_device *const kbdev)
+{
+	kbase_csf_firmware_ping(kbdev);
 	return wait_for_global_request(kbdev, GLB_REQ_PING_MASK);
 }

@@ -763,13 +1075,9 @@ void kbase_csf_enter_protected_mode(struct kbase_device *kbdev)
 {
 	struct kbase_csf_global_iface *global_iface = &kbdev->csf.global_iface;
 	unsigned long flags;
-	unsigned int value;

 	kbase_csf_scheduler_spin_lock(kbdev, &flags);
-	value = kbase_csf_firmware_global_output(global_iface, GLB_ACK);
-	value ^= GLB_REQ_PROTM_ENTER_MASK;
-	kbase_csf_firmware_global_input_mask(global_iface, GLB_REQ, value,
-					     GLB_REQ_PROTM_ENTER_MASK);
+	set_global_request(global_iface, GLB_REQ_PROTM_ENTER_MASK);
 	dev_dbg(kbdev->dev, "Sending request to enter protected mode");
 	kbase_csf_ring_doorbell(kbdev, CSF_KERNEL_DOORBELL_NR);
 	kbase_csf_scheduler_spin_unlock(kbdev, flags);
@@ -781,22 +1089,41 @@ void kbase_csf_firmware_trigger_mcu_halt(struct kbase_device *kbdev)
 {
 	struct kbase_csf_global_iface *global_iface = &kbdev->csf.global_iface;
 	unsigned long flags;
-	unsigned int value;

 	kbase_csf_scheduler_spin_lock(kbdev, &flags);
-	value = kbase_csf_firmware_global_output(global_iface, GLB_ACK);
-	value ^= GLB_REQ_HALT_MASK;
-	kbase_csf_firmware_global_input_mask(global_iface, GLB_REQ, value,
-					     GLB_REQ_HALT_MASK);
+	set_global_request(global_iface, GLB_REQ_HALT_MASK);
 	dev_dbg(kbdev->dev, "Sending request to HALT MCU");
 	kbase_csf_ring_doorbell(kbdev, CSF_KERNEL_DOORBELL_NR);
 	kbase_csf_scheduler_spin_unlock(kbdev, flags);
 }

+int kbase_csf_trigger_firmware_config_update(struct kbase_device *kbdev)
+{
+	struct kbase_csf_global_iface *global_iface = &kbdev->csf.global_iface;
+	unsigned long flags;
+	int err = 0;
+
+	/* The 'reg_lock' is also taken and is held till the update is
+	 * complete, to ensure the config update gets serialized.
+	 */
+	mutex_lock(&kbdev->csf.reg_lock);
+	kbase_csf_scheduler_spin_lock(kbdev, &flags);
+
+	set_global_request(global_iface, GLB_REQ_FIRMWARE_CONFIG_UPDATE_MASK);
+	dev_dbg(kbdev->dev, "Sending request for FIRMWARE_CONFIG_UPDATE");
+	kbase_csf_ring_doorbell(kbdev, CSF_KERNEL_DOORBELL_NR);
+	kbase_csf_scheduler_spin_unlock(kbdev, flags);
+
+	err = wait_for_global_request(kbdev,
+				      GLB_REQ_FIRMWARE_CONFIG_UPDATE_MASK);
+	mutex_unlock(&kbdev->csf.reg_lock);
+	return err;
+}
+
 /**
- * copy_grp_and_stm - Copy command stream and/or group data
+ * copy_grp_and_stm - Copy CS and/or group data
 *
- * @iface:                Global command stream front-end interface provided by
+ * @iface:                Global CSF interface provided by
 *                        the firmware.
 * @group_data:           Pointer where to store all the group data
 *                        (sequentially).
@@ -807,7 +1134,7 @@ void kbase_csf_firmware_trigger_mcu_halt(struct kbase_device *kbdev)
 * @max_total_stream_num: The maximum number of streams to be read.
 *                        Can be 0, in which case stream_data is unused.
 *
- * Return: Total number of command streams, summed across all groups.
+ * Return: Total number of CSs, summed across all groups.
 */
 static u32 copy_grp_and_stm(
 	const struct kbase_csf_global_iface * const iface,
@@ -830,6 +1157,8 @@ static u32 copy_grp_and_stm(
 		if (i < max_group_num) {
 			group_data[i].features = iface->groups[i].features;
 			group_data[i].stream_num = iface->groups[i].stream_num;
+			group_data[i].suspend_size =
+				iface->groups[i].suspend_size;
 		}
 		for (j = 0; j < iface->groups[i].stream_num; j++) {
 			if (total_stream_num < max_total_stream_num)
@@ -842,26 +1171,28 @@ static u32 copy_grp_and_stm(
 	return total_stream_num;
 }

-u32 kbase_csf_firmware_get_glb_iface(struct kbase_device *kbdev,
+u32 kbase_csf_firmware_get_glb_iface(
+	struct kbase_device *kbdev,
 	struct basep_cs_group_control *const group_data,
 	u32 const max_group_num,
 	struct basep_cs_stream_control *const stream_data,
 	u32 const max_total_stream_num, u32 *const glb_version,
-	u32 *const features, u32 *const group_num, u32 *const prfcnt_size)
+	u32 *const features, u32 *const group_num, u32 *const prfcnt_size,
+	u32 *const instr_features)
 {
 	const struct kbase_csf_global_iface * const iface =
 		&kbdev->csf.global_iface;

-	if (WARN_ON(!glb_version) ||
-		WARN_ON(!features) ||
-		WARN_ON(!group_num) ||
-		WARN_ON(!prfcnt_size))
+	if (WARN_ON(!glb_version) || WARN_ON(!features) ||
+	    WARN_ON(!group_num) || WARN_ON(!prfcnt_size) ||
+	    WARN_ON(!instr_features))
 		return 0;

 	*glb_version = iface->version;
 	*features = iface->features;
 	*group_num = iface->group_num;
 	*prfcnt_size = iface->prfcnt_size;
+	*instr_features = iface->instr_features;

 	return copy_grp_and_stm(iface, group_data, max_group_num,
 		stream_data, max_total_stream_num);
@@ -941,9 +1272,9 @@ int kbase_csf_firmware_mcu_shared_mapping_init(
 	mutex_lock(&kbdev->csf.reg_lock);
 	ret = kbase_add_va_region_rbtree(kbdev, va_reg, 0, num_pages, 1);
 	va_reg->flags &= ~KBASE_REG_FREE;
-	mutex_unlock(&kbdev->csf.reg_lock);
 	if (ret)
 		goto va_region_add_error;
+	mutex_unlock(&kbdev->csf.reg_lock);

 	gpu_map_properties &= (KBASE_REG_GPU_RD | KBASE_REG_GPU_WR);
 	gpu_map_properties |= gpu_map_prot;
@@ -965,9 +1296,9 @@ int kbase_csf_firmware_mcu_shared_mapping_init(
 mmu_insert_pages_error:
 	mutex_lock(&kbdev->csf.reg_lock);
 	kbase_remove_va_region(va_reg);
-	mutex_unlock(&kbdev->csf.reg_lock);
 va_region_add_error:
 	kbase_free_alloced_region(va_reg);
+	mutex_unlock(&kbdev->csf.reg_lock);
 va_region_alloc_error:
 	vunmap(cpu_addr);
 vmap_error:
@@ -981,7 +1312,8 @@ page_list_alloc_error:
 	kfree(phys);
 out:
 	/* Zero-initialize the mapping to make sure that the termination
-	 * function doesn't try to unmap or free random addresses. */
+	 * function doesn't try to unmap or free random addresses.
+	 */
 	csf_mapping->phys = NULL;
 	csf_mapping->cpu_addr = NULL;
 	csf_mapping->va_reg = NULL;
@@ -996,8 +1328,8 @@ void kbase_csf_firmware_mcu_shared_mapping_term(
 	if (csf_mapping->va_reg) {
 		mutex_lock(&kbdev->csf.reg_lock);
 		kbase_remove_va_region(csf_mapping->va_reg);
-		mutex_unlock(&kbdev->csf.reg_lock);
 		kbase_free_alloced_region(csf_mapping->va_reg);
+		mutex_unlock(&kbdev->csf.reg_lock);
 	}

 	if (csf_mapping->phys) {
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_heap_context_alloc.c
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_heap_context_alloc.c
@@ -1,11 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
 *
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <mali_kbase.h>
--- a/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_heap_context_alloc.h
+++ b/drivers/gpu/arm/bifrost/csf/mali_kbase_csf_heap_context_alloc.h
@@ -1,11 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
 *
- * (C) COPYRIGHT 2019 ARM Limited. All rights reserved.
+ * (C) COPYRIGHT 2019-2020 ARM Limited. All rights reserved.
 *
 * This program is free software and is provided to you under the terms of the
 * GNU General Public License version 2 as published by the Free Software
 * Foundation, and any use by you of this program is subject to the terms
- * of such GNU licence.
+ * of such GNU license.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,8 +17,6 @@
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
- * SPDX-License-Identifier: GPL-2.0
- *
 */

 #include <mali_kbase.h>
--- a/Show More
+++ b/Show More