Commit Graph

1059212 Commits

Author SHA1 Message Date
Mark Brown
166253e5b2 UPSTREAM: arm64: Always use individual bits in CPACR floating point enables
CPACR_EL1 has several bitfields for controlling traps for floating point
features to EL1, each of which has a separate bits for EL0 and EL1. Marc
Zyngier noted that we are not consistent in our use of defines to
manipulate these, sometimes using a define covering the whole field and
sometimes using defines for the individual bits. Make this consistent by
expanding the whole field defines where they are used (currently only in
the KVM code) and deleting them so that no further uses can be
introduced.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220207152109.197566-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 3bb72d86d8)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I2e42e5c18bfdcbeac99eec5121b1ef5117c05ec8
2022-08-10 08:57:55 +01:00
Mark Brown
38a89f0844 UPSTREAM: arm64: Define CPACR_EL1_FPEN similarly to other floating point controls
The base floating point, SVE and SME all have enable controls for EL0 and
EL1 in CPACR_EL1 which have a similar layout and function. Currently the
basic floating point enable FPEN is defined differently to the SVE control,
specified as a single define in kvm_arm.h rather than in sysreg.h. Move the
define to sysreg.h and provide separate EL0 and EL1 control bits so code
managing the different floating point enables can look consistent.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220207152109.197566-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 879358fc67)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I622d3bee721ed51bd776324493757e2823465c62
2022-08-10 08:57:55 +01:00
Mark Brown
8c267b0a1f UPSTREAM: arm64/sve: Minor clarification of ABI documentation
As suggested by Luis for the SME version of this explicitly say that the
vector length should be extracted from the return value of a set vector
length prctl() with a bitwise and rather than just any old and.

Suggested-by: Luis Machado <Luis.Machado@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211210184133.320748-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit aed34d9e52)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I5b938f8ddf49f996677bea0d5ecbc4f4e5f93e9c
2022-08-10 08:57:54 +01:00
Mark Brown
20e98ae0e8 UPSTREAM: arm64/sve: Generalise vector length configuration prctl() for SME
In preparation for adding SME support update the bulk of the implementation
for the vector length configuration prctl() calls to be independent of
vector type.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211210184133.320748-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 30c43e73b3)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Id8ce9334a36e061d6df18d749a2e7875bd124f4e
2022-08-10 08:57:53 +01:00
Mark Brown
948ced36f9 UPSTREAM: arm64/sve: Make sysctl interface for SVE reusable by SME
The vector length configuration for SME is very similar to that for SVE
so in order to allow reuse refactor the SVE configuration so that it takes
the vector type from the struct ctl_table. Since there's no dedicated space
for this we repurpose the extra1 field to store the vector type, this is
otherwise unused for integer sysctls.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211210184133.320748-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 97bcbee404)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I5f29b5c91e812fb723b3fd84fc681ccb7bbd87ef
2022-08-10 08:57:52 +01:00
Mark Brown
6d05d0e9d2 UPSTREAM: arm64/sve: Track vector lengths for tasks in an array
As for SVE we will track a per task SME vector length for tasks. Convert
the existing storage for the vector length into an array and update
fpsimd_flush_task() to initialise this in a function.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-10-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 5838a15579)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I6cd068a6cb7508b84e3f202750569a20c5f2bd77
2022-08-10 08:57:51 +01:00
Mark Brown
cdd7a7096a UPSTREAM: arm64/sve: Explicitly load vector length when restoring SVE state
Currently when restoring the SVE state we supply the SVE vector length
as an argument to sve_load_state() and the underlying macros. This becomes
inconvenient with the addition of SME since we may need to restore any
combination of SVE and SME vector lengths, and we already separately
restore the vector length in the KVM code. We don't need to know the vector
length during the actual register load since the SME load instructions can
index into the data array for us.

Refactor the interface so we explicitly set the vector length separately
to restoring the SVE registers in preparation for adding SME support, no
functional change should be involved.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-9-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit ddc806b5c4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I6b546ddda30845a1296084e8a5867e2a2c30f1b6
2022-08-10 08:57:51 +01:00
Mark Brown
554140513f UPSTREAM: arm64/sve: Put system wide vector length information into structs
With the introduction of SME we will have a second vector length in the
system, enumerated and configured in a very similar fashion to the
existing SVE vector length.  While there are a few differences in how
things are handled this is a relatively small portion of the overall
code so in order to avoid code duplication we factor out

We create two structs, one vl_info for the static hardware properties
and one vl_config for the runtime configuration, with an array
instantiated for each and update all the users to reference these. Some
accessor functions are provided where helpful for readability, and the
write to set the vector length is put into a function since the system
register being updated needs to be chosen at compile time.

This is a mostly mechanical replacement, further work will be required
to actually make things generic, ensuring that we handle those places
where there are differences properly.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-8-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit b5bc00ffdd)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ief5d1a8373f5eedcc75c3435b103383fac1f77f2
2022-08-10 08:57:50 +01:00
Mark Brown
1b9538ce70 UPSTREAM: arm64/sve: Use accessor functions for vector lengths in thread_struct
In a system with SME there are parallel vector length controls for SVE and
SME vectors which function in much the same way so it is desirable to
share the code for handling them as much as possible. In order to prepare
for doing this add a layer of accessor functions for the various VL related
operations on tasks.

Since almost all current interactions are actually via task->thread rather
than directly with the thread_info the accessors use that. Accessors are
provided for both generic and SVE specific usage, the generic accessors
should be used for cases where register state is being manipulated since
the registers are shared between streaming and regular SVE so we know that
when SME support is implemented we will always have to be in the appropriate
mode already and hence can generalise now.

Since we are using task_struct and we don't want to cause widespread
inclusion of sched.h the acessors are all out of line, it is hoped that
none of the uses are in a sufficiently critical path for this to be an
issue. Those that are most likely to present an issue are in the same
translation unit so hopefully the compiler may be able to inline anyway.

This is purely adding the layer of abstraction, additional work will be
needed to support tasks using SME.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 0423eedcf4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I08f2a81d1082f1ebd128fe1cda82883e38f3cd39
2022-08-10 08:57:49 +01:00
Mark Brown
38bf07d1fb UPSTREAM: arm64/sve: Rename find_supported_vector_length()
The function has SVE specific checks in it and it will be more trouble
to add conditional code for SME than it is to simply rename it to be SVE
specific.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 059613f546)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I792399b5230454842df34ab56d5883e3dfe95554
2022-08-10 08:57:48 +01:00
Mark Brown
baecdda35c BACKPORT: arm64/sve: Make access to FFR optional
SME introduces streaming SVE mode in which FFR is not present and the
instructions for accessing it UNDEF. In preparation for handling this
update the low level SVE state access functions to take a flag specifying
if FFR should be handled. When saving the register state we store a zero
for FFR to guard against uninitialized data being read. No behaviour change
should be introduced by this patch.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 9f58486657)
[willdeacon@: Dropped hunk from removed __sve_save_state() function]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I105e81a3283bdaeb06291dc9f249998613a39cc3
2022-08-10 08:57:47 +01:00
Mark Brown
e2cb66c112 UPSTREAM: arm64/sve: Make sve_state_size() static
There are no users outside fpsimd.c so make sve_state_size() static.
KVM open codes an equivalent.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 12cc2352bf)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I9b18fedf61a21c66b88b69fe9a0bab8ffea6596b
2022-08-10 08:57:47 +01:00
Mark Brown
d1c5ae0523 UPSTREAM: arm64/sve: Remove sve_load_from_fpsimd_state()
Following optimisations of the SVE register handling we no longer load the
SVE state from a saved copy of the FPSIMD registers, we convert directly
in registers or from one saved state to another. Remove the function so we
don't need to update it during further refactoring.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit b53223e0a4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ifa825d452d32d0721e42d58fb615a10c0ef705c0
2022-08-10 08:57:46 +01:00
Mark Brown
5262dd763a UPSTREAM: arm64/fp: Reindent fpsimd_save()
Currently all the active code in fpsimd_save() is inside a check for
TIF_FOREIGN_FPSTATE. Reduce the indentation level by changing to return
from the function if TIF_FOREIGN_FPSTATE is set.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 2d481bd3b6)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I8174634596e5f1e09afa0d53cd86c26fbb9edfd9
2022-08-10 08:57:45 +01:00
Marc Zyngier
84319a03ee UPSTREAM: arm64: Add a capability for FEAT_ECV
Add a new capability to detect the Enhanced Counter Virtualization
feature (FEAT_ECV).

Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-15-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit fdf865988b)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iddc3774372b5ed8ede4b9056cdecd698348673f2
2022-08-10 08:57:44 +01:00
Vladimir Murzin
974b9501d4 BACKPORT: arm64: Add support of PAuth QARMA3 architected algorithm
QARMA3 is relaxed version of the QARMA5 algorithm which expected to
reduce the latency of calculation while still delivering a suitable
level of security.

Support for QARMA3 can be discovered via ID_AA64ISAR2_EL1

    APA3, bits [15:12] Indicates whether the QARMA3 algorithm is
                       implemented in the PE for address
                       authentication in AArch64 state.

    GPA3, bits [11:8]  Indicates whether the QARMA3 algorithm is
                       implemented in the PE for generic code
                       authentication in AArch64 state.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220224124952.119612-4-vladimir.murzin@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit def8c222f0)
[willdeacon@: Fix context sysreg/cpufeature context conflicts due to CLEARBHB
 mitigation backport]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I7829af89cc97efe7ef2c8763c8fee8fffe54b7c9
2022-08-10 08:57:43 +01:00
Vladimir Murzin
0af2daf761 UPSTREAM: arm64: cpufeature: Mark existing PAuth architected algorithm as QARMA5
In preparation of supporting PAuth QARMA3 architected algorithm mark
existing one as QARMA5, so we can distingwish between two.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220224124952.119612-3-vladimir.murzin@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit be3256a086)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I89e7e70c4d7712538da6a6cdf0c4605366b78616
2022-08-10 08:57:43 +01:00
Vladimir Murzin
e7540b13f8 UPSTREAM: arm64: cpufeature: Account min_field_value when cheking secondaries for PAuth
In case, both boot_val and sec_val have value below min_field_value we
would wrongly report that address authentication is supported. It is
not a big issue because we enable address authentication based on boot
cpu (and check there is correct).

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220224124952.119612-2-vladimir.murzin@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit da844beb6d)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ib03730d7573dc9bea4732eaf930912441439c13c
2022-08-10 08:57:42 +01:00
Marc Zyngier
c151f88c42 UPSTREAM: KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC
When adding support for the slightly wonky Apple M1, we had to
populate ID_AA64PFR0_EL1.GIC==1 to present something to the guest,
as the HW itself doesn't advertise the feature.

However, we gated this on the in-kernel irqchip being created.
This causes some trouble for QEMU, which snapshots the state of
the registers before creating a virtual GIC, and then tries to
restore these registers once the GIC has been created.  Obviously,
between the two stages, ID_AA64PFR0_EL1.GIC has changed value,
and the write fails.

The fix is to actually emulate the HW, and always populate the
field if the HW is capable of it.

Fixes: 562e530fd7 ("KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Oliver Upton <oupton@google.com>
Link: https://lore.kernel.org/r/20220503211424.3375263-1-maz@kernel.org
(cherry picked from commit 5163373af1)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I6888e25ed0ac5b96727ae1f8f728bf523aac5650
2022-08-10 07:29:13 +00:00
Marc Zyngier
288f56fd19 UPSTREAM: KVM: arm64: Inject exception on out-of-IPA-range translation fault
When taking a translation fault for an IPA that is outside of
the range defined by the hypervisor (between the HW PARange and
the IPA range), we stupidly treat it as an IO and forward the access
to userspace. Of course, userspace can't do much with it, and things
end badly.

Arguably, the guest is braindead, but we should at least catch the
case and inject an exception.

Check the faulting IPA against:
- the sanitised PARange: inject an address size fault
- the IPA size: inject an abort

Reported-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 85ea6b1ec9)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iaad029b26f5ae91ed6ff06db53f23107beffc573
2022-08-10 07:29:13 +00:00
Alexandru Elisei
2d5941c041 UPSTREAM: KVM/arm64: Don't emulate a PMU for 32-bit guests if feature not set
kvm->arch.arm_pmu is set when userspace attempts to set the first PMU
attribute. As certain attributes are mandatory, arm_pmu ends up always
being set to a valid arm_pmu, otherwise KVM will refuse to run the VCPU.
However, this only happens if the VCPU has the PMU feature. If the VCPU
doesn't have the feature bit set, kvm->arch.arm_pmu will be left
uninitialized and equal to NULL.

KVM doesn't do ID register emulation for 32-bit guests and accesses to the
PMU registers aren't gated by the pmu_visibility() function. This is done
to prevent injecting unexpected undefined exceptions in guests which have
detected the presence of a hardware PMU. But even though the VCPU feature
is missing, KVM still attempts to emulate certain aspects of the PMU when
PMU registers are accessed. This leads to a NULL pointer dereference like
this one, which happens on an odroid-c4 board when running the
kvm-unit-tests pmu-cycle-counter test with kvmtool and without the PMU
feature being set:

[  454.402699] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000150
[  454.405865] Mem abort info:
[  454.408596]   ESR = 0x96000004
[  454.411638]   EC = 0x25: DABT (current EL), IL = 32 bits
[  454.416901]   SET = 0, FnV = 0
[  454.419909]   EA = 0, S1PTW = 0
[  454.423010]   FSC = 0x04: level 0 translation fault
[  454.427841] Data abort info:
[  454.430687]   ISV = 0, ISS = 0x00000004
[  454.434484]   CM = 0, WnR = 0
[  454.437404] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000c924000
[  454.443800] [0000000000000150] pgd=0000000000000000, p4d=0000000000000000
[  454.450528] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[  454.456036] Modules linked in:
[  454.459053] CPU: 1 PID: 267 Comm: kvm-vcpu-0 Not tainted 5.18.0-rc4 #113
[  454.465697] Hardware name: Hardkernel ODROID-C4 (DT)
[  454.470612] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  454.477512] pc : kvm_pmu_event_mask.isra.0+0x14/0x74
[  454.482427] lr : kvm_pmu_set_counter_event_type+0x2c/0x80
[  454.487775] sp : ffff80000a9839c0
[  454.491050] x29: ffff80000a9839c0 x28: ffff000000a83a00 x27: 0000000000000000
[  454.498127] x26: 0000000000000000 x25: 0000000000000000 x24: ffff00000a510000
[  454.505198] x23: ffff000000a83a00 x22: ffff000003b01000 x21: 0000000000000000
[  454.512271] x20: 000000000000001f x19: 00000000000003ff x18: 0000000000000000
[  454.519343] x17: 000000008003fe98 x16: 0000000000000000 x15: 0000000000000000
[  454.526416] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[  454.533489] x11: 000000008003fdbc x10: 0000000000009d20 x9 : 000000000000001b
[  454.540561] x8 : 0000000000000000 x7 : 0000000000000d00 x6 : 0000000000009d00
[  454.547633] x5 : 0000000000000037 x4 : 0000000000009d00 x3 : 0d09000000000000
[  454.554705] x2 : 000000000000001f x1 : 0000000000000000 x0 : 0000000000000000
[  454.561779] Call trace:
[  454.564191]  kvm_pmu_event_mask.isra.0+0x14/0x74
[  454.568764]  kvm_pmu_set_counter_event_type+0x2c/0x80
[  454.573766]  access_pmu_evtyper+0x128/0x170
[  454.577905]  perform_access+0x34/0x80
[  454.581527]  kvm_handle_cp_32+0x13c/0x160
[  454.585495]  kvm_handle_cp15_32+0x1c/0x30
[  454.589462]  handle_exit+0x70/0x180
[  454.592912]  kvm_arch_vcpu_ioctl_run+0x1c4/0x5e0
[  454.597485]  kvm_vcpu_ioctl+0x23c/0x940
[  454.601280]  __arm64_sys_ioctl+0xa8/0xf0
[  454.605160]  invoke_syscall+0x48/0x114
[  454.608869]  el0_svc_common.constprop.0+0xd4/0xfc
[  454.613527]  do_el0_svc+0x28/0x90
[  454.616803]  el0_svc+0x34/0xb0
[  454.619822]  el0t_64_sync_handler+0xa4/0x130
[  454.624049]  el0t_64_sync+0x18c/0x190
[  454.627675] Code: a9be7bfd 910003fd f9000bf3 52807ff3 (b9415001)
[  454.633714] ---[ end trace 0000000000000000 ]---

In this particular case, Linux hasn't detected the presence of a hardware
PMU because the PMU node is missing from the DTB, so userspace would have
been unable to set the VCPU PMU feature even if it attempted it. What
happens is that the 32-bit guest reads ID_DFR0, which advertises the
presence of the PMU, and when it tries to program a counter, it triggers
the NULL pointer dereference because kvm->arch.arm_pmu is NULL.

kvm-arch.arm_pmu was introduced by commit 46b1878214 ("KVM: arm64:
Keep a per-VM pointer to the default PMU"). Until that commit, this
error would be triggered instead:

[   73.388140] ------------[ cut here ]------------
[   73.388189] Unknown PMU version 0
[   73.390420] WARNING: CPU: 1 PID: 264 at arch/arm64/kvm/pmu-emul.c:36 kvm_pmu_event_mask.isra.0+0x6c/0x74
[   73.399821] Modules linked in:
[   73.402835] CPU: 1 PID: 264 Comm: kvm-vcpu-0 Not tainted 5.17.0 #114
[   73.409132] Hardware name: Hardkernel ODROID-C4 (DT)
[   73.414048] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   73.420948] pc : kvm_pmu_event_mask.isra.0+0x6c/0x74
[   73.425863] lr : kvm_pmu_event_mask.isra.0+0x6c/0x74
[   73.430779] sp : ffff80000a8db9b0
[   73.434055] x29: ffff80000a8db9b0 x28: ffff000000dbaac0 x27: 0000000000000000
[   73.441131] x26: ffff000000dbaac0 x25: 00000000c600000d x24: 0000000000180720
[   73.448203] x23: ffff800009ffbe10 x22: ffff00000b612000 x21: 0000000000000000
[   73.455276] x20: 000000000000001f x19: 0000000000000000 x18: ffffffffffffffff
[   73.462348] x17: 000000008003fe98 x16: 0000000000000000 x15: 0720072007200720
[   73.469420] x14: 0720072007200720 x13: ffff800009d32488 x12: 00000000000004e6
[   73.476493] x11: 00000000000001a2 x10: ffff800009d32488 x9 : ffff800009d32488
[   73.483565] x8 : 00000000ffffefff x7 : ffff800009d8a488 x6 : ffff800009d8a488
[   73.490638] x5 : ffff0000f461a9d8 x4 : 0000000000000000 x3 : 0000000000000001
[   73.497710] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000000dbaac0
[   73.504784] Call trace:
[   73.507195]  kvm_pmu_event_mask.isra.0+0x6c/0x74
[   73.511768]  kvm_pmu_set_counter_event_type+0x2c/0x80
[   73.516770]  access_pmu_evtyper+0x128/0x16c
[   73.520910]  perform_access+0x34/0x80
[   73.524532]  kvm_handle_cp_32+0x13c/0x160
[   73.528500]  kvm_handle_cp15_32+0x1c/0x30
[   73.532467]  handle_exit+0x70/0x180
[   73.535917]  kvm_arch_vcpu_ioctl_run+0x20c/0x6e0
[   73.540489]  kvm_vcpu_ioctl+0x2b8/0x9e0
[   73.544283]  __arm64_sys_ioctl+0xa8/0xf0
[   73.548165]  invoke_syscall+0x48/0x114
[   73.551874]  el0_svc_common.constprop.0+0xd4/0xfc
[   73.556531]  do_el0_svc+0x28/0x90
[   73.559808]  el0_svc+0x28/0x80
[   73.562826]  el0t_64_sync_handler+0xa4/0x130
[   73.567054]  el0t_64_sync+0x1a0/0x1a4
[   73.570676] ---[ end trace 0000000000000000 ]---
[   73.575382] kvm: pmu event creation failed -2

The root cause remains the same: kvm->arch.pmuver was never set to
something sensible because the VCPU feature itself was never set.

The odroid-c4 is somewhat of a special case, because Linux doesn't probe
the PMU. But the above errors can easily be reproduced on any hardware,
with or without a PMU driver, as long as userspace doesn't set the PMU
feature.

Work around the fact that KVM advertises a PMU even when the VCPU feature
is not set by gating all PMU emulation on the feature. The guest can still
access the registers without KVM injecting an undefined exception.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425145530.723858-1-alexandru.elisei@arm.com
(cherry picked from commit 8f6379e207)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I6840d8312a60673344e08ab99527d0a2106384d1
2022-08-10 07:29:13 +00:00
Will Deacon
3f9f9dda5f UPSTREAM: KVM: arm64: Handle host stage-2 faults from 32-bit EL0
When pKVM is enabled, host memory accesses are translated by an identity
mapping at stage-2, which is populated lazily in response to synchronous
exceptions from 64-bit EL1 and EL0.

Extend this handling to cover exceptions originating from 32-bit EL0 as
well. Although these are very unlikely to occur in practice, as the
kernel typically ensures that user pages are initialised before mapping
them in, drivers could still map previously untouched device pages into
userspace and expect things to work rather than panic the system.

Cc: Quentin Perret <qperret@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220427171332.13635-1-will@kernel.org
(cherry picked from commit 2a50fc5fd0)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ibc6ffba7e8b01f940dbe39aefee10be1a7135b5e
2022-08-10 07:29:13 +00:00
Oliver Upton
462ab06b7e UPSTREAM: KVM: Don't create VM debugfs files outside of the VM directory
Unfortunately, there is no guarantee that KVM was able to instantiate a
debugfs directory for a particular VM. To that end, KVM shouldn't even
attempt to create new debugfs files in this case. If the specified
parent dentry is NULL, debugfs_create_file() will instantiate files at
the root of debugfs.

For arm64, it is possible to create the vgic-state file outside of a
VM directory, the file is not cleaned up when a VM is destroyed.
Nonetheless, the corresponding struct kvm is freed when the VM is
destroyed.

Nip the problem in the bud for all possible errant debugfs file
creations by initializing kvm->debugfs_dentry to -ENOENT. In so doing,
debugfs_create_file() will fail instead of creating the file in the root
directory.

Cc: stable@kernel.org
Fixes: 929f45e324 ("kvm: no need to check return value of debugfs_create functions")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220406235615.1447180-2-oupton@google.com
(cherry picked from commit a44a4cc1c9)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I1926ef85a8efb3ff6f4490454f5ce336c1d4e443
2022-08-10 07:29:13 +00:00
Reiji Watanabe
b6e7475a71 UPSTREAM: KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
KVM allows userspace to configure either all EL1 32bit or 64bit vCPUs
for a guest.  At vCPU reset, vcpu_allowed_register_width() checks
if the vcpu's register width is consistent with all other vCPUs'.
Since the checking is done even against vCPUs that are not initialized
(KVM_ARM_VCPU_INIT has not been done) yet, the uninitialized vCPUs
are erroneously treated as 64bit vCPU, which causes the function to
incorrectly detect a mixed-width VM.

Introduce KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED
bits for kvm->arch.flags.  A value of the EL1_32BIT bit indicates that
the guest needs to be configured with all 32bit or 64bit vCPUs, and
a value of the REG_WIDTH_CONFIGURED bit indicates if a value of the
EL1_32BIT bit is valid (already set up). Values in those bits are set at
the first KVM_ARM_VCPU_INIT for the guest based on KVM_ARM_VCPU_EL1_32BIT
configuration for the vCPU.

Check vcpu's register width against those new bits at the vcpu's
KVM_ARM_VCPU_INIT (instead of against other vCPUs' register width).

Fixes: 66e94d5caf ("KVM: arm64: Prevent mixed-width VM creation")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220329031924.619453-2-reijiw@google.com
(cherry picked from commit 26bf74bd9f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ife1dd629e091b9bf8ee62d1c3a998f363e3fd05d
2022-08-10 07:29:13 +00:00
Yu Zhe
db08ea5d66 UPSTREAM: KVM: arm64: vgic: Remove unnecessary type castings
Remove unnecessary casts.

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220329102059.268983-1-yuzhe@nfschina.com
(cherry picked from commit c707663e81)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I1e7c278cf73c5dbcbad889e37289dde9cda6d1d3
2022-08-10 07:29:13 +00:00
Oliver Upton
62ddcd9fe2 UPSTREAM: KVM: arm64: Don't split hugepages outside of MMU write lock
It is possible to take a stage-2 permission fault on a page larger than
PAGE_SIZE. For example, when running a guest backed by 2M HugeTLB, KVM
eagerly maps at the largest possible block size. When dirty logging is
enabled on a memslot, KVM does *not* eagerly split these 2M stage-2
mappings and instead clears the write bit on the pte.

Since dirty logging is always performed at PAGE_SIZE granularity, KVM
lazily splits these 2M block mappings down to PAGE_SIZE in the stage-2
fault handler. This operation must be done under the write lock. Since
commit f783ef1c0e ("KVM: arm64: Add fast path to handle permission
relaxation during dirty logging"), the stage-2 fault handler
conditionally takes the read lock on permission faults with dirty
logging enabled. To that end, it is possible to split a 2M block mapping
while only holding the read lock.

The problem is demonstrated by running kvm_page_table_test with 2M
anonymous HugeTLB, which splats like so:

  WARNING: CPU: 5 PID: 15276 at arch/arm64/kvm/hyp/pgtable.c:153 stage2_map_walk_leaf+0x124/0x158

  [...]

  Call trace:
  stage2_map_walk_leaf+0x124/0x158
  stage2_map_walker+0x5c/0xf0
  __kvm_pgtable_walk+0x100/0x1d4
  __kvm_pgtable_walk+0x140/0x1d4
  __kvm_pgtable_walk+0x140/0x1d4
  kvm_pgtable_walk+0xa0/0xf8
  kvm_pgtable_stage2_map+0x15c/0x198
  user_mem_abort+0x56c/0x838
  kvm_handle_guest_abort+0x1fc/0x2a4
  handle_exit+0xa4/0x120
  kvm_arch_vcpu_ioctl_run+0x200/0x448
  kvm_vcpu_ioctl+0x588/0x664
  __arm64_sys_ioctl+0x9c/0xd4
  invoke_syscall+0x4c/0x144
  el0_svc_common+0xc4/0x190
  do_el0_svc+0x30/0x8c
  el0_svc+0x28/0xcc
  el0t_64_sync_handler+0x84/0xe4
  el0t_64_sync+0x1a4/0x1a8

Fix the issue by only acquiring the read lock if the guest faulted on a
PAGE_SIZE granule w/ dirty logging enabled. Add a WARN to catch locking
bugs in future changes.

Fixes: f783ef1c0e ("KVM: arm64: Add fast path to handle permission relaxation during dirty logging")
Cc: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220401194652.950240-1-oupton@google.com
(cherry picked from commit f587661f21)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I51bdae0fe469e920d8f3396cb44c3936cdc41ee3
2022-08-10 07:29:13 +00:00
Oliver Upton
de161ce831 UPSTREAM: KVM: arm64: Drop unneeded minor version check from PSCI v1.x handler
We already sanitize the guest's PSCI version when it is being written by
userspace, rejecting unsupported version numbers. Additionally, the
'minor' parameter to kvm_psci_1_x_call() is a constant known at compile
time for all callsites.

Though it is benign, the additional check against the
PSCI kvm_psci_1_x_call() is unnecessary and likely to be missed the next
time KVM raises its maximum PSCI version. Drop the check altogether and
rely on sanitization when the PSCI version is set by userspace.

No functional change intended.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220322183538.2757758-4-oupton@google.com
(cherry picked from commit 73b725c7a6)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I2fc6b4a9cbbbba0971af62b989f2b21071472b46
2022-08-10 07:29:13 +00:00
Oliver Upton
f35a34bebb UPSTREAM: KVM: arm64: Actually prevent SMC64 SYSTEM_RESET2 from AArch32
The SMCCC does not allow the SMC64 calling convention to be used from
AArch32. While KVM checks to see if the calling convention is allowed in
PSCI_1_0_FN_PSCI_FEATURES, it does not actually prevent calls to
unadvertised PSCI v1.0+ functions.

Hoist the check to see if the requested function is allowed into
kvm_psci_call(), thereby preventing SMC64 calls from AArch32 for all
PSCI versions.

Fixes: d43583b890 ("KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest")
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220322183538.2757758-3-oupton@google.com
(cherry picked from commit 827c2ab331)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I695cf2f43a95fcd5bc5ed5f800f566cf8b553213
2022-08-10 07:29:13 +00:00
Oliver Upton
27fb2678fc UPSTREAM: KVM: arm64: Generally disallow SMC64 for AArch32 guests
The only valid calling SMC calling convention from an AArch32 state is
SMC32. Disallow any PSCI function that sets the SMC64 function ID bit
when called from AArch32 rather than comparing against known SMC64 PSCI
functions.

Note that without this change KVM advertises the SMC64 flavor of
SYSTEM_RESET2 to AArch32 guests.

Fixes: d43583b890 ("KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest")
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220322183538.2757758-2-oupton@google.com
(cherry picked from commit 2da0aebc74)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I537be243b769e3437431cfbd1a64143da55526c2
2022-08-10 07:29:13 +00:00
Julia Lawall
f056206781 UPSTREAM: KVM: arm64: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220318103729.157574-24-Julia.Lawall@inria.fr
(cherry picked from commit 21ea457842)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I63a2e4bcee78df961b3f8f97b07d6d031eff28a2
2022-08-10 07:29:13 +00:00
Marc Zyngier
3eda60ad43 UPSTREAM: KVM: arm64: Generalise VM features into a set of flags
We currently deal with a set of booleans for VM features,
while they could be better represented as set of flags
contained in an unsigned long, similarily to what we are
doing on the CPU side.

Signed-off-by: Marc Zyngier <maz@kernel.org>
[Oliver: Flag-ify the 'ran_once' boolean]
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220311174001.605719-2-oupton@google.com
(cherry picked from commit 06394531b4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I288c6f4b4d6f6fade0ed597547d4b127f3f3e355
2022-08-10 07:29:13 +00:00
Will Deacon
4873c7f1b7 UPSTREAM: KVM: arm64: Really propagate PSCI SYSTEM_RESET2 arguments to userspace
Commit d43583b890 ("KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the
guest") hooked up the SYSTEM_RESET2 PSCI call for guests but failed to
preserve its arguments for userspace, instead overwriting them with
zeroes via smccc_set_retval(). As Linux only passes zeroes for these
arguments, this appeared to be working for Linux guests. Oh well.

Don't call smccc_set_retval() for a SYSTEM_RESET2 heading to userspace
and instead set X0 (and only X0) explicitly to PSCI_RET_INTERNAL_FAILURE
just in case the vCPU re-enters the guest.

Fixes: d43583b890 ("KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest")
Reported-by: Andrew Walbran <qwandor@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220309181308.982-1-will@kernel.org
(cherry picked from commit 9d3e7b7c82)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Id3a1e33bdb5531c487ada602e26efeb4abc61edf
2022-08-10 07:29:13 +00:00
Oliver Upton
ba2641081f BACKPORT: Documentation: KVM: Update documentation to indicate KVM is arm64-only
KVM support for 32-bit ARM hosts (KVM/arm) has been removed from the
kernel since commit 541ad0150c ("arm: Remove 32bit KVM host
support"). There still exists some remnants of the old architecture in
the KVM documentation.

Remove all traces of 32-bit host support from the documentation. Note
that AArch32 guests are still supported.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220308172856.2997250-1-oupton@google.com
(cherry picked from commit 3fbf4207dc)
[willdeacon@: Dropped all references to riscv from api.rst]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I626e3d6473b266566ca22a612bd04312f26b1144
2022-08-10 07:29:13 +00:00
Marc Zyngier
e21027c244 UPSTREAM: KVM: arm64: Only open the interrupt window on exit due to an interrupt
Now that we properly account for interrupts taken whilst the guest
was running, it becomes obvious that there is no need to open
this accounting window if we didn't exit because of an interrupt.

This saves a number of system register accesses and other barriers
if we exited for any other reason (such as a trap, for example).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220304135914.1464721-1-maz@kernel.org
(cherry picked from commit f7659f8bcd)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ie1eef3d5644c955a6e575fb42010054cc16578ac
2022-08-10 07:29:13 +00:00
Changcheng Deng
c2dfd5f852 UPSTREAM: KVM: arm64: Remove unneeded semicolons
Fix the following coccicheck review:
./arch/arm64/kvm/psci.c: 379: 3-4: Unneeded semicolon

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
[maz: squashed another instance of the same issue in the patch]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220223092750.1934130-1-deng.changcheng@zte.com.cn
Link: https://lore.kernel.org/r/20220225122922.GA19390@willie-the-truck
(cherry picked from commit ae82047e97)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I4bf43b4c06fa43a04cacd93f406ff2d927d97c45
2022-08-10 07:29:13 +00:00
Will Deacon
7188e3f98a BACKPORT: KVM: arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags field
When handling reset and power-off PSCI calls from the guest, we
initialise X0 to PSCI_RET_INTERNAL_FAILURE in case the VMM tries to
re-run the vCPU after issuing the call.

Unfortunately, this also means that the VMM cannot see which PSCI call
was issued and therefore cannot distinguish between PSCI SYSTEM_RESET
and SYSTEM_RESET2 calls, which is necessary in order to determine the
validity of the "reset_type" in X1.

Allocate bit 0 of the previously unused 'flags' field of the
system_event structure so that we can indicate the PSCI call used to
initiate the reset.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220221153524.15397-4-will@kernel.org
(cherry picked from commit 34739fd95f)
[willdeacon@: Resolved conflict in arm64 uapi/asm/kvm.h the same way as
 upstream merge commit 1a48ce9264 ("Merge branch kvm-arm64/psci-1.1 into
 kvmarm-master/next")]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I1840b2c4d148f2a1d17b5d181e49cab52485c712
2022-08-10 07:29:13 +00:00
Will Deacon
dd68dbc82b UPSTREAM: KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest
PSCI v1.1 introduces the optional SYSTEM_RESET2 call, which allows the
caller to provide a vendor-specific "reset type" and "cookie" to request
a particular form of reset or shutdown.

Expose this call to the guest and handle it in the same way as PSCI
SYSTEM_RESET, along with some basic range checking on the type argument.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220221153524.15397-3-will@kernel.org
(cherry picked from commit d43583b890)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ief05340203492a807a6dec27022947aea8635434
2022-08-10 07:29:13 +00:00
Will Deacon
eba129ae15 BACKPORT: KVM: arm64: Bump guest PSCI version to 1.1
Expose PSCI version v1.1 to the guest by default. The only difference
for now is that an updated version number is reported by PSCI_VERSION.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220221153524.15397-2-will@kernel.org
(cherry picked from commit 512865d83f)
[willdeacon@: Resolved conflict in arm64/kvm/psci.c the same way as upstream
 merge commit 1a48ce9264 ("Merge branch kvm-arm64/psci-1.1 into
 kvmarm-master/next")]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Icb7bfc49936ab5214370cb2c53087d6cca1b6154
2022-08-10 07:29:13 +00:00
Alexandru Elisei
b8674d4af5 UPSTREAM: KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
device ioctl. If the VCPU is scheduled on a physical CPU which has a
different PMU, the perf events needed to emulate a guest PMU won't be
scheduled in and the guest performance counters will stop counting. Treat
it as an userspace error and refuse to run the VCPU in this situation.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-7-alexandru.elisei@arm.com
(cherry picked from commit 583cda1b0e)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3941823ed094a09abf97039be93c1eabf62bf077
2022-08-10 07:29:13 +00:00
Alexandru Elisei
0a3836875e UPSTREAM: KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
When KVM creates an event and there are more than one PMUs present on the
system, perf_init_event() will go through the list of available PMUs and
will choose the first one that can create the event. The order of the PMUs
in this list depends on the probe order, which can change under various
circumstances, for example if the order of the PMU nodes change in the DTB
or if asynchronous driver probing is enabled on the kernel command line
(with the driver_async_probe=armv8-pmu option).

Another consequence of this approach is that on heteregeneous systems all
virtual machines that KVM creates will use the same PMU. This might cause
unexpected behaviour for userspace: when a VCPU is executing on the
physical CPU that uses this default PMU, PMU events in the guest work
correctly; but when the same VCPU executes on another CPU, PMU events in
the guest will suddenly stop counting.

Fortunately, perf core allows user to specify on which PMU to create an
event by using the perf_event_attr->type field, which is used by
perf_init_event() as an index in the radix tree of available PMUs.

Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
attribute to allow userspace to specify the arm_pmu that KVM will use when
creating events for that VCPU. KVM will make no attempt to run the VCPU on
the physical CPUs that share the PMU, leaving it up to userspace to manage
the VCPU threads' affinity accordingly.

To ensure that KVM doesn't expose an asymmetric system to the guest, the
PMU set for one VCPU will be used by all other VCPUs. Once a VCPU has run,
the PMU cannot be changed in order to avoid changing the list of available
events for a VCPU, or to change the semantics of existing events.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-6-alexandru.elisei@arm.com
(cherry picked from commit 6ee7fca2a4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I92de5ab30bfce6d0ae7fdced3cedd93eef0da19b
2022-08-10 07:29:13 +00:00
Alexandru Elisei
064ec137e9 UPSTREAM: KVM: arm64: Keep a list of probed PMUs
The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
a hardware PMU is available for guest emulation. Heterogeneous systems can
have more than one PMU present, and the callback gets called multiple
times, once for each of them. Keep track of all the PMUs available to KVM,
as they're going to be needed later.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-5-alexandru.elisei@arm.com
(cherry picked from commit db858060b1)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I5ec338a629e2d3bc1e6a34d99c21b63b0d4ac029
2022-08-10 07:29:13 +00:00
Marc Zyngier
661c13aa61 UPSTREAM: KVM: arm64: Keep a per-VM pointer to the default PMU
As we are about to allow selection of the PMU exposed to a guest, start by
keeping track of the default one instead of only the PMU version.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20220127161759.53553-4-alexandru.elisei@arm.com
(cherry picked from commit 46b1878214)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I8185797f4afacc9282f1d7c5762978e4f5186959
2022-08-10 07:29:13 +00:00
Alexandru Elisei
9144aafdb6 UPSTREAM: perf: Fix wrong name in comment for struct perf_cpu_context
Commit 0793a61d4d ("performance counters: core code") added the perf
subsystem (then called Performance Counters) to Linux, creating the struct
perf_cpu_context. The comment for the struct referred to it as a "struct
perf_counter_cpu_context".

Commit cdd6c482c9 ("perf: Do the big rename: Performance Counters ->
Performance Events") changed the comment to refer to a "struct
perf_event_cpu_context", which was still the wrong name for the struct.

Change the comment to say "struct perf_cpu_context".

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-3-alexandru.elisei@arm.com
(cherry picked from commit 2093057ab8)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I37fd3f756d3745635dd34d1adf0a638f4057f706
2022-08-10 07:29:13 +00:00
Marc Zyngier
3469bc4f23 UPSTREAM: KVM: arm64: Do not change the PMU event filter after a VCPU has run
Userspace can specify which events a guest is allowed to use with the
KVM_ARM_VCPU_PMU_V3_FILTER attribute. The list of allowed events can be
identified by a guest from reading the PMCEID{0,1}_EL0 registers.

Changing the PMU event filter after a VCPU has run can cause reads of the
registers performed before the filter is changed to return different values
than reads performed with the new event filter in place. The architecture
defines the two registers as read-only, and this behaviour contradicts
that.

Keep track when the first VCPU has run and deny changes to the PMU event
filter to prevent this from happening.

Signed-off-by: Marc Zyngier <maz@kernel.org>
[ Alexandru E: Added commit message, updated ioctl documentation ]
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-2-alexandru.elisei@arm.com
(cherry picked from commit 5177fe91e4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ia1454460bf1eec029d8263755ddacc632391015f
2022-08-10 07:29:13 +00:00
Keir Fraser
1db69a5abf UPSTREAM: KVM: arm64: pkvm: Implement CONFIG_DEBUG_LIST at EL2
Currently the check functions are stubbed out at EL2. Implement
versions suitable for the constrained EL2 environment.

Signed-off-by: Keir Fraser <keirf@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220131124114.3103337-1-keirf@google.com
(cherry picked from commit 4c68d6c0a1)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: If607de1f70d2d8faf2a1c6bcf368c8f57bd96d63
2022-08-10 07:29:13 +00:00
Oliver Upton
8ef2ec618a UPSTREAM: KVM: arm64: Drop unused param from kvm_psci_version()
kvm_psci_version() consumes a pointer to struct kvm in addition to a
vcpu pointer. Drop the kvm pointer as it is unused. While the comment
suggests the explicit kvm pointer was useful for calling from hyp, there
exist no such callsite in hyp.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220208012705.640444-1-oupton@google.com
(cherry picked from commit dfefa04a90)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I7c28e5d30f917b3684abaea71f653cf333ccf220
2022-08-10 07:29:13 +00:00
Shameer Kolothum
4185198669 UPSTREAM: KVM: arm64: Make active_vmids invalid on vCPU schedule out
Like ASID allocator, we copy the active_vmids into the
reserved_vmids on a rollover. But it's unlikely that
every CPU will have a vCPU as current task and we may
end up unnecessarily reserving the VMID space.

Hence, set active_vmids to an invalid one when scheduling
out a vCPU.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211122121844.867-5-shameerali.kolothum.thodi@huawei.com
(cherry picked from commit 100b4f092f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I4b393fa187a93630ba3c964532caebbaf087b8c7
2022-08-10 07:29:13 +00:00
Julien Grall
0d3cbdb765 UPSTREAM: KVM: arm64: Align the VMID allocation with the arm64 ASID
At the moment, the VMID algorithm will send an SGI to all the
CPUs to force an exit and then broadcast a full TLB flush and
I-Cache invalidation.

This patch uses the new VMID allocator. The benefits are:
   - Aligns with arm64 ASID algorithm.
   - CPUs are not forced to exit at roll-over. Instead,
     the VMID will be marked reserved and context invalidation
     is broadcasted. This will reduce the IPIs traffic.
   - More flexible to add support for pinned KVM VMIDs in
     the future.
   
With the new algo, the code is now adapted:
    - The call to update_vmid() will be done with preemption
      disabled as the new algo requires to store information
      per-CPU.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211122121844.867-4-shameerali.kolothum.thodi@huawei.com
(cherry picked from commit 3248136b36)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3c0023d222add01585c4fe41df9bb7e9a48abab8
2022-08-10 07:29:13 +00:00
Shameer Kolothum
8a856ad6da UPSTREAM: KVM: arm64: Make VMID bits accessible outside of allocator
Since we already set the kvm_arm_vmid_bits in the VMID allocator
init function, make it accessible outside as well so that it can
be used in the subsequent patch.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211122121844.867-3-shameerali.kolothum.thodi@huawei.com
(cherry picked from commit f8051e9609)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ic7c2ce90a07de86aee9c4961809e5420ccd67199
2022-08-10 07:29:13 +00:00
Shameer Kolothum
ec7b6f63d7 UPSTREAM: KVM: arm64: Introduce a new VMID allocator for KVM
A new VMID allocator for arm64 KVM use. This is based on
arm64 ASID allocator algorithm.

One major deviation from the ASID allocator is the way we
flush the context. Unlike ASID allocator, we expect less
frequent rollover in the case of VMIDs. Hence, instead of
marking the CPU as flush_pending and issuing a local context
invalidation on the next context switch, we  broadcast TLB
flush + I-cache invalidation over the inner shareable domain
on rollover.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211122121844.867-2-shameerali.kolothum.thodi@huawei.com
(cherry picked from commit 417838392f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ice9ac0c4d71bc4a425f636297fc18298bce00f37
2022-08-10 07:29:13 +00:00