Commit Graph

1061201 Commits

Author SHA1 Message Date
Marc Zyngier
b179219cfa UPSTREAM: KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers
Align the GICv2 MMIO accesses from userspace with the way the GICv3
code is now structured.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 7e9f723c2a)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ib9b41554ce9fca09d6c49e5fbd019f52abf3793a
2022-09-28 17:16:17 +00:00
Marc Zyngier
120e61668b UPSTREAM: KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers
For userspace accesses to GICv3 MMIO registers (and related data),
vgic_v3_{get,set}_attr are littered with {get,put}_user() calls,
making it hard to audit and reason about.

Consolidate all userspace accesses in vgic_v3_attr_regs_access(),
making the code far simpler to audit.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit e1246f3f2d)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ifb160606efc1202dfda24f7b1984de12bbc44fb8
2022-09-28 17:16:17 +00:00
Marc Zyngier
c62f7a4ab7 UPSTREAM: KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace
Despite the userspace ABI clearly defining the bits dealt with by
KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO as a __u32, the kernel uses a u64.

Use a u32 to match the userspace ABI, which will subsequently lead
to some simplifications.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 38cf0bb762)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I68a4ec9614c7e428666d3cfc9f488cce34ca1d47
2022-09-28 17:16:17 +00:00
Marc Zyngier
bd962474ad UPSTREAM: KVM: arm64: vgic-v3: Convert userspace accessors over to FIELD_GET/FIELD_PREP
The GICv3 userspace accessors are all about dealing with conversion
between fields from architectural registers and internal representations.

However, and owing to the age of this code, the accessors use
a combination of shift/mask that is hard to read. It is nonetheless
easy to make it better by using the FIELD_{GET,PREP} macros that solely
rely on a mask.

This results in somewhat nicer looking code, and is probably easier
to maintain.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 71c3c7753c)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iabf5c8311a8e60ab21609e3e91e869e1b35515ff
2022-09-28 17:16:17 +00:00
Marc Zyngier
81f1a33755 UPSTREAM: KVM: arm64: vgic-v3: Make the userspace accessors use sysreg API
The vgic-v3 sysreg accessors have been ignored as the rest of the
sysreg internal API was evolving, and are stuck with the .access
method (which is normally reserved to the guest's own access)
for the userspace accesses (which should use the .set/.get_user()
methods).

Catch up with the program and repaint all the accessors so that
they fit into the normal userspace model, and plug the result into
the helpers that have been introduced earlier.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit cbcf14dd23)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I05cc1efd4024483c38d4abceddd9facf33972c2d
2022-09-28 17:16:17 +00:00
Marc Zyngier
aaaa4feb08 UPSTREAM: KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess()
In order to start making the vgic sysreg access from userspace
similar to all the other sysregs, push the userspace memory
access one level down into vgic_v3_cpu_sysregs_uaccess().

The next step will be to rely on the sysreg infrastructure
to perform this task.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit db25081e14)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Icbe79690a6fa396331db126c0e3573da0ccfe563
2022-09-28 17:16:17 +00:00
Marc Zyngier
29e2ef9013 UPSTREAM: KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr()
Finding out whether a sysreg exists has little to do with that
register being accessed, so drop the is_write parameter.

Also, the reg pointer is completely unused, and we're better off
just passing the attr pointer to the function.

This result in a small cleanup of the calling site, with a new
helper converting the vGIC view of a sysreg into the canonical
one (this is purely cosmetic, as the encoding is the same).

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit b61fc0857a)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I2c47fc989c226caedd39b57930dc1dba12a2a9d3
2022-09-28 17:16:17 +00:00
Marc Zyngier
8556a326d0 UPSTREAM: KVM: arm64: Get rid of reg_from/to_user()
These helpers are only used by the invariant stuff now, and while
they pretend to support non-64bit registers, this only serves as
a way to scare the casual reviewer...

Replace these helpers with our good friends get/put_user(), and
don't look back.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 5a420ed964)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I4ae31fcabb90f3fd884a40bd8afff4b9a34d3709
2022-09-28 17:16:17 +00:00
Marc Zyngier
1e10e1a112 UPSTREAM: KVM: arm64: Consolidate sysreg userspace accesses
Until now, the .set_user and .get_user callbacks have to implement
(directly or not) the userspace memory accesses. Although this gives
us maximem flexibility, this is also a maintenance burden, making it
hard to audit, and I'd feel much better if it was all located in
a single place.

So let's do just that, simplifying most of the function signatures
in the process (the callbacks are now only concerned with the
data itself, and not with userspace).

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 978ceeb3e4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I934fb851ea38cb29575745b24148e93682af6a3b
2022-09-28 17:16:17 +00:00
Marc Zyngier
28d86f0619 UPSTREAM: KVM: arm64: Rely on index_to_param() for size checks on userspace access
index_to_param() already checks that we use 64bit accesses for all
registers accessed from userspace.

However, we have extra checks in other places (such as index_to_params),
which is pretty confusing. Get rid off these redundant checks.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit e48407ff97)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ie86050cd2eed3e0ca65c8604df87ff632fce44cf
2022-09-28 17:16:17 +00:00
Marc Zyngier
b13d4b4ac0 UPSTREAM: KVM: arm64: Introduce generic get_user/set_user helpers for system registers
The userspace access to the system registers is done using helpers
that hardcode the table that is looked up. extract some generic
helpers from this, moving the handling of hidden sysregs into
the core code.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit ba23aec9f4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I67f89790d2f2eb2917186756e13fb2b21fc1faa2
2022-09-28 17:16:17 +00:00
Marc Zyngier
53d3c57fcb UPSTREAM: KVM: arm64: Reorder handling of invariant sysregs from userspace
In order to allow some further refactor of the sysreg helpers,
move the handling of invariant sysreg to occur before we handle
all the other ones.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 1deeffb559)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I02a4bcaabc6231de96a7f1dbda1d9fdcc094a8ce
2022-09-28 17:16:17 +00:00
Marc Zyngier
b6dbdd5d32 UPSTREAM: KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper
find_reg_by_id() requires a sys_reg_param as input, which most
users provide as a on-stack variable, but don't make any use of
the result.

Provide a helper that doesn't have this requirement and simplify
the callers (all but one).

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit da8d120fba)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I34cc2371664523ff4f40d309208088b31de258d6
2022-09-28 17:16:17 +00:00
Kalesh Singh
74ff699967 UPSTREAM: KVM: arm64: Fix hypervisor address symbolization
With CONFIG_RANDOMIZE_BASE=y vmlinux addresses will resolve incorrectly
from kallsyms. Fix this by adding the KASLR offset before printing the
symbols.

Fixes: 6ccf9cb557 ("KVM: arm64: Symbolize the nVHE HYP addresses")
Reported-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220715235824.2549012-1-kaleshsingh@google.com
(cherry picked from commit ed6313a93f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I1b93e2c97ad1927b1dcba2a4bc43637f9427c41f
2022-09-28 17:16:17 +00:00
Masahiro Yamada
f8262b6564 UPSTREAM: KVM: arm64: nvhe: Add intermediates to 'targets' instead of extra-y
These are generated on demand. Adding them to 'targets' is enough.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220613092026.1705630-2-masahiroy@kernel.org
(cherry picked from commit 40c56bd8e1)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I30ca3ba84475b3d3fcef0599f7a9fb7ceb5afe93
2022-09-28 17:16:17 +00:00
Masahiro Yamada
bc4f57213d UPSTREAM: KVM: arm64: nvhe: Rename confusing obj-y
This Makefile appends several objects to obj-y from line 15, but none
of them is linked to vmlinux in an ordinary way.

obj-y is overwritten at line 30:

  obj-y := kvm_nvhe.o

So, kvm_nvhe.o is the only object directly linked to vmlinux.

Replace the abused obj-y with hyp-obj-y.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220613092026.1705630-1-masahiroy@kernel.org
(cherry picked from commit 3d5697f95e)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ie295aa59ac7d679e2dd4f53cc721de7493629b8f
2022-09-28 17:16:17 +00:00
Marc Zyngier
b80bb78b43 UPSTREAM: KVM: arm64: Move the handling of !FP outside of the fast path
We currently start by assuming that the host owns the FP unit
at load time, then check again whether this is the case as
we are about to run. Only at this point do we account for the
fact that there is a (vanishingly small) chance that we're running
on a system without a FPSIMD unit (yes, this is madness).

We can actually move this FPSIMD check as early as load-time,
and drop the check at run time.

No intended change in behaviour.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit b4da91879e)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ibbf22b88983a7f13a0a0be711f5d1c36b3cbbf62
2022-09-28 17:16:17 +00:00
Marc Zyngier
e6da869409 UPSTREAM: KVM: arm64: Document why pause cannot be turned into a flag
It would be tempting to turn the 'pause' state into a flag.

However, this cannot easily be done as it is updated out of context,
while all the flags expect to only be updated from the vcpu thread.
Turning it into a flag would require to make all flag updates
atomic, which isn't necessary desireable.

Document this, and take this opportunity to move the field next
to the flag sets, filling a hole in the vcpu structure.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 0fa4a3137e)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Icbe091a7af25cb3db7acab281e830cd1472b3821
2022-09-28 17:16:17 +00:00
Marc Zyngier
8640b09352 UPSTREAM: KVM: arm64: Reduce the size of the vcpu flag members
Now that we can detect flags overflowing their container, reduce
the size of all flag set members in the vcpu struct, turning them
into 8bit quantities.

Even with the FP state enum occupying 32bit, the whole of the state
that was represented by flags is smaller by one byte. Profit!

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 54ddda919c)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3ccd004f5507a72229d4afefbac98abc486d845c
2022-09-28 17:16:17 +00:00
Marc Zyngier
e46c3a4c3d UPSTREAM: KVM: arm64: Add build-time sanity checks for flags
Flags are great, but flags can also be dangerous: it is easy
to encode a flag that is bigger than its container (unless the
container is a u64), and it is easy to construct a flag value
that doesn't fit in the mask that is associated with it.

Add a couple of build-time sanity checks that ensure we catch
these two cases.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 5a3984f4ec)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Id8a9f9f07192b69527ac74ddf0815a869cece648
2022-09-28 17:16:17 +00:00
Marc Zyngier
1f77784474 UPSTREAM: KVM: arm64: Warn when PENDING_EXCEPTION and INCREMENT_PC are set together
We really don't want PENDING_EXCEPTION and INCREMENT_PC to ever be
set at the same time, as they are mutually exclusive. Add checks
that will generate a warning should this ever happen.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit e19f2c6cd1)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I20efdbeafc1b641f7827f5b3d24870f3e1aff4fa
2022-09-28 17:16:17 +00:00
Marc Zyngier
5859f30226 UPSTREAM: KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag
The aptly named boolean 'sysregs_loaded_on_cpu' tracks whether
some of the vcpu system registers are resident on the physical
CPU when running in VHE mode.

This is obviously a flag in hidding, so let's convert it to
a state flag, since this is solely a host concern (the hypervisor
itself always knows which state we're in).

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 30b6ab45f8)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ia08beccb869e89a3cce0f46d4a08b3e1ee474eab
2022-09-28 17:16:17 +00:00
Marc Zyngier
a83885ad91 UPSTREAM: KVM: arm64: Kill unused vcpu flags field
Horray, we have now sorted all the preexisting flags, and the
'flags' field is now unused. Get rid of it while nobody is
looking.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 781e3ae148)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iad0259759a02f3823d88f0dfdbb30d9f87b2113e
2022-09-28 17:16:17 +00:00
Marc Zyngier
0a8b715d20 BACKPORT: KVM: arm64: Move vcpu WFIT flag to the state flag set
The host kernel uses the WFIT flag to remember that a vcpu has used
this instruction and wake it up as required. Move it to the state
set, as nothing in the hypervisor uses this information.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit eebc538d8e)
[willdeacon@: Use kvm_vcpu_block() instead of kvm_vcpu_halt() in context]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iccd6cca64cfb3920be878fa883255e3d6ba2a01c
2022-09-28 17:16:17 +00:00
Marc Zyngier
c2bd465ec9 UPSTREAM: KVM: arm64: Move vcpu ON_UNSUPPORTED_CPU flag to the state flag set
The ON_UNSUPPORTED_CPU flag is only there to track the sad fact
that we have ended-up on a CPU where we cannot really run.

Since this is only for the host kernel's use, move it to the state
set.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit aff3ccd732)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ida322ae52e86bbf433641cae809ef70b256e539e
2022-09-28 17:16:17 +00:00
Marc Zyngier
9efbdc1238 UPSTREAM: KVM: arm64: Move vcpu SVE/SME flags to the state flag set
The two HOST_{SVE,SME}_ENABLED are only used for the host kernel
to track its own state across a vcpu run so that it can be fully
restored.

Move these flags to the so called state set.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 0affa37fcd)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Id09e1ee6ddb524f8cbcde73c76f89977ad421e7d
2022-09-28 17:16:17 +00:00
Marc Zyngier
8a5435ac36 UPSTREAM: KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set
The three debug flags (which deal with the debug registers, SPE and
TRBE) all are input flags to the hypervisor code.

Move them into the input set and convert them to the new accessors.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit b1da49088a)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iefae8aeddfbe62070848305ebce44a9120beda6e
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
b922edc361 BACKPORT: arm64: Copy the task argument to unwind_state
Copy the task argument passed to arch_stack_walk() to unwind_state so that
it can be passed to unwind functions via unwind_state rather than as a
separate argument. The task is a fundamental part of the unwind state.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220617180219.20352-3-madvenka@linux.microsoft.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 82a592c13b)
[willdeacon@: Resolve context conflict with kretprobes field in
 'struct unwind_state']
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Icf29c7da8111dd5f7a25d99e6277a99c78e1ec23
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
e12793be98 BACKPORT: arm64: Split unwind_init()
unwind_init() is currently a single function that initializes all of the
unwind state. Split it into the following functions and call them
appropriately:

	- unwind_init_from_regs() - initialize from regs passed by caller.

	- unwind_init_from_caller() - initialize for the current task
	  from the caller of arch_stack_walk().

	- unwind_init_from_task() - initialize from the saved state of a
	  task other than the current task. In this case, the other
	  task must not be running.

This is done for two reasons:

	- the different ways of initializing are clear

	- specialized code can be added to each initializer in the future.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220617180219.20352-2-madvenka@linux.microsoft.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit a019d8a2cc)
[willdeacon@: Resolve context conflict with kretprobes code in unwind_init()]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I35f738d9cd8ab5f6fa333ac451a32f2486734da4
2022-09-28 17:16:17 +00:00
Andrey Konovalov
9b73131102 UPSTREAM: arm64: stacktrace: use non-atomic __set_bit
Use the non-atomic version of set_bit() in arch/arm64/kernel/stacktrace.c,
as there is no concurrent accesses to frame->prev_type.

This speeds up stack trace collection and improves the boot time of
Generic KASAN by 2-5%.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://lore.kernel.org/r/23dfa36d1cc91e4a1059945b7834eac22fb9854d.1653317461.git.andreyknvl@google.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 446297b28a)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ib0b48a8d84de906ddf2d74e9b3f4ebcf933a6810
2022-09-28 17:16:17 +00:00
Andrey Konovalov
4a2137cca8 UPSTREAM: arm64: kasan: do not instrument stacktrace.c
Disable KASAN instrumentation of arch/arm64/kernel/stacktrace.c.

This speeds up Generic KASAN by 5-20%.

As a side-effect, KASAN is now unable to detect bugs in the stack trace
collection code. This is taken as an acceptable downside.

Also replace READ_ONCE_NOCHECK() with READ_ONCE() in stacktrace.c.
As the file is now not instrumented, there is no need to use the
NOCHECK version of READ_ONCE().

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://lore.kernel.org/r/c4c944a2a905e949760fbeb29258185087171708.1653317461.git.andreyknvl@google.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 802b91118d)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I374277671bb324b494fcf661851aeb5a8a1769c4
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
845562b6c6 UPSTREAM: arm64: stacktrace: align with common naming
For historical reasons, the naming of parameters and their types in the
arm64 stacktrace code differs from that used in generic code and other
architectures, even though the types are equivalent.

For consistency and clarity, use the generic names.

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit bd5552bc48)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I1924f6b77bbe7f8fff9ee145d41364f4cb313948
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
06a99eb858 BACKPORT: arm64: stacktrace: rename stackframe to unwind_state
Rename "struct stackframe" to "struct unwind_state" for consistency and
better naming. Accordingly, rename variable/argument "frame" to "state".

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit e9d75a0ba8)
[willdeacon@: Drop references to missing kretprobes fields in
 'struct unwind_state']
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ife3a1e3661d19b35889cbd4575feb0f1e0e484f9
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
91f07e16aa UPSTREAM: arm64: stacktrace: rename unwinder functions
Rename unwinder functions for consistency and better naming.

	- Rename start_backtrace() to unwind_init().
	- Rename unwind_frame() to unwind_next().
	- Rename walk_stackframe() to unwind().

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c797bd4548)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iaf71fc995a325030639cd70fb9020f1992b0a213
2022-09-28 17:16:17 +00:00
Mark Rutland
3a441566c2 BACKPORT: arm64: stacktrace: make struct stackframe private to stacktrace.c
Now that arm64 uses arch_stack_walk() consistently, struct stackframe is
only used within stacktrace.c. To make it easier to read and maintain
this code, it would be nicer if the definition were there too.

Move the definition into stacktrace.c.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviwed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 96bb1530c4)
[willdeacon@: Drop missing kretprobes field from new definition of
 'struct stackframe']
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ieccd4dbee5d46a68ecfed5b4f9748abe070699e4
2022-09-28 17:16:17 +00:00
Mark Rutland
eb9cda07d8 UPSTREAM: arm64: stacktrace: delete PCS comment
The comment at the top of stacktrace.c isn't all that helpful, as it's
not associated with the code which inspects the frame record, and the
code example isn't representative of common code generation today.

Delete it.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit cb86a41b35)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I0988de7d7f019ceb1882c3aae074e92b3d9f81a5
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
f2536e51e4 UPSTREAM: arm64: stacktrace: remove NULL task check from unwind_frame()
Currently, there is a check for a NULL task in unwind_frame(). It is not
needed since all current callers pass a non-NULL task.

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 4f6277e8ac)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Icbebe5e7a79047a9db9104391277f063df1d616a
2022-09-28 17:16:17 +00:00
Mark Rutland
1b97dc8fd6 BACKPORT: arm64: Make some stacktrace functions private
Now that open-coded stack unwinds have been converted to
arch_stack_walk(), we no longer need to expose any of unwind_frame(),
walk_stackframe(), or start_backtrace() outside of stacktrace.c.

Make those functions private to stacktrace.c, removing their prototypes
from <asm/stacktrace.h> and marking them static.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Link: https://lore.kernel.org/r/20211129142849.3056714-10-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit d2d1d2645c)
[willdeacon@: Retain 'notrace' annotation on start_backtrace() from LTS merge]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3e3288fed5060dd457ecbea2ccad15af105a1224
2022-09-28 17:16:17 +00:00
Will Deacon
2ef3336f08 Revert "ANDROID: arm64: stacktrace: export start_backtrace symbol"
This reverts commit b7ca6bc390.

Upstream has made this symbol static and subsequently reworked the
unwinder logic, so stop exporting start_backtrace() so that we can
backport the rest of the upstream changes.

Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3d64e5ceb2da41db1a47a5f78a412a22e5d75f12
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
41928e303c UPSTREAM: arm64: Make dump_backtrace() use arch_stack_walk()
To enable RELIABLE_STACKTRACE and LIVEPATCH on arm64, we need to
substantially rework arm64's unwinding code. As part of this, we want to
minimize the set of unwind interfaces we expose, and avoid open-coding
of unwind logic.

Currently, dump_backtrace() walks the stack of the current task or a
blocked task by calling stact_backtrace() and iterating unwind steps
using unwind_frame(). This can be written more simply in terms of
arch_stack_walk(), considering three distinct cases:

1) When unwinding a blocked task, start_backtrace() is called with the
   blocked task's saved PC and FP, and the unwind proceeds immediately
   from this point without skipping any entries. This is functionally
   equivalent to calling arch_stack_walk() with the blocked task, which
   will start with the task's saved PC and FP.

   There is no functional change to this case.

2) When unwinding the current task without regs, start_backtrace() is
   called with dump_backtrace() as the PC and __builtin_frame_address(0)
   as the next frame, and the unwind proceeds immediately without
   skipping. This is *almost* functionally equivalent to calling
   arch_stack_walk() for the current task, which will start with its
   caller (i.e. an offset into dump_backtrace()) as the PC, and the
   callers frame record as the next frame.

   The only difference being that dump_backtrace() will be reported with
   an offset (which is strictly more correct than currently). Otherwise
   there is no functional cahnge to this case.

3) When unwinding the current task with regs, start_backtrace() is
   called with dump_backtrace() as the PC and __builtin_frame_address(0)
   as the next frame, and the unwind is performed silently until the
   next frame is the frame pointed to by regs->fp. Reporting starts
   from regs->pc and continues from the frame in regs->fp.

   Historically, this pre-unwind was necessary to correctly record
   return addresses rewritten by the ftrace graph calller, but this is
   no longer necessary as these are now recovered using the FP since
   commit:

   c6d3cd32fd ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")

   This pre-unwind is not necessary to recover return addresses
   rewritten by kretprobes, which historically were not recovered, and
   are now recovered using the FP since commit:

   cd9bc2c925 ("arm64: Recover kretprobe modified return address in stacktrace")

   Thus, this is functionally equivalent to calling arch_stack_walk()
   with the current task and regs, which will start with regs->pc as the
   PC and regs->fp as the next frame, without a pre-unwind.

This patch makes dump_backtrace() use arch_stack_walk(). This simplifies
dump_backtrace() and will permit subsequent changes to the unwind code.

Aside from the improved reporting when unwinding current without regs,
there should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
[Mark: elaborate commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211129142849.3056714-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 2dad6dc17b)
[willdeacon@: NOTE: We don't have cd9bc2c925 and backporting its
 dependencies is not viable so kretprobe unwinding remains broken even
 after this patch]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I04ca152601bd64f0e46a606af2aa5b35c86b5bf3
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
8d3d9885b9 UPSTREAM: arm64: Make profile_pc() use arch_stack_walk()
To enable RELIABLE_STACKTRACE and LIVEPATCH on arm64, we need to
substantially rework arm64's unwinding code. As part of this, we want to
minimize the set of unwind interfaces we expose, and avoid open-coding
of unwind logic outside of stacktrace.c.

Currently profile_pc() walks the stack of an interrupted context by
calling start_backtrace() with the context's PC and FP, and iterating
unwind steps using walk_stackframe(). This is functionally equivalent to
calling arch_stack_walk() with the interrupted context's pt_regs, which
will start with the PC and FP from the regs.

Make profile_pc() use arch_stack_walk(). This simplifies profile_pc(),
and in future will alow us to make walk_stackframe() private to
stacktrace.c.

At the same time, we remove the early return for when regs->pc is not in
lock functions, as this will be handled by the first call to the
profile_pc_cb() callback.

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
[Mark: remove early return, elaborate commit message, fix includes]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211129142849.3056714-8-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 22ecd975b6)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I65bdcde177048003f443c9280520bd1924b09491
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
e17fea30d6 UPSTREAM: arm64: Make return_address() use arch_stack_walk()
To enable RELIABLE_STACKTRACE and LIVEPATCH on arm64, we need to
substantially rework arm64's unwinding code. As part of this, we want to
minimize the set of unwind interfaces we expose, and avoid open-coding
of unwind logic outside of stacktrace.c.

Currently return_address() walks the stack of the current task by
calling start_backtrace() with return_address as the PC and the frame
pointer of return_address() as the next frame, iterating unwind steps
using walk_stackframe(). This is functionally equivalent to calling
arch_stack_walk() for the current stack, which will start from its
caller (i.e. return_address()) as the PC and it's caller's frame record
as the next frame.

Make return_address() use arch_stackwalk(). This simplifies
return_address(), and in future will alow us to make walk_stackframe()
private to stacktrace.c.

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
[Mark: elaborate commit message, fix includes]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20211129142849.3056714-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 39ef362d2d)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I2d5d45c0d583a662910018c12035effbdc7cb6b9
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
6c43c4bbf6 BACKPORT: arm64: Make __get_wchan() use arch_stack_walk()
To enable RELIABLE_STACKTRACE and LIVEPATCH on arm64, we need to
substantially rework arm64's unwinding code. As part of this, we want to
minimize the set of unwind interfaces we expose, and avoid open-coding
of unwind logic outside of stacktrace.c.

Currently, __get_wchan() walks the stack of a blocked task by calling
start_backtrace() with the task's saved PC and FP values, and iterating
unwind steps using unwind_frame(). The initialization is functionally
equivalent to calling arch_stack_walk() with the blocked task, which
will start with the task's saved PC and FP values.

Currently __get_wchan() always performs an initial unwind step, which
will stkip __switch_to(), but as this is now marked as a __sched
function, this no longer needs special handling and will be skipped in
the same way as other sched functions.

Make __get_wchan() use arch_stack_walk(). This simplifies __get_wchan(),
and in future will alow us to make unwind_frame() private to
stacktrace.c. At the same time, we can simplify the try_get_task_stack()
check and avoid the unnecessary `stack_page` variable.

The change to the skipping logic means we may terminate one frame
earlier than previously where there are an excessive number of sched
functions in the trace, but this isn't seen in practice, and wchan is
best-effort anyway, so this should not be a problem.

Other than the above, there should be no functional change as a result
of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
[Mark: rebase atop wchan changes, elaborate commit message, fix includes]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211129142849.3056714-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 4f62bb7cb1)
[willdeacon@: Fix context conflict with task checks in get_wchan()]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iad86e4fcf28e88df2cca65767f984d351b34d3ed
2022-09-28 17:16:17 +00:00
Madhavan T. Venkataraman
9592359da1 BACKPORT: arm64: Make perf_callchain_kernel() use arch_stack_walk()
To enable RELIABLE_STACKTRACE and LIVEPATCH on arm64, we need to
substantially rework arm64's unwinding code. As part of this, we want to
minimize the set of unwind interfaces we expose, and avoid open-coding
of unwind logic outside of stacktrace.c.

Currently perf_callchain_kernel() walks the stack of an interrupted
context by calling start_backtrace() with the context's PC and FP, and
iterating unwind steps using walk_stackframe(). This is functionally
equivalent to calling arch_stack_walk() with the interrupted context's
pt_regs, which will start with the PC and FP from the regs.

Make perf_callchain_kernel() use arch_stack_walk(). This simplifies
perf_callchain_kernel(), and in future will alow us to make
walk_stackframe() private to stacktrace.c.

At the same time, we update the callchain_trace() callback to check the
return value of perf_callchain_store(), which indicates whether there is
space for any further entries. When a non-zero value is returned,
further calls will be ignored, and are redundant, so we can stop the
unwind at this point.

We also remove the stale and confusing comment for callchain_trace.

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
[Mark: elaborate commit message, remove comment, fix includes]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20211129142849.3056714-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit ed876d35a1)
[willdeacon@: Fix context conflict with 'perf_guest_cbs' in perf_callchain_kernel()]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ifb2003732409684f02fcf5fc221c26df57295ed2
2022-09-28 17:16:17 +00:00
Mark Rutland
924040f935 UPSTREAM: arm64: Mark __switch_to() as __sched
Unlike most architectures (and only in keeping with powerpc), arm64 has
a non __sched() function on the path to our cpu_switch_to() assembly
function.

It is expected that for a blocked task, in_sched_functions() can be used
to skip all functions between the raw context switch assembly and the
scheduler functions that call into __switch_to(). This is the behaviour
expected by stack_trace_consume_entry_nosched(), and the behaviour we'd
like to have such that we an simplify arm64's __get_wchan()
implementation to use arch_stack_walk().

This patch mark's arm64's __switch_to as __sched. This *will not* change
the behaviour of arm64's current __get_wchan() implementation, which
always performs an initial unwind step which skips __switch_to(). This
*will* change the behaviour of stack_trace_consume_entry_nosched() and
stack_trace_save_tsk() to match their expected behaviour on blocked
tasks, skipping all scheduler-internal functions including
__switch_to().

Other than the above, there should be no functional change as a result
of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211129142849.3056714-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 86bcbafcb7)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I518cfec11d8f661c9fa0bbe23bec4e465599bfe8
2022-09-28 17:16:17 +00:00
Peter Zijlstra
e377341375 UPSTREAM: arch: Make ARCH_STACKWALK independent of STACKTRACE
Make arch_stack_walk() available for ARCH_STACKWALK architectures
without it being entangled in STACKTRACE.

Link: https://lore.kernel.org/lkml/20211022152104.356586621@infradead.org/
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[Mark: rebase, drop unnecessary arm change]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20211129142849.3056714-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 1614b2b11f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I00643b1dada2a20113cb0632f195fd2758726e3a
2022-09-28 17:16:17 +00:00
Mark Rutland
68af9a9d7b BACKPORT: arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
When CONFIG_FUNCTION_GRAPH_TRACER is selected and the function graph
tracer is in use, unwind_frame() may erroneously associate a traced
function with an incorrect return address. This can happen when starting
an unwind from a pt_regs, or when unwinding across an exception
boundary.

This can be seen when recording with perf while the function graph
tracer is in use. For example:

| # echo function_graph > /sys/kernel/debug/tracing/current_tracer
| # perf record -g -e raw_syscalls:sys_enter:k /bin/true
| # perf report

... reports the callchain erroneously as:

| el0t_64_sync
| el0t_64_sync_handler
| el0_svc_common.constprop.0
| perf_callchain
| get_perf_callchain
| syscall_trace_enter
| syscall_trace_enter

... whereas when the function graph tracer is not in use, it reports:

| el0t_64_sync
| el0t_64_sync_handler
| el0_svc
| do_el0_svc
| el0_svc_common.constprop.0
| syscall_trace_enter
| syscall_trace_enter

The underlying problem is that ftrace_graph_get_ret_stack() takes an
index offset from the most recent entry added to the fgraph return
stack. We start an unwind at offset 0, and increment the offset each
time we encounter a rewritten return address (i.e. when we see
`return_to_handler`). This is broken in two cases:

1) Between creating a pt_regs and starting the unwind, function calls
   may place entries on the stack, leaving an arbitrary offset which we
   can only determine by performing a full unwind from the caller of the
   unwind code (and relying on none of the unwind code being
   instrumented).

   This can result in erroneous entries being reported in a backtrace
   recorded by perf or kfence when the function graph tracer is in use.
   Currently show_regs() is unaffected as dump_backtrace() performs an
   initial unwind.

2) When unwinding across an exception boundary (whether continuing an
   unwind or starting a new unwind from regs), we currently always skip
   the LR of the interrupted context. Where this was live and contained
   a rewritten address, we won't consume the corresponding fgraph ret
   stack entry, leaving subsequent entries off-by-one.

   This can result in erroneous entries being reported in a backtrace
   performed by any in-kernel unwinder when that backtrace crosses an
   exception boundary, with entries after the boundary being reported
   incorrectly. This includes perf, kfence, show_regs(), panic(), etc.

To fix this, we need to be able to uniquely identify each rewritten
return address such that we can map this back to the original return
address. We can use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR to associate
each rewritten return address with a unique location on the stack. As
the return address is passed in the LR (and so is not guaranteed a
unique location in memory), we use the FP upon entry to the function
(i.e. the address of the caller's frame record) as the return address
pointer. Any nested call will have a different FP value as the caller
must create its own frame record and update FP to point to this.

Since ftrace_graph_ret_addr() requires the return address with the PAC
stripped, the stripping of the PAC is moved before the fixup of the
rewritten address. As we would unconditionally strip the PAC, moving
this earlier is not harmful, and we can avoid a redundant strip in the
return address fixup code.

I've tested this with the perf case above, the ftrace selftests, and
a number of ad-hoc unwinder tests. The tests all pass, and I have seen
no unexpected behaviour as a result of this change. I've tested with
pointer authentication under QEMU TCG where magic-sysrq+l correctly
recovers the original return addresses.

Note that this doesn't fix the issue of skipping a live LR at an
exception boundary, which is a more general problem and requires more
substantial rework. Were we to consume the LR in all cases this would
result in warnings where the interrupted context's LR contains
`return_to_handler`, but the FP has been altered, e.g.

| func:
|	<--- ftrace entry ---> 	// logs FP & LR, rewrites LR
| 	STP	FP, LR, [SP, #-16]!
| 	MOV	FP, SP
| 	<--- INTERRUPT --->

... as ftrace_graph_get_ret_stack() fill not find a matching entry,
triggering the WARN_ON_ONCE() in unwind_frame().

Link: https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local
Link: https://lore.kernel.org/r/20211027132529.30027-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211029162245.39761-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit c6d3cd32fd)
[willdeacon@: Fixed conflicts due to lack of kretprobes unwinder support]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I70564a777f1424e6906be646ed09a1e4bb502e2e
2022-09-28 17:16:17 +00:00
Marc Zyngier
5ac0ddd7df UPSTREAM: KVM: arm64: Move vcpu PC/Exception flags to the input flag set
The PC update flags (which also deal with exception injection)
is one of the most complicated use of the flag we have. Make it
more fool prof by:

- moving it over to the new accessors and assign it to the
  input flag set

- turn the combination of generic ELx flags with another flag
  indicating the target EL itself into an explicit set of
  flags for each EL and vector combination

- add a new accessor to pend the exception

This is otherwise a pretty straightformward conversion.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 699bb2e0c6)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ida64b52f942e8c174c56965acad7f07e63318b3a
2022-09-28 17:16:17 +00:00
Marc Zyngier
53977bee59 UPSTREAM: KVM: arm64: Move vcpu configuration flags into their own set
The KVM_ARM64_{GUEST_HAS_SVE,VCPU_SVE_FINALIZED,GUEST_HAS_PTRAUTH}
flags are purely configuration flags. Once set, they are never cleared,
but evaluated all over the code base.

Move these three flags into the configuration set in one go, using
the new accessors, and take this opportunity to drop the KVM_ARM64_
prefix which doesn't provide any help.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 4c0680d394)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ic4ac6ee4b2791e58eb1a2b9b00e4b2fa15e00f8f
2022-09-28 17:16:17 +00:00
Marc Zyngier
8ed1b808db UPSTREAM: KVM: arm64: Add three sets of flags to the vcpu state
It so appears that each of the vcpu flags is really belonging to
one of three categories:

- a configuration flag, set once and for all
- an input flag generated by the kernel for the hypervisor to use
- a state flag that is only for the kernel's own bookkeeping

As we are going to split all the existing flags into these three
sets, introduce all three in one go.

No functional change other than a bit of bloat...

Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 690bacb83b)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I09747e3eb548f4f28383765126415759631b208e
2022-09-28 17:16:17 +00:00