Commit Graph

986173 Commits

Author SHA1 Message Date
Quentin Perret
01e57334c8 ANDROID: KVM: arm64: pkvm: Prevent the donation of no-map pages
Memory regions marked as no-map in DT routinely include TrustZone
carevouts and such. Although donating such pages to a guest may not
breach confidentiality, it may be used to corrupt its state in
uncontrollable ways. To prevent this, let's block host-initiated memory
transitions targeting no-map pages altogether in nVHE protected mode as
there should be no valid reason to do this currently.

Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list
of memblock regions, hence allowing to check for the presence of the
MEMBLOCK_NOMAP flag on any given region at EL2 easily.

Bug: 205819752
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: Id346d011ff02c2e92627481dfebc8d39334931c4
2022-01-28 14:15:25 +00:00
Fuad Tabba
55ee32da5e ANDROID: KVM: arm64: Don't remove shadow table entry twice on teardown
__pkvm_teardown_shadow() already removes the shadow table entry for
the VM being torn down while the shadow_lock is held, so drop the
additional removal and avoid corrupting 'num_shadow_entries' in the
process.

Fixes: 9c864eab57 ("ANDROID: KVM: arm64: Refcount shadow structs on vcpu_{load/put}()")
Signed-off-by: Fuad Tabba <tabba@google.com>
Bug: 209580772
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ie09ecde8c9b6f14ea1ee9d27fba62b713c21e484
2022-01-27 21:42:42 +00:00
David Brazdil
95602b6834 Revert "Revert "ANDROID: GKI: update virtual device symbol list""
Add symbols needed by the newly added open-dice.ko.

This reverts commit 870681ecd3.

Bug: 198197082
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I7dadbd95e23f77e8d50ff6c5d9e3e8ff746bef01
2022-01-27 15:41:28 +00:00
David Brazdil
51fe598af9 BACKPORT: FROMLIST: misc: open-dice: Add driver to expose DICE data to userspace
Open Profile for DICE is an open protocol for measured boot compatible
with the Trusted Computing Group's Device Identifier Composition
Engine (DICE) specification. The generated Compound Device Identifier
(CDI) certificates represent the hardware/software combination measured
by DICE, and can be used for remote attestation and sealing.

Add a driver that exposes reserved memory regions populated by firmware
with DICE CDIs and exposes them to userspace via a character device.

Userspace obtains the memory region's size from read() and calls mmap()
to create a mapping of the memory region in its address space. The
mapping is not allowed to be write+shared, giving userspace a guarantee
that the data were not overwritten by another process.

Userspace can also call write(), which triggers a wipe of the DICE data
by the driver. Because both the kernel and userspace mappings use
write-combine semantics, all clients observe the memory as zeroed after
the syscall has returned.

Acked-by: Rob Herring <robh@kernel.org>
Cc: Andrew Scull <ascull@google.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: David Brazdil <dbrazdil@google.com>
Link: https://lore.kernel.org/r/20220126231237.529308-3-dbrazdil@google.com
[dbrazdil@: Fixed context conflicts in reserved_mem_matches[] and Makefile]
Bug: 198197082
Change-Id: Iabd65f4d20036bb452e4103c7722f220c2273c81
2022-01-27 15:41:28 +00:00
David Brazdil
d2ee879f40 FROMLIST: dt-bindings: reserved-memory: Open Profile for DICE
Add DeviceTree bindings for Open Profile for DICE, an open protocol for
measured boot. Firmware uses DICE to measure the hardware/software
combination and generates Compound Device Identifier (CDI) certificates.
These are stored in memory and the buffer is described in the DT as
a reserved memory region compatible with 'google,open-dice'.

'no-map' is required to ensure the memory region is never treated by
the kernel as system memory.

Bug: 198197082
Signed-off-by: David Brazdil <dbrazdil@google.com>
Link: https://lore.kernel.org/r/20220126231237.529308-2-dbrazdil@google.com
Change-Id: I6da35b90ca2f519408f96edf3999ffb257cf72e8
2022-01-27 15:41:28 +00:00
David Brazdil
b2bdc0a19c Revert "BACKPORT: FROMLIST: misc: open-dice: Add driver to expose DICE data to userspace"
This reverts commit d1109f05c3.
It will be replaced with the latest patch set version from upstream.

Bug: 198197082
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I661f1bc06d336e5eaab9e52affeb273a0ad6fc2f
2022-01-27 15:41:28 +00:00
David Brazdil
112e6b4f25 Revert "FROMLIST: dt-bindings: reserved-memory: Open Profile for DICE"
This reverts commit 3d914125b2.
It will be replaced with the latest patch set version from upstream.

Bug: 198197082
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I8d9bc9dd059d316e5bc7ffbd0bb9a8a61fa770eb
2022-01-27 15:41:28 +00:00
Robin Peng
7edc8bc69d ANDROID: Update the ABI symbol list
Update the generic symbol list.

Bug: 211546634
Signed-off-by: Robin Peng <robinpeng@google.com>
Change-Id: I20511d8a34575ad02371ea66acc39f417bfb647e
2022-01-27 11:39:17 +00:00
Quentin Perret
c995d4a63c ANDROID: ABI: Update the generic symbol list
Bug: 207662659
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I6c910d855efde4be1a26ff064bcdc94b07bd668e
2022-01-27 11:28:34 +00:00
Wesley Cheng
d44db9ac8a UPSTREAM: usb: gadget: f_serial: Ensure gserial disconnected during unbind
Some UDCs may return an error during pullup disable as part of the
unbind path for a USB configuration.  This will lead to a scenario
where the disable() callback is skipped, whereas the unbind() still
occurs.  If this happens, the u_serial driver will continue to fail
subsequent binds, due to an already existing entry in the ports array.
Ensure that gserial_disconnect() is called during the f_serial unbind,
so the ports entry is properly cleared.

Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Link: https://lore.kernel.org/r/20220111064850.24311-1-quic_wcheng@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 215650635
(cherry picked from commit d6dd18efd0)
Change-Id: I6d345345f5a86d73def1634d9860c598542e42a1
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
2022-01-26 23:02:07 -08:00
Stanislav Fomichev
da8e680070 UPSTREAM: tools/resolve_btfids: Fix warnings
* make eprintf static, used only in main.c
* initialize ret in eprintf
* remove unused *tmp

v3:
* remove another err (Song Liu)

v2:
* remove unused 'int err = -1'

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210329223143.3659983-1-sdf@google.com
(cherry picked from commit e27bfefb21)
Bug: 203823368
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Ie1e72f161805e9b9634e45e6a348bbc386dfd076
2022-01-26 12:29:17 -08:00
Masahiro Yamada
f74899b38f UPSTREAM: kbuild: check CONFIG_AS_IS_LLVM instead of LLVM_IAS
LLVM_IAS is the user interface to set the -(no-)integrated-as flag,
and it should be used only for that purpose.

LLVM_IAS is checked in some places to determine the assembler type,
but it is not precise.

For example,

 $ make CC=gcc LLVM_IAS=1

... will use the GNU assembler (i.e. binutils) since LLVM_IAS=1 is
effective only when $(CC) is clang.

Of course, 'CC=gcc LLVM_IAS=1' is an odd combination, but the build
system can be more robust against such insane input.

Commit ba64beb174 ("kbuild: check the minimum assembler version in
Kconfig") introduced CONFIG_AS_IS_GNU/LLVM, which is more precise
because Kconfig checks the version string from the assembler in use.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
(cherry picked from commit 52cc02b910)
Bug: 203823368
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Iac51937a957ef2c227dd97c68da7f129e2fc4684
2022-01-26 12:28:56 -08:00
Saravana Kannan
9d7a259dba UPSTREAM: driver core: fw_devlink: Improve handling of cyclic dependencies
When we have a dependency of the form:

Device-A -> Device-C
	Device-B

Device-C -> Device-B

Where,
* Indentation denotes "child of" parent in previous line.
* X -> Y denotes X is consumer of Y based on firmware (Eg: DT).

We have cyclic dependency: device-A -> device-C -> device-B -> device-A

fw_devlink current treats device-C -> device-B dependency as an invalid
dependency and doesn't enforce it but leaves the rest of the
dependencies as is.

While the current behavior is necessary, it is not sufficient if the
false dependency in this example is actually device-A -> device-C. When
this is the case, device-C will correctly probe defer waiting for
device-B to be added, but device-A will be incorrectly probe deferred by
fw_devlink waiting on device-C to probe successfully. Due to this, none
of the devices in the cycle will end up probing.

To fix this, we need to go relax all the dependencies in the cycle like
we already do in the other instances where fw_devlink detects cycles.
A real world example of this was reported[1] and analyzed[2].

[1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
[2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/

Fixes: f9aa460672 ("driver core: Refactor fw_devlink feature")
Cc: stable <stable@vger.kernel.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2de9d8e0d2)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Ib3f301fa4638b79f73b82dcdae7f9058ff61989c
2022-01-26 11:39:39 -08:00
Li Li
63e1ba8854 UPSTREAM: binder: fix freeze race
Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.

But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.

1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.

This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.

Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.

As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.

NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.

To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.

If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.

A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.

the same time, confirmed the pending transactions succeeded.

Fixes: 432ff1e916 ("binder: BINDER_FREEZE ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Li <dualli@google.com>
Test: stress test with apps being frozen and initiating binder calls at
Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b564171ade)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I73c75967903797f252c64fa6b4b7207d0ea73791
2022-01-26 11:39:38 -08:00
Chunfeng Yun
7950a1a6fe UPSTREAM: usb: xhci-mtk: fix issue of out-of-bounds array access
Bus bandwidth array access is based on esit, increase one
will cause out-of-bounds issue; for example, when esit is
XHCI_MTK_MAX_ESIT, will overstep boundary.

Fixes: 7c986fbc16 ("usb: xhci-mtk: get the microframe boundary for ESIT")
Cc: <stable@vger.kernel.org>
Reported-by: Stan Lu <stan.lu@mediatek.com>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629189389-18779-5-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit de5107f473)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Iaededaf59378afa4c511a6d547fd23a35be8989e
2022-01-26 11:39:38 -08:00
Mathias Nyman
e3ea5ed407 UPSTREAM: xhci: Fix failure to give back some cached cancelled URBs.
Only TDs with status TD_CLEARING_CACHE will be given back after
cache is cleared with a set TR deq command.

xhci_invalidate_cached_td() failed to set the TD_CLEARING_CACHE status
for some cancelled TDs as it assumed an endpoint only needs to clear the
TD it stopped on.

This isn't always true. For example with streams enabled an endpoint may
have several stream rings, each stopping on a different TDs.

Note that if an endpoint has several stream rings, the current code
will still only clear the cache of the stream pointed to by the last
cancelled TD in the cancel list.

This patch only focus on making sure all canceled TDs are given back,
avoiding hung task after device removal.
Another fix to solve clearing the caches of all stream rings with
cancelled TDs is needed, but not as urgent.

This issue was simultanously discovered and debugged by
by Tao Wang, with a slightly different fix proposal.

Fixes: 674f8438c1 ("xhci: split handling halted endpoints into two steps")
Cc: <stable@vger.kernel.org> #5.12
Reported-by: Tao Wang <wat@codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20210820123503.2605901-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 94f339147f)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I6835eddfe9cd67505eaa920185bc10045a479b95
2022-01-26 11:39:38 -08:00
Tadeusz Struk
b496f320dc ANDROID: selftests: fix incfs_test
Fix incfs test build error:

incfs_test.c:4441:19: error: argument 2 is null but the corresponding
                      size argument 3 value is 1 [-Werror=nonnull]
 4441 |         TESTEQUAL(read(fd, NULL, 1), -1);
      |                   ^

Bug: 211066171

Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I028d02aef9938a9abe6c529756b89d7cb07507f2
2022-01-26 15:37:43 +00:00
Greg Kroah-Hartman
49fe66b3ce Revert "ANDROID: ABI: Update the generic symbol list"
This reverts commit 5c1e9f311f.

Reason for revert: Binary sysfs files are for hardware-passthrough only.

Bug: 207662659
Cc: Quentin Perret <qperret@google.com>
Change-Id: If479dd134f64682f010fe5c7df2fc87c58a8ba40
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2022-01-26 11:40:38 +00:00
Tadeusz Struk
0ed1e67a81 ANDROID: Incremental-fs: Doc: correct a sysfs path in incfs.rst
Correct a path to incremental-fs sysfs entry in incfs.rst

Bug: 211066171

Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: Id3a94888edd9022c517939b4667d9792fc04146a
2022-01-26 11:20:23 +00:00
Tadeusz Struk
93717b608d ANDROID: incremental-fs: fix mount_fs issue
Syzbot recently found a number of issues related to incremental-fs
(see bug numbers below). All have to do with the fact that incr-fs
allows mounts of the same source and target multiple times.
The correct behavior for a file system is to allow only one such
mount, and then every subsequent attempt should fail with a -EBUSY
error code. In case of the issues listed below the common pattern
is that the reproducer calls:

mount("./file0", "./file0", "incremental-fs", 0, NULL)

many times and then invokes a file operation like chmod, setxattr,
or open on the ./file0. This causes a recursive call for all the
mounted instances, which eventually causes a stack overflow and
a kernel crash:

BUG: stack guard page was hit at ffffc90000c0fff8
kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN

The reason why many mounts with the same source and target are
possible is because the incfs_mount_fs() as it is allocates a new
super_block for every call, regardless of whether a given mount already
exists or not. This happens every time the sget() function is called
with a test param equal to NULL.
The correct behavior for an FS mount implementation is to call
appropriate mount vfs call for it's type, i.e. mount_bdev() for
a block device backed FS, mount_single() for a pseudo file system,
like sysfs that is mounted in a single, well know location, or
mount_nodev() for other special purpose FS like overlayfs.
In case of incremental-fs the open coded mount logic doesn't check
for abusive mount attempts such as overlays.
To fix this issue the logic needs to be changed to pass a proper
test function to sget() call, which then checks if a super_block
for a mount instance has already been allocated and also allows
the VFS to properly verify invalid mount attempts.

Bug: 211066171
Bug: 213140206
Bug: 213215835
Bug: 211914587
Bug: 211213635
Bug: 213137376
Bug: 211161296

Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I66cfc3f1b5aaffb32b0845b2dad3ff26fe952e27
2022-01-26 09:05:16 +00:00
Long Ling
4094a44201 ANDROID: GKI: Update the ABI symbol list
Add drm_connector_set_panel_orientation

Bug: 214461751
Test: Build pass
Signed-off-by: Long Ling <longling@google.com>
Change-Id: Ic9f8dbe474053a13cb33d6924d2d06cf1d3f0674
2022-01-25 22:23:13 +00:00
Christian Brauner
10339b6da6 UPSTREAM: file: fix close_range() for unshare+cloexec
syzbot reported a bug when putting the last reference to a tasks file
descriptor table. Debugging this showed we didn't recalculate the
current maximum fd number for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC
after we unshared the file descriptors table. So max_fd could exceed the
current fdtable maximum causing us to set excessive bits. As a concrete
example, let's say the user requested everything from fd 4 to ~0UL to be
closed and their current fdtable size is 256 with their highest open fd
being 4. With CLOSE_RANGE_UNSHARE the caller will end up with a new
fdtable which has room for 64 file descriptors since that is the lowest
fdtable size we accept. But now max_fd will still point to 255 and needs
to be adjusted. Fix this by retrieving the correct maximum fd value in
__range_cloexec().

Reported-by: syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com
Fixes: 582f1fb6b7 ("fs, close_range: add flag CLOSE_RANGE_CLOEXEC")
Fixes: fec8a6a691 ("close_range: unshare all fds for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
(cherry picked from commit 9b5b872215)
Bug: 216276716
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Id04f75bfeb49dcff0457136f74b47a2d05bd9d51
2022-01-25 21:20:37 +00:00
Christian Brauner
841ee1fff7 UPSTREAM: close_range: unshare all fds for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC
After introducing CLOSE_RANGE_CLOEXEC syzbot reported a crash when
CLOSE_RANGE_CLOEXEC is specified in conjunction with CLOSE_RANGE_UNSHARE.
When CLOSE_RANGE_UNSHARE is specified the caller will receive a private
file descriptor table in case their file descriptor table is currently
shared.

For the case where the caller has requested all file descriptors to be
actually closed via e.g. close_range(3, ~0U, 0) the kernel knows that
the caller does not need any of the file descriptors anymore and will
optimize the close operation by only copying all files in the range from
0 to 3 and no others.

However, if the caller requested CLOSE_RANGE_CLOEXEC together with
CLOSE_RANGE_UNSHARE the caller wants to still make use of the file
descriptors so the kernel needs to copy all of them and can't optimize.

The original patch didn't account for this and thus could cause oopses
as evidenced by the syzbot report because it assumed that all fds had
been copied. Fix this by handling the CLOSE_RANGE_CLOEXEC case.

syzbot reported
==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
BUG: KASAN: null-ptr-deref in atomic64_read include/asm-generic/atomic-instrumented.h:837 [inline]
BUG: KASAN: null-ptr-deref in atomic_long_read include/asm-generic/atomic-long.h:29 [inline]
BUG: KASAN: null-ptr-deref in filp_close+0x22/0x170 fs/open.c:1274
Read of size 8 at addr 0000000000000077 by task syz-executor511/8522

CPU: 1 PID: 8522 Comm: syz-executor511 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 __kasan_report mm/kasan/report.c:549 [inline]
 kasan_report.cold+0x5/0x37 mm/kasan/report.c:562
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x13d/0x180 mm/kasan/generic.c:192
 instrument_atomic_read include/linux/instrumented.h:71 [inline]
 atomic64_read include/asm-generic/atomic-instrumented.h:837 [inline]
 atomic_long_read include/asm-generic/atomic-long.h:29 [inline]
 filp_close+0x22/0x170 fs/open.c:1274
 close_files fs/file.c:402 [inline]
 put_files_struct fs/file.c:417 [inline]
 put_files_struct+0x1cc/0x350 fs/file.c:414
 exit_files+0x12a/0x170 fs/file.c:435
 do_exit+0xb4f/0x2a00 kernel/exit.c:818
 do_group_exit+0x125/0x310 kernel/exit.c:920
 get_signal+0x428/0x2100 kernel/signal.c:2792
 arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
 handle_signal_work kernel/entry/common.c:147 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x447039
Code: Unable to access opcode bytes at RIP 0x44700f.
RSP: 002b:00007f1b1225cdb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00000000006dbc28 RCX: 0000000000447039
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000006dbc2c
RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c
R13: 00007fff223b6bef R14: 00007f1b1225d9c0 R15: 00000000006dbc2c
==================================================================

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+96cfd2b22b3213646a93@syzkaller.appspotmail.com

Tested on:

commit:         10f7cddd selftests/core: add regression test for CLOSE_RAN..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git vfs
kernel config:  https://syzkaller.appspot.com/x/.config?x=5d42216b510180e3
dashboard link: https://syzkaller.appspot.com/bug?extid=96cfd2b22b3213646a93
compiler:       gcc (GCC) 10.1.0-syz 20200507

Reported-by: syzbot+96cfd2b22b3213646a93@syzkaller.appspotmail.com
Fixes: 582f1fb6b7 ("fs, close_range: add flag CLOSE_RANGE_CLOEXEC")
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20201217213303.722643-1-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
(cherry picked from commit fec8a6a691)
Bug: 216276716
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I0c6bbd8a46b93293c212fff08728514915d5c690
2022-01-25 21:20:27 +00:00
Giuseppe Scrivano
737c09e195 UPSTREAM: fs, close_range: add flag CLOSE_RANGE_CLOEXEC
When the flag CLOSE_RANGE_CLOEXEC is set, close_range doesn't
immediately close the files but it sets the close-on-exec bit.

It is useful for e.g. container runtimes that usually install a
seccomp profile "as late as possible" before execv'ing the container
process itself.  The container runtime could either do:
  1                                  2
- install_seccomp_profile();       - close_range(MIN_FD, MAX_INT, 0);
- close_range(MIN_FD, MAX_INT, 0); - install_seccomp_profile();
- execve(...);                     - execve(...);

Both alternative have some disadvantages.

In the first variant the seccomp_profile cannot block the close_range
syscall, as well as opendir/read/close/... for the fallback on older
kernels.
In the second variant, close_range() can be used only on the fds
that are not going to be needed by the runtime anymore, and it must be
potentially called multiple times to account for the different ranges
that must be closed.

Using close_range(..., ..., CLOSE_RANGE_CLOEXEC) solves these issues.
The runtime is able to use the existing open fds, the seccomp profile
can block close_range() and the syscalls used for its fallback.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Link: https://lore.kernel.org/r/20201118104746.873084-2-gscrivan@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
(cherry picked from commit 582f1fb6b7)
Bug: 216276716
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Ib2d44f9760a80e3febdb25925d17ce5ffd14910e
2022-01-25 21:20:17 +00:00
Nick Desaulniers
a185c0f116 ANDROID: remove more stale variables from build.config files
CROSS_COMPILE is no longer necessary when building with LLVM=1 as of
upstream v5.15-rc1's
commit 231ad7f409 ("Makefile: infer --target from ARCH for CC=clang")
which was backported to android13-5.10.

LLVM_IAS=1 is implied by LLVM=1 as of LLVM_IAS as of
upstream v5.15-rc1's
commit f12b034afe ("scripts/Makefile.clang: default to LLVM_IAS=1")
which was backported to android13-5.10.

Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Id426f680bba8d478ed300dd0f6551dc419ea345f
2022-01-25 21:16:33 +00:00
Tadeusz Struk
33078fb6fb ANDROID: incremental-fs: fix GPF in pending_reads_dispatch_ioctl
It is possible that fget returns NULL. This needs to be handled
correctly in ioctl_permit_fill.

Bug: 212821226

Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: Iec8be21982afeab6794b78ab1a542671c52acea2
2022-01-25 15:39:24 +00:00
Nick Desaulniers
e48efff5cb ANDROID: arch/Kconfig: fix up LTO LLVM_IAS depdency
Due to LTO being backported to android's 5.10 branches first, the
backport of ba64beb174 dropped this. Now
that we have backports for checking AS_IS_LLVM, use that so that we
don't accidentally prevent LTO from being selectable.

Bug: 1938905
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I93fe13dba285bbb9fa1ae0e111bd2955c0b87c12
2022-01-25 01:20:50 -08:00
Nick Desaulniers
ffff118fe1 BACKPORT: scripts/Makefile.clang: default to LLVM_IAS=1
LLVM_IAS=1 controls enabling clang's integrated assembler via
-integrated-as. This was an explicit opt in until we could enable
assembler support in Clang for more architecures. Now we have support
and CI coverage of LLVM_IAS=1 for all architecures except a few more
bugs affecting s390 and powerpc.

This commit flips the default from opt in via LLVM_IAS=1 to opt out via
LLVM_IAS=0.  CI systems or developers that were previously doing builds
with CC=clang or LLVM=1 without explicitly setting LLVM_IAS must now
explicitly opt out via LLVM_IAS=0, otherwise they will be implicitly
opted-in.

This finally shortens the command line invocation when cross compiling
with LLVM to simply:

$ make ARCH=arm64 LLVM=1

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
[nd: conflict in Documentation/kbuild/llvm.rst]
(cherry picked from commit f12b034afe)
Change-Id: I026d42aa40c83eb149c8e39fe91a0b0164ff71b0
2022-01-25 01:20:50 -08:00
Nick Desaulniers
377062e983 UPSTREAM: Makefile: infer --target from ARCH for CC=clang
We get constant feedback that the command line invocation of make is too
long when compiling with LLVM. CROSS_COMPILE is helpful when a toolchain
has a prefix of the target triple, or is an absolute path outside of
$PATH.

Since a Clang binary is generally multi-targeted, we can infer a given
target from SRCARCH/ARCH.  If CROSS_COMPILE is not set, simply set
--target= for CLANG_FLAGS, KBUILD_CFLAGS, and KBUILD_AFLAGS based on
$SRCARCH.

Previously, we'd cross compile via:
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make LLVM=1 LLVM_IAS=1
Now:
$ ARCH=arm64 make LLVM=1 LLVM_IAS=1

For native builds (not involving cross compilation) we now explicitly
specify a target triple rather than rely on the implicit host triple.

Link: https://github.com/ClangBuiltLinux/linux/issues/1399
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Arnd Bergmann <arnd@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
(cherry picked from commit 231ad7f409)
Change-Id: I6f61a4937e6af22deda66eeaac6c667c889dc683
2022-01-25 01:20:50 -08:00
Nick Desaulniers
79bd0fef92 BACKPORT: Makefile: move initial clang flag handling into scripts/Makefile.clang
With some of the changes we'd like to make to CROSS_COMPILE, the initial
block of clang flag handling which controls things like the target triple,
whether or not to use the integrated assembler and how to find GAS,
and erroring on unknown warnings is becoming unwieldy. Move it into its
own file under scripts/.

Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
[nd: conflict in MAINTAINERS]
(cherry picked from commit 6f5b41a2f5)
Change-Id: I350ba8aaf924f7b373d25ef67260e6373a405d12
2022-01-25 01:20:50 -08:00
Masahiro Yamada
46d37c6b6d UPSTREAM: kbuild: warn if a different compiler is used for external module builds
It is always safe to use the same compiler for the kernel and external
modules, but in reality, some distributions such as Fedora release a
different version of GCC from the one used for building the kernel.

There was a long discussion about mixing different compilers [1].

I do not repeat it here, but at least, showing a heads up in that
case is better than nothing.

Linus suggested [2]:
  And a warning might be more palatable even if different compiler
  version work fine together. Just a heads up on "it looks like you
  might be mixing compiler versions" is a valid note, and isn't
  necessarily wrong. Even when they work well together, maybe you want
  to have people at least _aware_ of it.

This commit shows a warning unless the compiler is exactly the same.

  warning: the compiler differs from the one used to build the kernel
    The kernel was built by: gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3)
    You are using:           gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)

Check the difference, and if it is OK with you, please proceed at your
risk.

To avoid the locale issue as in commit bcbcf50f52 ("kbuild: fix
ld-version.sh to not be affected by locale"), pass LC_ALL=C to
"$(CC) --version".

[1] https://lore.kernel.org/linux-hardening/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com/
[2] https://lore.kernel.org/lkml/CAHk-=wgjwhDy-y4mQh34L+2aF=n6BjzHdqAW2=8wri5x7O04pA@mail.gmail.com/

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 6072b2c49d)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Ibbeeca6c7139f1d03621a8ba2137bafa32c14474
2022-01-25 01:20:50 -08:00
Masahiro Yamada
7cb08715b6 UPSTREAM: kbuild: split cc-option and friends to scripts/Makefile.compiler
scripts/Kbuild.include is included everywhere, but macros such as
cc-option are needed by build targets only.

For example, when 'make clean' traverses the tree, it does not need
to evaluate $(call cc-option,).

Split cc-option, ld-option, etc. to scripts/Makefile.compiler, which
is only included from the top Makefile and scripts/Makefile.build.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
(cherry picked from commit 57fd251c78)
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I7cef781922a192a9bcd21152a74ff2cec0527ab0
2022-01-25 01:20:50 -08:00
Masahiro Yamada
5f94c8bd74 BACKPORT: kbuild: prefix $(srctree)/ to some included Makefiles
VPATH is used in Kbuild to make pattern rules search for prerequisites
in both $(objtree) and $(srctree). Some of *.c, *.S files are not real
sources, but generated by tools such as flex, bison, perl.

In contrast, I doubt the benefit of --include-dir=$(abs_srctree) because
it is always clear which Makefiles are real sources, and which are not.

So, my hope is to add $(srctree)/ prefix to all check-in Makefiles,
then remove --include-dir=$(abs_srctree) flag in the future.

I am touching only some Kbuild core parts for now. Treewide fixes will
be needed to achieve this goal.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 3204a7fb98)
[nd: conflict in scripts/Makefile.modpost due to d7a956441f, which is
itself a backport of 850ded46c6]
Bug: 189327973
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I691d266b68d769ec0fabcc98ed515cfb1cad5ccf
2022-01-25 01:20:50 -08:00
Masahiro Yamada
0e5e792a0e UPSTREAM: kbuild: replace sed with $(subst ) or $(patsubst )
For simple text replacement, it is better to use a built-in function
instead of sed if possible. You can save one process forking.

I do not mean to replace all sed invocations because GNU Make itself
does not support regular expression (unless you use guile).

I just replaced simple ones.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 6e0839fda3)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I92c893dfb17055db0b8a8ca2f39ddfa5f31a118d
2022-01-25 01:20:49 -08:00
Nathan Chancellor
26a2046334 UPSTREAM: Makefile: Only specify '--prefix=' when building with clang + GNU as
When building with LLVM_IAS=1, there is no point to specifying
'--prefix=' because that flag is only used to find GNU cross tools,
which will not be used indirectly when using the integrated assembler.
All of the tools are invoked directly from PATH or a full path specified
via the command line, which does not depend on the value of '--prefix='.

Sharing commands to reproduce issues becomes a little bit easier without
a '--prefix=' value because that '--prefix=' value is specific to a
user's machine due to it being an absolute path.

Some further notes from Fangrui Song:

  clang can spawn GNU as (if -f?no-integrated-as is specified) and GNU
  objcopy (-f?no-integrated-as and -gsplit-dwarf and -g[123]).
  objcopy is only used for GNU as assembled object files.
  With integrated assembler, the object file streamer creates .o and
  .dwo simultaneously.
  With GNU as, two objcopy commands are needed to extract .debug*.dwo to
  .dwo files && another command to remove .debug*.dwo sections.

A small consequence of this change (to keep things simple) is that
'--prefix=' will always be specified now, even with a native build, when
it was not before. This should not be an issue due to the way that the
Makefile searches for the prefix (based on elfedit's location). This
ends up improving the experience for host builds because PATH is better
respected and matches GCC's behavior more closely. See the below thread
for more details:

https://lore.kernel.org/r/20210205213651.GA16907@Ryzen-5-4500U.localdomain/

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit eec08090bc)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I90f232a1551afb9118176a61ac5de38384a171fd
2022-01-25 01:20:49 -08:00
Nathan Chancellor
f0a6553e1d UPSTREAM: Makefile: Remove '--gcc-toolchain' flag
This flag was originally added to allow clang to find the GNU cross
tools in commit 785f11aa59 ("kbuild: Add better clang cross build
support"). This flag was not enough to find the tools at times so
'--prefix' was added to the list in commit ef8c4ed9db ("kbuild: allow
to use GCC toolchain not in Clang search path") and improved upon in
commit ca9b31f6bb ("Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang
cross compilation"). Now that '--prefix' specifies a full path and
prefix, '--gcc-toolchain' serves no purpose because the kernel builds
with '-nostdinc' and '-nostdlib'.

This has been verified with self compiled LLVM 10.0.1 and LLVM 13.0.0 as
well as a distribution version of LLVM 11.1.0 without binutils in the
LLVM toolchain locations.

Link: https://reviews.llvm.org/D97902
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit c91d4e47e1)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Ie1f64ed8562964ceba21e402a3d97850f6471508
2022-01-25 01:20:49 -08:00
Masahiro Yamada
f671c01fa7 UPSTREAM: kbuild: remove ld-version macro
There is no direct user of ld-version; you can use CONFIG_LD_VERSION
if needed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Bug: 209655537
(cherry picked from commit 05f6bbf2d7)
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: If7f82474beea2822340ee8503de5087336766223
2022-01-25 01:20:49 -08:00
Masahiro Yamada
8fe8aa47e4 UPSTREAM: kbuild: remove LLVM=1 test from HAS_LTO_CLANG
As Documentation/kbuild/llvm.rst notes, LLVM=1 switches the default of
tools, but you can still override CC, LD, etc. individually. This LLVM=1
check is unneeded because each tool is already checked separately.

"make CC=clang LD=ld.lld NM=llvm-nm AR=llvm-ar LLVM_IAS=1 menuconfig"
should be able to enable Clang LTO.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Bug: 209655537
(cherry picked from commit 4c273d23c4)
Change-Id: Ieb1f4bbf98521b929c42253260a9a5aec25e73e2
2022-01-25 01:20:49 -08:00
Nathan Chancellor
a0723f1fd0 UPSTREAM: Makefile: Remove # characters from compiler string
When using AMD's Optimizing C/C++ Compiler (AOCC), the build fails due
to a # character in the version string, which is interpreted as a
comment:

$ make CC=clang defconfig init/main.o
include/config/auto.conf.cmd:1374: *** invalid syntax in conditional. Stop.

$ sed -n 1374p include/config/auto.conf.cmd
ifneq "$(CC_VERSION_TEXT)" "AMD clang version 11.0.0 (CLANG: AOCC_2.3.0-Build#85 2020_11_10) (based on LLVM Mirror.Version.11.0.0)"

Remove all # characters in the version string so that the build does not
fail unexpectedly.

Link: https://github.com/ClangBuiltLinux/linux/issues/1298
Reported-by: Michael Fuckner <michael@fuckner.net>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit c75173a269)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Ie2ba7041325f13865b59b43517111674519ce386
2022-01-25 01:20:49 -08:00
Nick Desaulniers
2ca7f6ca9a UPSTREAM: Makefile: reuse CC_VERSION_TEXT
I noticed we're invoking $(CC) via $(shell) more than once to check the
version.  Let's reuse the first string captured in $CC_VERSION_TEXT.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
[masahiro.yamada:
CC_VERSION_TEXT is assigned by = instead of :=, so this $(shell ) is
evaluated multiple times anyway. The number of $(CC) invocations will
be still the same. Replacing 'grep' with the built-in $(findstring )
will give real performance benefit.]
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit db07562aea)
Bug: 209655537
Change-Id: Ib8745c014472a9d63e51e98f9474cced778d57df
2022-01-25 01:20:49 -08:00
David Brazdil
870681ecd3 Revert "ANDROID: GKI: update virtual device symbol list"
Reason for revert: Fails init on devices with no DICE DT nodes
Reverted Changes:
I035ad0998:ANDROID: GKI: update virtual device symbol list
I0b2ec380d:ANDROID: arm64: Enable CONFIG_OPEN_DICE=m

Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: Ieca199d3fa4ad5f574558f55d49a61be2f0a36ff
2022-01-24 22:34:42 +00:00
Suzuki K Poulose
a5bf62aaed BACKPORT: arm64: errata: Add workaround for TSB flush failures
Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers
from errata, where a TSB (trace synchronization barrier)
fails to flush the trace data completely, when executed from
a trace prohibited region. In Linux we always execute it
after we have moved the PE to trace prohibited region. So,
we can apply the workaround every time a TSB is executed.

The work around is to issue two TSB consecutively.

NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying
that a late CPU could be blocked from booting if it is the
first CPU that requires the workaround. This is because we
do not allow setting a cpu_hwcaps after the SMP boot. The
other alternative is to use "this_cpu_has_cap()" instead
of the faster system wide check, which may be a bit of an
overhead, given we may have to do this in nvhe KVM host
before a guest entry.

Bug: 213931796
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit fa82d0b4b8)
[Fix conflict due to another workaround that is not backported
(TRBE_OVERWRITE). Also manually update cpucaps.h which is autogenerated
in upstream from arch/arm64/tools/cpucaps which we ignored as part of
the conflict resolution]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I37403e5a51de2b7015ee6e3e94ee690e66010890
2022-01-24 17:42:27 +00:00
Suzuki K Poulose
d92823b63b UPSTREAM: arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
Add the CPU Partnumbers for the new Arm designs.

Bug: 213931796
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-2-suzuki.poulose@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 2d0d656700)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I3bdd17992299ea5a5d2e31d9e8cf84a4bd554cdc
2022-01-24 17:42:14 +00:00
Suzuki K Poulose
af2c988091 UPSTREAM: coresight: trbe: Defer the probe on offline CPUs
If a CPU is offline during the driver init, we could end up causing
a kernel crash trying to register the coresight device for the TRBE
instance. The trbe_cpudata for the TRBE instance is initialized only
when it is probed. Otherwise, we could end up dereferencing a NULL
cpudata->drvdata.

e.g:

[    0.149999] coresight ete0: CPU0: ete v1.1 initialized
[    0.149999] coresight-etm4x ete_1: ETM arch init failed
[    0.149999] coresight-etm4x: probe of ete_1 failed with error -22
[    0.150085] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
[    0.150085] Mem abort info:
[    0.150085]   ESR = 0x96000005
[    0.150085]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.150085]   SET = 0, FnV = 0
[    0.150085]   EA = 0, S1PTW = 0
[    0.150085] Data abort info:
[    0.150085]   ISV = 0, ISS = 0x00000005
[    0.150085]   CM = 0, WnR = 0
[    0.150085] [0000000000000050] user address but active_mm is swapper
[    0.150085] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    0.150085] Modules linked in:
[    0.150085] Hardware name: FVP Base RevC (DT)
[    0.150085] pstate: 00800009 (nzcv daif -PAN +UAO -TCO BTYPE=--)
[    0.150155] pc : arm_trbe_register_coresight_cpu+0x74/0x144
[    0.150155] lr : arm_trbe_register_coresight_cpu+0x48/0x144
  ...

[    0.150237] Call trace:
[    0.150237]  arm_trbe_register_coresight_cpu+0x74/0x144
[    0.150237]  arm_trbe_device_probe+0x1c0/0x2d8
[    0.150259]  platform_drv_probe+0x94/0xbc
[    0.150259]  really_probe+0x1bc/0x4a8
[    0.150266]  driver_probe_device+0x7c/0xb8
[    0.150266]  device_driver_attach+0x6c/0xac
[    0.150266]  __driver_attach+0xc4/0x148
[    0.150266]  bus_for_each_dev+0x7c/0xc8
[    0.150266]  driver_attach+0x24/0x30
[    0.150266]  bus_add_driver+0x100/0x1e0
[    0.150266]  driver_register+0x78/0x110
[    0.150266]  __platform_driver_register+0x44/0x50
[    0.150266]  arm_trbe_init+0x28/0x84
[    0.150266]  do_one_initcall+0x94/0x2bc
[    0.150266]  do_initcall_level+0xa4/0x158
[    0.150266]  do_initcalls+0x54/0x94
[    0.150319]  do_basic_setup+0x24/0x30
[    0.150319]  kernel_init_freeable+0xe8/0x14c
[    0.150319]  kernel_init+0x14/0x18c
[    0.150319]  ret_from_fork+0x10/0x30
[    0.150319] Code: f94012c8 b0004ce2 9134a442 52819801 (f9402917)
[    0.150319] ---[ end trace d23e0cfe5098535e ]---
[    0.150346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Fix this by skipping the step, if we are unable to probe the CPU.

Bug: 213931796
Fixes: 3fbf7f011f ("coresight: sink: Add TRBE driver")
Reported-by: Bransilav Rankov <branislav.rankov@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: stable <stable@vger.kernel.org>
Tested-by: Branislav Rankov <branislav.rankov@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20211014142238.2221248-1-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit a08025b3fe)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Iea490213245f9f486c775398dd2972311d204a7e
2022-01-24 17:42:01 +00:00
Suzuki K Poulose
262bfd89ed UPSTREAM: coresight: etm4x: Use Trace Filtering controls dynamically
The Trace Filtering support (FEAT_TRF) ensures that the ETM
can be prohibited from generating any trace for a given EL.
This is much stricter knob, than the TRCVICTLR exception level
masks, which doesn't prevent the ETM from generating Context
packets for an "excluded" EL. At the moment, we do a onetime
enable trace at user and kernel and leave it untouched for the
kernel life time. This implies that the ETM could potentially
generate trace packets containing the kernel addresses, and
thus leaking the kernel virtual address in the trace.

This patch makes the switch dynamic, by honoring the filters
set by the user and enforcing them in the TRFCR controls.
We also rename the cpu_enable_tracing() appropriately to
cpu_detect_trace_filtering() and the drvdata member
trfc => trfcr to indicate the "value" of the TRFCR_EL1.

Bug: 213931796
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-3-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit 5f6fd1aa8c)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I20432aa7cfd4375e1cdb9d256a2adc9df548d68e
2022-01-24 17:41:51 +00:00
Suzuki K Poulose
5e4d418233 BACKPORT: coresight: etm4x: Save restore TRFCR_EL1
When the CPU enters a low power mode, the TRFCR_EL1 contents could be
reset. Thus we need to save/restore the TRFCR_EL1 along with the ETM4x
registers to allow the tracing.

The TRFCR related helpers are in a new header file, as we need to use
them for TRBE in the later patches.

Bug: 213931796
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-2-suzuki.poulose@arm.com
[Fixed cosmetic details]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit 937d3f58ca)
[Conflict in in drivers/hwtracing/coresight/coresight-etm4x-core.c,
unnecessary new includes were removed]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: If0d2b8625ffab0bfcd4afbe6fc8c6e7362c01c4a
2022-01-24 17:41:36 +00:00
Leo Yan
bd68b82ac7 UPSTREAM: coresight: tmc-etr: Speed up for bounce buffer in flat mode
The AUX bounce buffer is allocated with API dma_alloc_coherent(), in the
low level's architecture code, e.g. for Arm64, it maps the memory with
the attribution "Normal non-cacheable"; this can be concluded from the
definition for pgprot_dmacoherent() in arch/arm64/include/asm/pgtable.h.

Later when access the AUX bounce buffer, since the memory mapping is
non-cacheable, it's low efficiency due to every load instruction must
reach out DRAM.

This patch changes to allocate pages with dma_alloc_noncoherent(), the
driver can access the memory via cacheable mapping; therefore, load
instructions can fetch data from cache lines rather than always read
data from DRAM, the driver can boost memory performance.  After using
the cacheable mapping, the driver uses dma_sync_single_for_cpu() to
invalidate cacheline prior to read bounce buffer so can avoid read stale
trace data.

By measurement the duration for function tmc_update_etr_buffer() with
ftrace function_graph tracer, it shows the performance significant
improvement for copying 4MiB data from bounce buffer:

  # echo tmc_etr_get_data_flat_buf > set_graph_notrace // avoid noise
  # echo tmc_update_etr_buffer > set_graph_function
  # echo function_graph > current_tracer

  before:

  # CPU  DURATION                  FUNCTION CALLS
  # |     |   |                     |   |   |   |
  2)               |    tmc_update_etr_buffer() {
  ...
  2) # 8148.320 us |    }

  after:

  # CPU  DURATION                  FUNCTION CALLS
  # |     |   |                     |   |   |   |
  2)               |  tmc_update_etr_buffer() {
  ...
  2) # 2525.420 us |  }

Bug: 213931796
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210905032144.966766-1-leo.yan@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit 0abd076217)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I271f3882e1bdafb8053aecdb0948584a287b499f
2022-01-24 17:41:23 +00:00
Leo Yan
7704af7834 UPSTREAM: coresight: tmc-etr: Add barrier after updating AUX ring buffer
Since a memory barrier is required between AUX trace data store and
aux_head store, and the AUX trace data is filled with memcpy(), it's
sufficient to use smp_wmb() so can ensure the trace data is visible
prior to updating aux_head.

Bug: 213931796
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210809111407.596077-3-leo.yan@linaro.org
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit 26701ceb4c)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I1bb64436a794d03e993fc85fca0f44224823d9d3
2022-01-24 17:41:09 +00:00
Marc Zyngier
0d871c7063 FROMGIT: KVM: arm64: Use shadow SPSR_EL1 when injecting exceptions on !VHE
Injecting an exception into a guest with non-VHE is risky business.
Instead of writing in the shadow register for the switch code to
restore it, we override the CPU register instead. Which gets
overriden a few instructions later by said restore code.

The result is that although the guest correctly gets the exception,
it will return to the original context in some random state,
depending on what was there the first place... Boo.

Fix the issue by writing to the shadow register. The original code
is absolutely fine on VHE, as the state is already loaded, and writing
to the shadow register in that case would actually be a bug.

Fixes: bb666c472c ("KVM: arm64: Inject AArch64 exceptions from HYP")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20220121184207.423426-1-maz@kernel.org
(cherry picked from commit 278583055a
 git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fixes)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ifc1504c4be69e36a0a70033b36ee148c337ac7b5
2022-01-24 13:40:31 +00:00
Marc Zyngier
2f7bdca6d9 FROMGIT: KVM: arm64: vgic-v3: Restrict SEIS workaround to known broken systems
Contrary to what df652bcf11 ("KVM: arm64: vgic-v3: Work around GICv3
locally generated SErrors") was asserting, there is at least one other
system out there (Cavium ThunderX2) implementing SEIS, and not in
an obviously broken way.

So instead of imposing the M1 workaround on an innocent bystander,
let's limit it to the two known broken Apple implementations.

Fixes: df652bcf11 ("KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors")
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220122103912.795026-1-maz@kernel.org
(cherry picked from commit d11a327ed9
 git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fixes)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I4c1b8faab30744f81b012e03b31cdd8254614f87
2022-01-24 13:40:31 +00:00