When we have a dependency of the form:
Device-A -> Device-C
Device-B
Device-C -> Device-B
Where,
* Indentation denotes "child of" parent in previous line.
* X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
We have cyclic dependency: device-A -> device-C -> device-B -> device-A
fw_devlink current treats device-C -> device-B dependency as an invalid
dependency and doesn't enforce it but leaves the rest of the
dependencies as is.
While the current behavior is necessary, it is not sufficient if the
false dependency in this example is actually device-A -> device-C. When
this is the case, device-C will correctly probe defer waiting for
device-B to be added, but device-A will be incorrectly probe deferred by
fw_devlink waiting on device-C to probe successfully. Due to this, none
of the devices in the cycle will end up probing.
To fix this, we need to go relax all the dependencies in the cycle like
we already do in the other instances where fw_devlink detects cycles.
A real world example of this was reported[1] and analyzed[2].
[1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
[2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
Fixes: f9aa460672 ("driver core: Refactor fw_devlink feature")
Cc: stable <stable@vger.kernel.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2de9d8e0d2)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Ib3f301fa4638b79f73b82dcdae7f9058ff61989c
Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.
But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.
1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.
This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.
Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.
As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.
NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.
To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.
If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.
A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.
the same time, confirmed the pending transactions succeeded.
Fixes: 432ff1e916 ("binder: BINDER_FREEZE ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Li <dualli@google.com>
Test: stress test with apps being frozen and initiating binder calls at
Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b564171ade)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I73c75967903797f252c64fa6b4b7207d0ea73791
Only TDs with status TD_CLEARING_CACHE will be given back after
cache is cleared with a set TR deq command.
xhci_invalidate_cached_td() failed to set the TD_CLEARING_CACHE status
for some cancelled TDs as it assumed an endpoint only needs to clear the
TD it stopped on.
This isn't always true. For example with streams enabled an endpoint may
have several stream rings, each stopping on a different TDs.
Note that if an endpoint has several stream rings, the current code
will still only clear the cache of the stream pointed to by the last
cancelled TD in the cancel list.
This patch only focus on making sure all canceled TDs are given back,
avoiding hung task after device removal.
Another fix to solve clearing the caches of all stream rings with
cancelled TDs is needed, but not as urgent.
This issue was simultanously discovered and debugged by
by Tao Wang, with a slightly different fix proposal.
Fixes: 674f8438c1 ("xhci: split handling halted endpoints into two steps")
Cc: <stable@vger.kernel.org> #5.12
Reported-by: Tao Wang <wat@codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20210820123503.2605901-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 94f339147f)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I6835eddfe9cd67505eaa920185bc10045a479b95
Correct a path to incremental-fs sysfs entry in incfs.rst
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: Id3a94888edd9022c517939b4667d9792fc04146a
Syzbot recently found a number of issues related to incremental-fs
(see bug numbers below). All have to do with the fact that incr-fs
allows mounts of the same source and target multiple times.
The correct behavior for a file system is to allow only one such
mount, and then every subsequent attempt should fail with a -EBUSY
error code. In case of the issues listed below the common pattern
is that the reproducer calls:
mount("./file0", "./file0", "incremental-fs", 0, NULL)
many times and then invokes a file operation like chmod, setxattr,
or open on the ./file0. This causes a recursive call for all the
mounted instances, which eventually causes a stack overflow and
a kernel crash:
BUG: stack guard page was hit at ffffc90000c0fff8
kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN
The reason why many mounts with the same source and target are
possible is because the incfs_mount_fs() as it is allocates a new
super_block for every call, regardless of whether a given mount already
exists or not. This happens every time the sget() function is called
with a test param equal to NULL.
The correct behavior for an FS mount implementation is to call
appropriate mount vfs call for it's type, i.e. mount_bdev() for
a block device backed FS, mount_single() for a pseudo file system,
like sysfs that is mounted in a single, well know location, or
mount_nodev() for other special purpose FS like overlayfs.
In case of incremental-fs the open coded mount logic doesn't check
for abusive mount attempts such as overlays.
To fix this issue the logic needs to be changed to pass a proper
test function to sget() call, which then checks if a super_block
for a mount instance has already been allocated and also allows
the VFS to properly verify invalid mount attempts.
Bug: 211066171
Bug: 213140206
Bug: 213215835
Bug: 211914587
Bug: 211213635
Bug: 213137376
Bug: 211161296
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I66cfc3f1b5aaffb32b0845b2dad3ff26fe952e27
syzbot reported a bug when putting the last reference to a tasks file
descriptor table. Debugging this showed we didn't recalculate the
current maximum fd number for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC
after we unshared the file descriptors table. So max_fd could exceed the
current fdtable maximum causing us to set excessive bits. As a concrete
example, let's say the user requested everything from fd 4 to ~0UL to be
closed and their current fdtable size is 256 with their highest open fd
being 4. With CLOSE_RANGE_UNSHARE the caller will end up with a new
fdtable which has room for 64 file descriptors since that is the lowest
fdtable size we accept. But now max_fd will still point to 255 and needs
to be adjusted. Fix this by retrieving the correct maximum fd value in
__range_cloexec().
Reported-by: syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com
Fixes: 582f1fb6b7 ("fs, close_range: add flag CLOSE_RANGE_CLOEXEC")
Fixes: fec8a6a691 ("close_range: unshare all fds for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
(cherry picked from commit 9b5b872215)
Bug: 216276716
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Id04f75bfeb49dcff0457136f74b47a2d05bd9d51
After introducing CLOSE_RANGE_CLOEXEC syzbot reported a crash when
CLOSE_RANGE_CLOEXEC is specified in conjunction with CLOSE_RANGE_UNSHARE.
When CLOSE_RANGE_UNSHARE is specified the caller will receive a private
file descriptor table in case their file descriptor table is currently
shared.
For the case where the caller has requested all file descriptors to be
actually closed via e.g. close_range(3, ~0U, 0) the kernel knows that
the caller does not need any of the file descriptors anymore and will
optimize the close operation by only copying all files in the range from
0 to 3 and no others.
However, if the caller requested CLOSE_RANGE_CLOEXEC together with
CLOSE_RANGE_UNSHARE the caller wants to still make use of the file
descriptors so the kernel needs to copy all of them and can't optimize.
The original patch didn't account for this and thus could cause oopses
as evidenced by the syzbot report because it assumed that all fds had
been copied. Fix this by handling the CLOSE_RANGE_CLOEXEC case.
syzbot reported
==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
BUG: KASAN: null-ptr-deref in atomic64_read include/asm-generic/atomic-instrumented.h:837 [inline]
BUG: KASAN: null-ptr-deref in atomic_long_read include/asm-generic/atomic-long.h:29 [inline]
BUG: KASAN: null-ptr-deref in filp_close+0x22/0x170 fs/open.c:1274
Read of size 8 at addr 0000000000000077 by task syz-executor511/8522
CPU: 1 PID: 8522 Comm: syz-executor511 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
__kasan_report mm/kasan/report.c:549 [inline]
kasan_report.cold+0x5/0x37 mm/kasan/report.c:562
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x13d/0x180 mm/kasan/generic.c:192
instrument_atomic_read include/linux/instrumented.h:71 [inline]
atomic64_read include/asm-generic/atomic-instrumented.h:837 [inline]
atomic_long_read include/asm-generic/atomic-long.h:29 [inline]
filp_close+0x22/0x170 fs/open.c:1274
close_files fs/file.c:402 [inline]
put_files_struct fs/file.c:417 [inline]
put_files_struct+0x1cc/0x350 fs/file.c:414
exit_files+0x12a/0x170 fs/file.c:435
do_exit+0xb4f/0x2a00 kernel/exit.c:818
do_group_exit+0x125/0x310 kernel/exit.c:920
get_signal+0x428/0x2100 kernel/signal.c:2792
arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:201
__syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x447039
Code: Unable to access opcode bytes at RIP 0x44700f.
RSP: 002b:00007f1b1225cdb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00000000006dbc28 RCX: 0000000000447039
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000006dbc2c
RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c
R13: 00007fff223b6bef R14: 00007f1b1225d9c0 R15: 00000000006dbc2c
==================================================================
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: syzbot+96cfd2b22b3213646a93@syzkaller.appspotmail.com
Tested on:
commit: 10f7cddd selftests/core: add regression test for CLOSE_RAN..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git vfs
kernel config: https://syzkaller.appspot.com/x/.config?x=5d42216b510180e3
dashboard link: https://syzkaller.appspot.com/bug?extid=96cfd2b22b3213646a93
compiler: gcc (GCC) 10.1.0-syz 20200507
Reported-by: syzbot+96cfd2b22b3213646a93@syzkaller.appspotmail.com
Fixes: 582f1fb6b7 ("fs, close_range: add flag CLOSE_RANGE_CLOEXEC")
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20201217213303.722643-1-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
(cherry picked from commit fec8a6a691)
Bug: 216276716
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I0c6bbd8a46b93293c212fff08728514915d5c690
When the flag CLOSE_RANGE_CLOEXEC is set, close_range doesn't
immediately close the files but it sets the close-on-exec bit.
It is useful for e.g. container runtimes that usually install a
seccomp profile "as late as possible" before execv'ing the container
process itself. The container runtime could either do:
1 2
- install_seccomp_profile(); - close_range(MIN_FD, MAX_INT, 0);
- close_range(MIN_FD, MAX_INT, 0); - install_seccomp_profile();
- execve(...); - execve(...);
Both alternative have some disadvantages.
In the first variant the seccomp_profile cannot block the close_range
syscall, as well as opendir/read/close/... for the fallback on older
kernels.
In the second variant, close_range() can be used only on the fds
that are not going to be needed by the runtime anymore, and it must be
potentially called multiple times to account for the different ranges
that must be closed.
Using close_range(..., ..., CLOSE_RANGE_CLOEXEC) solves these issues.
The runtime is able to use the existing open fds, the seccomp profile
can block close_range() and the syscalls used for its fallback.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Link: https://lore.kernel.org/r/20201118104746.873084-2-gscrivan@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
(cherry picked from commit 582f1fb6b7)
Bug: 216276716
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Ib2d44f9760a80e3febdb25925d17ce5ffd14910e
CROSS_COMPILE is no longer necessary when building with LLVM=1 as of
upstream v5.15-rc1's
commit 231ad7f409 ("Makefile: infer --target from ARCH for CC=clang")
which was backported to android13-5.10.
LLVM_IAS=1 is implied by LLVM=1 as of LLVM_IAS as of
upstream v5.15-rc1's
commit f12b034afe ("scripts/Makefile.clang: default to LLVM_IAS=1")
which was backported to android13-5.10.
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Id426f680bba8d478ed300dd0f6551dc419ea345f
It is possible that fget returns NULL. This needs to be handled
correctly in ioctl_permit_fill.
Bug: 212821226
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: Iec8be21982afeab6794b78ab1a542671c52acea2
Due to LTO being backported to android's 5.10 branches first, the
backport of ba64beb174 dropped this. Now
that we have backports for checking AS_IS_LLVM, use that so that we
don't accidentally prevent LTO from being selectable.
Bug: 1938905
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I93fe13dba285bbb9fa1ae0e111bd2955c0b87c12
LLVM_IAS=1 controls enabling clang's integrated assembler via
-integrated-as. This was an explicit opt in until we could enable
assembler support in Clang for more architecures. Now we have support
and CI coverage of LLVM_IAS=1 for all architecures except a few more
bugs affecting s390 and powerpc.
This commit flips the default from opt in via LLVM_IAS=1 to opt out via
LLVM_IAS=0. CI systems or developers that were previously doing builds
with CC=clang or LLVM=1 without explicitly setting LLVM_IAS must now
explicitly opt out via LLVM_IAS=0, otherwise they will be implicitly
opted-in.
This finally shortens the command line invocation when cross compiling
with LLVM to simply:
$ make ARCH=arm64 LLVM=1
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
[nd: conflict in Documentation/kbuild/llvm.rst]
(cherry picked from commit f12b034afe)
Change-Id: I026d42aa40c83eb149c8e39fe91a0b0164ff71b0
We get constant feedback that the command line invocation of make is too
long when compiling with LLVM. CROSS_COMPILE is helpful when a toolchain
has a prefix of the target triple, or is an absolute path outside of
$PATH.
Since a Clang binary is generally multi-targeted, we can infer a given
target from SRCARCH/ARCH. If CROSS_COMPILE is not set, simply set
--target= for CLANG_FLAGS, KBUILD_CFLAGS, and KBUILD_AFLAGS based on
$SRCARCH.
Previously, we'd cross compile via:
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make LLVM=1 LLVM_IAS=1
Now:
$ ARCH=arm64 make LLVM=1 LLVM_IAS=1
For native builds (not involving cross compilation) we now explicitly
specify a target triple rather than rely on the implicit host triple.
Link: https://github.com/ClangBuiltLinux/linux/issues/1399
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Arnd Bergmann <arnd@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
(cherry picked from commit 231ad7f409)
Change-Id: I6f61a4937e6af22deda66eeaac6c667c889dc683
With some of the changes we'd like to make to CROSS_COMPILE, the initial
block of clang flag handling which controls things like the target triple,
whether or not to use the integrated assembler and how to find GAS,
and erroring on unknown warnings is becoming unwieldy. Move it into its
own file under scripts/.
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
[nd: conflict in MAINTAINERS]
(cherry picked from commit 6f5b41a2f5)
Change-Id: I350ba8aaf924f7b373d25ef67260e6373a405d12
It is always safe to use the same compiler for the kernel and external
modules, but in reality, some distributions such as Fedora release a
different version of GCC from the one used for building the kernel.
There was a long discussion about mixing different compilers [1].
I do not repeat it here, but at least, showing a heads up in that
case is better than nothing.
Linus suggested [2]:
And a warning might be more palatable even if different compiler
version work fine together. Just a heads up on "it looks like you
might be mixing compiler versions" is a valid note, and isn't
necessarily wrong. Even when they work well together, maybe you want
to have people at least _aware_ of it.
This commit shows a warning unless the compiler is exactly the same.
warning: the compiler differs from the one used to build the kernel
The kernel was built by: gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3)
You are using: gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
Check the difference, and if it is OK with you, please proceed at your
risk.
To avoid the locale issue as in commit bcbcf50f52 ("kbuild: fix
ld-version.sh to not be affected by locale"), pass LC_ALL=C to
"$(CC) --version".
[1] https://lore.kernel.org/linux-hardening/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com/
[2] https://lore.kernel.org/lkml/CAHk-=wgjwhDy-y4mQh34L+2aF=n6BjzHdqAW2=8wri5x7O04pA@mail.gmail.com/
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 6072b2c49d)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Ibbeeca6c7139f1d03621a8ba2137bafa32c14474
scripts/Kbuild.include is included everywhere, but macros such as
cc-option are needed by build targets only.
For example, when 'make clean' traverses the tree, it does not need
to evaluate $(call cc-option,).
Split cc-option, ld-option, etc. to scripts/Makefile.compiler, which
is only included from the top Makefile and scripts/Makefile.build.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 209655537
(cherry picked from commit 57fd251c78)
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I7cef781922a192a9bcd21152a74ff2cec0527ab0
VPATH is used in Kbuild to make pattern rules search for prerequisites
in both $(objtree) and $(srctree). Some of *.c, *.S files are not real
sources, but generated by tools such as flex, bison, perl.
In contrast, I doubt the benefit of --include-dir=$(abs_srctree) because
it is always clear which Makefiles are real sources, and which are not.
So, my hope is to add $(srctree)/ prefix to all check-in Makefiles,
then remove --include-dir=$(abs_srctree) flag in the future.
I am touching only some Kbuild core parts for now. Treewide fixes will
be needed to achieve this goal.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 3204a7fb98)
[nd: conflict in scripts/Makefile.modpost due to d7a956441f, which is
itself a backport of 850ded46c6]
Bug: 189327973
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I691d266b68d769ec0fabcc98ed515cfb1cad5ccf
For simple text replacement, it is better to use a built-in function
instead of sed if possible. You can save one process forking.
I do not mean to replace all sed invocations because GNU Make itself
does not support regular expression (unless you use guile).
I just replaced simple ones.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 6e0839fda3)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I92c893dfb17055db0b8a8ca2f39ddfa5f31a118d
When building with LLVM_IAS=1, there is no point to specifying
'--prefix=' because that flag is only used to find GNU cross tools,
which will not be used indirectly when using the integrated assembler.
All of the tools are invoked directly from PATH or a full path specified
via the command line, which does not depend on the value of '--prefix='.
Sharing commands to reproduce issues becomes a little bit easier without
a '--prefix=' value because that '--prefix=' value is specific to a
user's machine due to it being an absolute path.
Some further notes from Fangrui Song:
clang can spawn GNU as (if -f?no-integrated-as is specified) and GNU
objcopy (-f?no-integrated-as and -gsplit-dwarf and -g[123]).
objcopy is only used for GNU as assembled object files.
With integrated assembler, the object file streamer creates .o and
.dwo simultaneously.
With GNU as, two objcopy commands are needed to extract .debug*.dwo to
.dwo files && another command to remove .debug*.dwo sections.
A small consequence of this change (to keep things simple) is that
'--prefix=' will always be specified now, even with a native build, when
it was not before. This should not be an issue due to the way that the
Makefile searches for the prefix (based on elfedit's location). This
ends up improving the experience for host builds because PATH is better
respected and matches GCC's behavior more closely. See the below thread
for more details:
https://lore.kernel.org/r/20210205213651.GA16907@Ryzen-5-4500U.localdomain/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit eec08090bc)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I90f232a1551afb9118176a61ac5de38384a171fd
This flag was originally added to allow clang to find the GNU cross
tools in commit 785f11aa59 ("kbuild: Add better clang cross build
support"). This flag was not enough to find the tools at times so
'--prefix' was added to the list in commit ef8c4ed9db ("kbuild: allow
to use GCC toolchain not in Clang search path") and improved upon in
commit ca9b31f6bb ("Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang
cross compilation"). Now that '--prefix' specifies a full path and
prefix, '--gcc-toolchain' serves no purpose because the kernel builds
with '-nostdinc' and '-nostdlib'.
This has been verified with self compiled LLVM 10.0.1 and LLVM 13.0.0 as
well as a distribution version of LLVM 11.1.0 without binutils in the
LLVM toolchain locations.
Link: https://reviews.llvm.org/D97902
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit c91d4e47e1)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Ie1f64ed8562964ceba21e402a3d97850f6471508
There is no direct user of ld-version; you can use CONFIG_LD_VERSION
if needed.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Bug: 209655537
(cherry picked from commit 05f6bbf2d7)
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: If7f82474beea2822340ee8503de5087336766223
As Documentation/kbuild/llvm.rst notes, LLVM=1 switches the default of
tools, but you can still override CC, LD, etc. individually. This LLVM=1
check is unneeded because each tool is already checked separately.
"make CC=clang LD=ld.lld NM=llvm-nm AR=llvm-ar LLVM_IAS=1 menuconfig"
should be able to enable Clang LTO.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Bug: 209655537
(cherry picked from commit 4c273d23c4)
Change-Id: Ieb1f4bbf98521b929c42253260a9a5aec25e73e2
When using AMD's Optimizing C/C++ Compiler (AOCC), the build fails due
to a # character in the version string, which is interpreted as a
comment:
$ make CC=clang defconfig init/main.o
include/config/auto.conf.cmd:1374: *** invalid syntax in conditional. Stop.
$ sed -n 1374p include/config/auto.conf.cmd
ifneq "$(CC_VERSION_TEXT)" "AMD clang version 11.0.0 (CLANG: AOCC_2.3.0-Build#85 2020_11_10) (based on LLVM Mirror.Version.11.0.0)"
Remove all # characters in the version string so that the build does not
fail unexpectedly.
Link: https://github.com/ClangBuiltLinux/linux/issues/1298
Reported-by: Michael Fuckner <michael@fuckner.net>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit c75173a269)
Bug: 209655537
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Ie2ba7041325f13865b59b43517111674519ce386
I noticed we're invoking $(CC) via $(shell) more than once to check the
version. Let's reuse the first string captured in $CC_VERSION_TEXT.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
[masahiro.yamada:
CC_VERSION_TEXT is assigned by = instead of :=, so this $(shell ) is
evaluated multiple times anyway. The number of $(CC) invocations will
be still the same. Replacing 'grep' with the built-in $(findstring )
will give real performance benefit.]
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit db07562aea)
Bug: 209655537
Change-Id: Ib8745c014472a9d63e51e98f9474cced778d57df
Reason for revert: Fails init on devices with no DICE DT nodes
Reverted Changes:
I035ad0998:ANDROID: GKI: update virtual device symbol list
I0b2ec380d:ANDROID: arm64: Enable CONFIG_OPEN_DICE=m
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: Ieca199d3fa4ad5f574558f55d49a61be2f0a36ff
Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers
from errata, where a TSB (trace synchronization barrier)
fails to flush the trace data completely, when executed from
a trace prohibited region. In Linux we always execute it
after we have moved the PE to trace prohibited region. So,
we can apply the workaround every time a TSB is executed.
The work around is to issue two TSB consecutively.
NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying
that a late CPU could be blocked from booting if it is the
first CPU that requires the workaround. This is because we
do not allow setting a cpu_hwcaps after the SMP boot. The
other alternative is to use "this_cpu_has_cap()" instead
of the faster system wide check, which may be a bit of an
overhead, given we may have to do this in nvhe KVM host
before a guest entry.
Bug: 213931796
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit fa82d0b4b8)
[Fix conflict due to another workaround that is not backported
(TRBE_OVERWRITE). Also manually update cpucaps.h which is autogenerated
in upstream from arch/arm64/tools/cpucaps which we ignored as part of
the conflict resolution]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I37403e5a51de2b7015ee6e3e94ee690e66010890
The Trace Filtering support (FEAT_TRF) ensures that the ETM
can be prohibited from generating any trace for a given EL.
This is much stricter knob, than the TRCVICTLR exception level
masks, which doesn't prevent the ETM from generating Context
packets for an "excluded" EL. At the moment, we do a onetime
enable trace at user and kernel and leave it untouched for the
kernel life time. This implies that the ETM could potentially
generate trace packets containing the kernel addresses, and
thus leaking the kernel virtual address in the trace.
This patch makes the switch dynamic, by honoring the filters
set by the user and enforcing them in the TRFCR controls.
We also rename the cpu_enable_tracing() appropriately to
cpu_detect_trace_filtering() and the drvdata member
trfc => trfcr to indicate the "value" of the TRFCR_EL1.
Bug: 213931796
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-3-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit 5f6fd1aa8c)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I20432aa7cfd4375e1cdb9d256a2adc9df548d68e
The AUX bounce buffer is allocated with API dma_alloc_coherent(), in the
low level's architecture code, e.g. for Arm64, it maps the memory with
the attribution "Normal non-cacheable"; this can be concluded from the
definition for pgprot_dmacoherent() in arch/arm64/include/asm/pgtable.h.
Later when access the AUX bounce buffer, since the memory mapping is
non-cacheable, it's low efficiency due to every load instruction must
reach out DRAM.
This patch changes to allocate pages with dma_alloc_noncoherent(), the
driver can access the memory via cacheable mapping; therefore, load
instructions can fetch data from cache lines rather than always read
data from DRAM, the driver can boost memory performance. After using
the cacheable mapping, the driver uses dma_sync_single_for_cpu() to
invalidate cacheline prior to read bounce buffer so can avoid read stale
trace data.
By measurement the duration for function tmc_update_etr_buffer() with
ftrace function_graph tracer, it shows the performance significant
improvement for copying 4MiB data from bounce buffer:
# echo tmc_etr_get_data_flat_buf > set_graph_notrace // avoid noise
# echo tmc_update_etr_buffer > set_graph_function
# echo function_graph > current_tracer
before:
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) | tmc_update_etr_buffer() {
...
2) # 8148.320 us | }
after:
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) | tmc_update_etr_buffer() {
...
2) # 2525.420 us | }
Bug: 213931796
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210905032144.966766-1-leo.yan@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit 0abd076217)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I271f3882e1bdafb8053aecdb0948584a287b499f
Injecting an exception into a guest with non-VHE is risky business.
Instead of writing in the shadow register for the switch code to
restore it, we override the CPU register instead. Which gets
overriden a few instructions later by said restore code.
The result is that although the guest correctly gets the exception,
it will return to the original context in some random state,
depending on what was there the first place... Boo.
Fix the issue by writing to the shadow register. The original code
is absolutely fine on VHE, as the state is already loaded, and writing
to the shadow register in that case would actually be a bug.
Fixes: bb666c472c ("KVM: arm64: Inject AArch64 exceptions from HYP")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20220121184207.423426-1-maz@kernel.org
(cherry picked from commit 278583055a
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fixes)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ifc1504c4be69e36a0a70033b36ee148c337ac7b5
Contrary to what df652bcf11 ("KVM: arm64: vgic-v3: Work around GICv3
locally generated SErrors") was asserting, there is at least one other
system out there (Cavium ThunderX2) implementing SEIS, and not in
an obviously broken way.
So instead of imposing the M1 workaround on an innocent bystander,
let's limit it to the two known broken Apple implementations.
Fixes: df652bcf11 ("KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors")
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220122103912.795026-1-maz@kernel.org
(cherry picked from commit d11a327ed9
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fixes)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I4c1b8faab30744f81b012e03b31cdd8254614f87
CMOs issued from EL2 cannot directly use the kernel helpers,
as EL2 doesn't have a mapping of the guest pages. Oops.
Instead, use the mm_ops indirection to use helpers that will
perform a mapping at EL2 and allow the CMO to be effective.
Fixes: 25aa28691b ("KVM: arm64: Move guest CMOs to the fault handlers")
Reviewed-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220114125038.1336965-1-maz@kernel.org
(cherry picked from commit 094d00f8ca
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fixes)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Id020c5adb0ce0d02bc6cd344e7d01859539b9f4f
This reverts commit 9ad5e201c7.
The corresponding patch has been queued in the kvmarm tree as a fix for
5.17, so revert our FROMLIST version in preparation for pulling in the
upstream queue instead.
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I7f65941b2b13125c6c6397590abbc3e18cb54496
The implementor will be used to condition the FIQ support quirk.
The specific CPU types are not used at the moment, but let's add them
for documentation purposes.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
(cherry picked from commit 11ecdad722)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: If84141e50dfb52258c9ecfca9979f2d77e9a5197
If a vcpu exits for a data abort with an invalid syndrome, the
expectations are that userspace has a chance to save the day if
it has requested to see such exits.
However, this is completely futile in the case of a protected VM,
as none of the state is available. In this particular case, inject
a data abort directly into the vcpu, consistent with what userspace
could do.
This also helps with pKVM, which discards all syndrome information when
forwarding data aborts that are not known to be MMIO.
Finally, hide the RETURN_NISV_IO_ABORT_TO_USER cap from userspace on
protected VMs, and document this tweak to the API.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Bug: 209580772
[willdeacon@: Drop KVM_CAP_SET_GUEST_DEBUG2 from context]
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I4090c7c266b27776089cac69efd489264ed003cf
Commit 3f6536412f ("ANDROID: KVM: arm64: refactor vcpu_read_sys_reg
and vcpu_write_sys_reg for hyp use") predicated direct access to the
live vCPU registers on an is_vhe_hyp_code() check, neglecting the fact
that these functions are also used by the VHE *kernel* code.
Restore the old behaviour by changing the check so that only the nVHE
hyp code unconditionally uses the 'ctxt_sys_reg' table.
Reported-by: Marc Zyngier <mzyngier@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Bug: 209580772
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I17d4c12ffdbbd95e8e8d1820ebb6438b138361aa
Typically, TLB invalidation of guest stage-2 mappings using nVHE is
performed by a hypercall originating from the host. For the invalidation
instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
swizzle the active stage-2 context around the TLBI instruction.
With guest-to-host memory sharing and unsharing hypercalls originating
from the guest under pKVM, there is now a need to support both guest
and host VMID invalidations issued from guest context.
Replace the __tlb_switch_to_{guest,host}() functions with a more general
{enter,exit}_vmid_context() implementation which supports being invoked
from guest context and acts as a no-op if the target context matches the
running context.
Bug: 209580772
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I259b1acd11f25143080af2deaab820746090436d
We should only apply the mask ourselves if we're not
using POSIX acls
Bug: 215212818
Test: com.android.cts.externalstorageapp.CommonExternalStorageTest
#testAllPackageDirsWritable, verify files and cache permissions
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: If105afe62a60b93cbce1ca5ab5caf11f008aa7db
The upstream change to make f2fs use iomap for direct I/O was backported
to this kernel branch, which broke the out-of-tree support for fscrypt
direct I/O because f2fs_iomap_begin() isn't aware of encryption. Make
the needed change to f2fs_iomap_begin(), matching what I've proposed
upstream at
https://lore.kernel.org/r/20220120071215.123274-5-ebiggers@kernel.org.
Also drop the fscrypt support from fs/direct-io.c, which is no longer
used since both ext4 and f2fs now use iomap for direct I/O.
Bug: 162255927
Bug: 215554521
Fixes: 6d54ce0108 ("Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.10.y' into android13-5.10")
Change-Id: I6b99b623ad3b8a86099c260787b2086b415a0e12
Signed-off-by: Eric Biggers <ebiggers@google.com>
Changes in 5.10.93
kbuild: Add $(KBUILD_HOSTLDFLAGS) to 'has_libelf' test
devtmpfs regression fix: reconfigure on each mount
orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc()
remoteproc: qcom: pil_info: Don't memcpy_toio more than is provided
vfs: fs_context: fix up param length parsing in legacy_parse_param
perf: Protect perf_guest_cbs with RCU
KVM: x86: Register Processor Trace interrupt hook iff PT enabled in guest
KVM: s390: Clarify SIGP orders versus STOP/RESTART
9p: only copy valid iattrs in 9P2000.L setattr implementation
video: vga16fb: Only probe for EGA and VGA 16 color graphic cards
media: uvcvideo: fix division by zero at stream start
rtlwifi: rtl8192cu: Fix WARNING when calling local_irq_restore() with interrupts enabled
firmware: qemu_fw_cfg: fix sysfs information leak
firmware: qemu_fw_cfg: fix NULL-pointer deref on duplicate entries
firmware: qemu_fw_cfg: fix kobject leak in probe error path
KVM: x86: remove PMU FIXED_CTR3 from msrs_to_save_all
ALSA: hda/realtek: Add speaker fixup for some Yoga 15ITL5 devices
ALSA: hda/realtek - Fix silent output on Gigabyte X570 Aorus Master after reboot from Windows
ALSA: hda: ALC287: Add Lenovo IdeaPad Slim 9i 14ITL5 speaker quirk
ALSA: hda/realtek: Add quirk for Legion Y9000X 2020
ALSA: hda/realtek: Re-order quirk entries for Lenovo
powerpc/pseries: Get entry and uaccess flush required bits from H_GET_CPU_CHARACTERISTICS
mtd: fixup CFI on ixp4xx
Linux 5.10.93
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I068bb827060f65df5c11bdffcd93090f2ab21148