Select HAVE_KVM at all times on arm64, as the OF requirement is
always there (even in the case of an ACPI system, we still depend
on some of the OF infrastructure), and won't fo away.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Will Deacon <will@kernel.org>
[maz: Drop the "HAVE_KVM if OF" dependency, as OF is always there on arm64,
new commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210921222231.518092-3-seanjc@google.com
(cherry picked from commit e26bb75aa2)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I64236d409454463bf0f2b0a2efb2e5a00d8f3deb
Unconditionally "source" the generic KVM Kconfig instead of wrapping it
with KVM=y. A future patch will select HAVE_KVM so that referencing
HAVE_KVM in common kernel code doesn't break, and because KVM=y and
HAVE_KVM=n is weird. Source the generic KVM Kconfig unconditionally so
that HAVE_KVM and KVM don't end up with a circular dependency.
Note, all but one of generic KVM's "configs" are of the HAVE_XYZ nature,
and the one outlier correctly takes a dependency on CONFIG_KVM, i.e. the
generic Kconfig is intended to be included unconditionally.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
[maz: made NVHE_EL2_DEBUG depend on KVM]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210921222231.518092-2-seanjc@google.com
(cherry picked from commit c8f1e96734)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I94e4a0666f5da3ff17ebfa0068eec5a9358556f3
Although KVM can be compiled out of the kernel, it cannot be disabled
at runtime. Allow this possibility by introducing a new mode that
will prevent KVM from initialising.
This is useful in the (limited) circumstances where you don't want
KVM to be available (what is wrong with you?), or when you want
to install another hypervisor instead (good luck with that).
Reviewed-by: David Brazdil <dbrazdil@google.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Scull <ascull@google.com>
Link: https://lore.kernel.org/r/20211001170553.3062988-1-maz@kernel.org
(cherry picked from commit b6a68b97af)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ie796c716fc7cece906a8cded0ae4652a828988bb
Verify that the GICv2 CPU interface does not extend beyond the
VM-specified IPA range (phys_size).
base + size > phys_size AND base < phys_size
Add the missing check into kvm_vgic_addr() which is called when setting
the region. This patch also enables some superfluous checks for the
distributor (vgic_check_ioaddr was enough as alignment == size for the
distributors).
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-4-ricarkol@google.com
(cherry picked from commit c56a87da0a)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ib462da73c1f819152ee799507f037126db02ebbe
Verify that the redistributor regions do not extend beyond the
VM-specified IPA range (phys_size). This can happen when using
KVM_VGIC_V3_ADDR_TYPE_REDIST or KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONS
with:
base + size > phys_size AND base < phys_size
Add the missing check into vgic_v3_alloc_redist_region() which is called
when setting the regions, and into vgic_v3_check_base() which is called
when attempting the first vcpu-run. The vcpu-run check does not apply to
KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONS because the regions size is known
before the first vcpu-run. Note that using the REDIST_REGIONS API
results in a different check, which already exists, at first vcpu run:
that the number of redist regions is enough for all vcpus.
Finally, this patch also enables some extra tests in
vgic_v3_alloc_redist_region() by calculating "size" early for the legacy
redist api: like checking that the REDIST region can fit all the already
created vcpus.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-3-ricarkol@google.com
(cherry picked from commit 4612d98f58)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I9c5877cb305621d206e36250aa970a55926d4b5e
Add the new vgic_check_iorange helper that checks that an iorange is
sane: the start address and size have valid alignments, the range is
within the addressable PA range, start+size doesn't overflow, and the
start wasn't already defined.
No functional change.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-2-ricarkol@google.com
(cherry picked from commit f25c5e4daf)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ic702fe5c28dc1c7a7bc5975703acee084b8a7b6b
After pKVM has been 'finalised' using the __pkvm_prot_finalize hypercall,
the calling CPU will have a Stage-2 translation enabled to prevent access
to memory pages owned by EL2.
Although this forms a significant part of the process to deprivilege the
host kernel, we also need to ensure that the hypercall interface is
reduced so that the EL2 code cannot, for example, be re-initialised using
a new set of vectors.
Re-order the hypercalls so that only a suffix remains available after
finalisation of pKVM.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-7-will@kernel.org
(cherry picked from commit 057bed206f)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I61e57c7bff456d40866acd01475b2e9ace1e1bd6
__pkvm_prot_finalize() completes the deprivilege of the host when pKVM
is in use by installing a stage-2 translation table for the calling CPU.
Issuing the hypercall multiple times for a given CPU makes little sense,
but in such a case just return early with -EPERM rather than go through
the whole page-table dance again.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-6-will@kernel.org
(cherry picked from commit 07036cffe1)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I79dc15319bd3dc09233d95cfaa07d020e8a57973
If the __pkvm_prot_finalize hypercall returns an error, we WARN but fail
to propagate the failure code back to kvm_arch_init().
Pass a pointer to a zero-initialised return variable so that failure
to finalise the pKVM protections on a host CPU can be reported back to
KVM.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-5-will@kernel.org
(cherry picked from commit 2f2e1a5069)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I83610a95046341a7429e09ead03ed08b87e0873a
The stub hypercalls provide mechanisms to reset and replace the EL2 code,
so uninstall them once pKVM has been initialised in order to ensure the
integrity of the hypervisor code.
To ensure pKVM initialisation remains functional, split cpu_hyp_reinit()
into two helper functions to separate usage of the stub from usage of
pkvm hypercalls either side of __pkvm_init on the boot CPU.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-4-will@kernel.org
(cherry picked from commit 8579a185ba)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I6ad3affee8119fb312be91c2e6555103d12089cd
When pKVM is enabled, the hypervisor code at EL2 and its data structures
are inaccessible to the host kernel and cannot be torn down or replaced
as this would defeat the integrity properies which pKVM aims to provide.
Furthermore, the ABI between the host and EL2 is flexible and private to
whatever the current implementation of KVM requires and so booting a new
kernel with an old EL2 component is very likely to end in disaster.
In preparation for uninstalling the hyp stub calls which are relied upon
to reset EL2, disable kexec and hibernation in the host when protected
KVM is enabled.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-3-will@kernel.org
(cherry picked from commit 8f4566f18d)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ib85048668b185268dc95374f74b0b5bfa65fc33f
__KVM_HOST_SMCCC_FUNC_* is a royal pain, as there is a fair amount
of churn around these #defines, and we avoid making it an enum
only for the sake of the early init, low level code that requires
__KVM_HOST_SMCCC_FUNC___kvm_hyp_init to be usable from assembly.
Let's be brave and turn everything but this symbol into an enum,
using a bit of arithmetic to avoid any overlap.
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/877depq9gw.wl-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-2-will@kernel.org
(cherry picked from commit a78738ed1d)
Bug: 204960018
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I4cad5753673fb3959ce230e0a8cb9f5e3ab16877
Vendor architectures may contain CPUs running on the same clock line
which contain different capacities. Add a tracehook in this path to
allow vendor modules to skip implicit check to prevent crashes.
Bug: 206602617
Change-Id: Ica01a214689607b8d79b370c20bc9a8c44ca2117
Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>
GKI required the KMI_GENERATION to be added to the kernel version
string, but this only makes sense for GKI kernels, for non-GKI kernels
we don't need it. Leave all the other stuff we added, though, as it
seems useful.
Bug: 205897686
Change-Id: I2e7b3cb7ed5529f1e5e7c9d79a1f7ce4a1b6ee1f
Signed-off-by: Alistair Delva <adelva@google.com>
This reverts commit 77dfeaa02d.
Throttling can now be done using hist triggers and synthetic events
Bug: 145972256
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Change-Id: I39c284040e2fdb815cda980f3a40ef188e59287c
Using old_creds as an indication that we are not overriding the
credentials, bypass call to inode_owner_or_capable. This solves
a problem with all execv calls being blocked when using the caller's
credentials.
Bug: 204981027
Link: https://lore.kernel.org/lkml/20211117015806.2192263-5-dvander@google.com
Change-Id: Ifa966dabda7413873614d1da24629dc8054db131
Signed-off-by: David Anderson <dvander@google.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
By default, all access to the upper, lower and work directories is the
recorded mounter's MAC and DAC credentials. The incoming accesses are
checked against the caller's credentials.
If the principles of least privilege are applied, the mounter's
credentials might not overlap the credentials of the caller's when
accessing the overlayfs filesystem. For example, a file that a lower
DAC privileged caller can execute, is MAC denied to the generally
higher DAC privileged mounter, to prevent an attack vector.
We add the option to turn off override_creds in the mount options; all
subsequent operations after mount on the filesystem will be only the
caller's credentials. The module boolean parameter and mount option
override_creds is also added as a presence check for this "feature",
existence of /sys/module/overlay/parameters/override_creds.
It was not always this way. Circa 4.6 there was no recorded mounter's
credentials, instead privileged access to upper or work directories
were temporarily increased to perform the operations. The MAC
(selinux) policies were caller's in all cases. override_creds=off
partially returns us to this older access model minus the insecure
temporary credential increases. This is to permit use in a system
with non-overlapping security models for each executable including
the agent that mounts the overlayfs filesystem. In Android
this is the case since init, which performs the mount operations,
has a minimal MAC set of privileges to reduce any attack surface,
and services that use the content have a different set of MAC
privileges (eg: read, for vendor labelled configuration, execute for
vendor libraries and modules). The caveats are not a problem in
the Android usage model, however they should be fixed for
completeness and for general use in time.
Bug: 204981027
Link: https://lore.kernel.org/lkml/20211117015806.2192263-4-dvander@google.com
Change-Id: I46e6c74ff634eb064cf9d714017432171a898890
Signed-off-by: David Anderson <dvander@google.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
__vfs_getxattr({dentry...XATTR_NOSECURITY}) ->
handler->get({dentry...XATTR_NOSECURITY}) ->
__vfs_getxattr({realdentry...XATTR_NOSECURITY}) ->
lower_handler->get({realdentry...XATTR_NOSECURITY}) which
would report back through the chain data and success as expected,
the logging security layer at the top would have the data to
determine the access permissions and report back to the logs and
the caller that the target context was blocked.
For selinux this would solve the cosmetic issue of the selinux log
and allow audit2allow to correctly report the rule needed to address
the access problem.
Check impure, opaque, origin & meta xattr with no sepolicy audit
(using __vfs_getxattr) since these operations are internal to
overlayfs operations and do not disclose any data. This became
an issue for credential override off since sys_admin would have
been required by the caller; whereas would have been inherently
present for the creator since it performed the mount.
This is a change in operations since we do not check in the new
ovl_do_getxattr function if the credential override is off or not.
Reasoning is that the sepolicy check is unnecessary overhead,
especially since the check can be expensive.
Because for override credentials off, this affects _everyone_ that
underneath performs private xattr calls without the appropriate
sepolicy permissions and sys_admin capability. Providing blanket
support for sys_admin would be bad for all possible callers.
For the override credentials on, this will affect only the mounter,
should it lack sepolicy permissions. Not considered a security
problem since mounting by definition has sys_admin capabilities,
but sepolicy contexts would still need to be crafted.
It should be noted that there is precedence, __vfs_getxattr is used
in other filesystems for their own internal trusted xattr management.
Change-Id: I0b8fe9f1fe6c763fbd27a09c6de8209d1dc9d2f7
Signed-off-by: David Anderson <dvander@google.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Link: https://lore.kernel.org/lkml/20211117015806.2192263-3-dvander@google.com
Bug: 204981027
Add a flag option to get xattr method that could have a bit flag of
XATTR_NOSECURITY passed to it. XATTR_NOSECURITY is generally then
set in the __vfs_getxattr path when called by security
infrastructure.
This handles the case of a union filesystem driver that is being
requested by the security layer to report back the xattr data.
For the use case where access is to be blocked by the security layer.
The path then could be security(dentry) ->
__vfs_getxattr(dentry...XATTR_NOSECURITY) ->
handler->get(dentry...XATTR_NOSECURITY) ->
__vfs_getxattr(lower_dentry...XATTR_NOSECURITY) ->
lower_handler->get(lower_dentry...XATTR_NOSECURITY)
which would report back through the chain data and success as
expected, the logging security layer at the top would have the
data to determine the access permissions and report back the target
context that was blocked.
Without the get handler flag, the path on a union filesystem would be
the errant security(dentry) -> __vfs_getxattr(dentry) ->
handler->get(dentry) -> vfs_getxattr(lower_dentry) -> nested ->
security(lower_dentry, log off) -> lower_handler->get(lower_dentry)
which would report back through the chain no data, and -EACCES.
For selinux for both cases, this would translate to a correctly
determined blocked access. In the first case with this change a correct avc
log would be reported, in the second legacy case an incorrect avc log
would be reported against an uninitialized u:object_r:unlabeled:s0
context making the logs cosmetically useless for audit2allow.
This patch series is inert and is the wide-spread addition of the
flags option for xattr functions, and a replacement of __vfs_getxattr
with __vfs_getxattr(...XATTR_NOSECURITY).
Bug: 204981027
Link: https://lore.kernel.org/lkml/20211117015806.2192263-2-dvander@google.com
Change-Id: Id2c6fa6eeb2b5cca5a11e0cd02a3fbf2a5fcbef4
Signed-off-by: David Anderson <dvander@google.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
If a binary operation is detected while parsing an expression string,
the operand strings are deduced by splitting the experssion string at
the position of the detected binary operator. Both operand strings are
sub-strings (can be empty string) of the expression string but will
never be NULL.
Currently a NULL check is used for missing operands, fix this by
checking for empty strings instead.
Link: https://lkml.kernel.org/r/20211112191324.1302505-1-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Fixes: 9710b2f341 ("tracing: Fix operator precedence for hist triggers expression")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
(cherry picked from commit 1cab6bce42)
Bug: 146055070
Bug: 145972256
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Change-Id: Ic19c0ac55ea3519756aacd3cf92b0fbed0ad5304
commit a44f9d6f9d upstream.
There is a wrong comparison of the total size of the loaded firmware
css->fw->size with the size of a pointer to struct imgu_fw_header.
Turn binary_header into a flexible-array member[1][2], use the
struct_size() helper and fix the wrong size comparison. Notice
that the loaded firmware needs to contain at least one 'struct
imgu_fw_info' item in the binary_header[] array.
It's also worth mentioning that
"css->fw->size < struct_size(css->fwp, binary_header, 1)"
with binary_header declared as a flexible-array member is equivalent
to
"css->fw->size < sizeof(struct imgu_fw_header)"
with binary_header declared as a one-element array (as in the original
code).
The replacement of the one-element array with a flexible-array member
also helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Fixes: 09d290f0ba ("media: staging/intel-ipu3: css: Add support for firmware management")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a56d3e40bd upstream.
USB bulk and interrupt message timeouts are specified in milliseconds
and should specifically not vary with CONFIG_HZ.
Note that the bulk-out transfer timeout was set to the endpoint
bInterval value, which should be ignored for bulk endpoints and is
typically set to zero. This meant that a failing bulk-out transfer
would never time out.
Assume that the 10 second timeout used for all other transfers is more
than enough also for the bulk-out endpoint.
Fixes: 985cafccbf ("Staging: Comedi: vmk80xx: Add k8061 support")
Fixes: 951348b377 ("staging: comedi: vmk80xx: wait for URBs to complete")
Cc: stable@vger.kernel.org # 2.6.31
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20211025114532.4599-6-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a23461c474 upstream.
The driver uses endpoint-sized USB transfer buffers but up until
recently had no sanity checks on the sizes.
Commit e1f13c879a ("staging: comedi: check validity of wMaxPacketSize
of usb endpoints found") inadvertently fixed NULL-pointer dereferences
when accessing the transfer buffers in case a malicious device has a
zero wMaxPacketSize.
Make sure to allocate buffers large enough to handle also the other
accesses that are done without a size check (e.g. byte 18 in
vmk80xx_cnt_insn_read() for the VMK8061_MODEL) to avoid writing beyond
the buffers, for example, when doing descriptor fuzzing.
The original driver was for a low-speed device with 8-byte buffers.
Support was later added for a device that uses bulk transfers and is
presumably a full-speed device with a maximum 64-byte wMaxPacketSize.
Fixes: 985cafccbf ("Staging: Comedi: vmk80xx: Add k8061 support")
Cc: stable@vger.kernel.org # 2.6.31
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20211025114532.4599-4-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 536de747bc upstream.
USB transfer buffers are typically mapped for DMA and must not be
allocated on the stack or transfers will fail.
Allocate proper transfer buffers in the various command helpers and
return an error on short transfers instead of acting on random stack
data.
Note that this also fixes a stack info leak on systems where DMA is not
used as 32 bytes are always sent to the device regardless of how short
the command is.
Fixes: 63274cd7d3 ("Staging: comedi: add usb dt9812 driver")
Cc: stable@vger.kernel.org # 2.6.29
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211027093529.30896-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c052cc1a06 upstream.
Syzbot reported use-after-free in rtl8712_dl_fw(). The problem was in
race condition between r871xu_dev_remove() ->ndo_open() callback.
It's easy to see from crash log, that driver accesses released firmware
in ->ndo_open() callback. It may happen, since driver was releasing
firmware _before_ unregistering netdev. Fix it by moving
unregister_netdev() before cleaning up resources.
Call Trace:
...
rtl871x_open_fw drivers/staging/rtl8712/hal_init.c:83 [inline]
rtl8712_dl_fw+0xd95/0xe10 drivers/staging/rtl8712/hal_init.c:170
rtl8712_hal_init drivers/staging/rtl8712/hal_init.c:330 [inline]
rtl871x_hal_init+0xae/0x180 drivers/staging/rtl8712/hal_init.c:394
netdev_open+0xe6/0x6c0 drivers/staging/rtl8712/os_intfs.c:380
__dev_open+0x2bc/0x4d0 net/core/dev.c:1484
Freed by task 1306:
...
release_firmware+0x1b/0x30 drivers/base/firmware_loader/main.c:1053
r871xu_dev_remove+0xcc/0x2c0 drivers/staging/rtl8712/usb_intf.c:599
usb_unbind_interface+0x1d8/0x8d0 drivers/usb/core/driver.c:458
Fixes: 8c213fa591 ("staging: r8712u: Use asynchronous firmware loading")
Cc: stable <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+c55162be492189fb4f51@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/20211019211718.26354-1-paskripkin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cf3f8133b upstream.
Commit ccaa66c8dd reinstated the kmap/kunmap that had been dropped in
commit 8c945d32e6 ("btrfs: compression: drop kmap/kunmap from lzo").
However, it seems to have done so incorrectly due to the change not
reverting cleanly, and lzo_decompress_bio() ended up not having a
matching "kunmap()" to the "kmap()" that was put back.
Also, any assert that the page pointer is not NULL should be before the
kmap() of said pointer, since otherwise you'd just oops in the kmap()
before the assert would even trigger.
I noticed this when trying to verify my btrfs merge, and things not
adding up. I'm doing this fixup before re-doing my merge, because this
commit needs to also be backported to 5.15 (after verification from the
btrfs people).
Fixes: ccaa66c8dd ("Revert 'btrfs: compression: drop kmap/kunmap from lzo'")
Cc: David Sterba <dsterba@suse.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f612ed3f7 upstream.
We have observed that on very large machines with newer CPUs, the static
key/branch switching delay is on the order of milliseconds. This is due
to the required broadcast IPIs, which simply does not scale well to
hundreds of CPUs (cores). If done too frequently, this can adversely
affect tail latencies of various workloads.
One workaround is to increase the sample interval to several seconds,
while decreasing sampled allocation coverage, but the problem still
exists and could still increase tail latencies.
As already noted in the Kconfig help text, there are trade-offs: at
lower sample intervals the dynamic branch results in better performance;
however, at very large sample intervals, the static keys mode can result
in better performance -- careful benchmarking is recommended.
Our initial benchmarking showed that with large enough sample intervals
and workloads stressing the allocator, the static keys mode was slightly
better. Evaluating and observing the possible system-wide side-effects
of the static-key-switching induced broadcast IPIs, however, was a blind
spot (in particular on large machines with 100s of cores).
Therefore, a major downside of the static keys mode is, unfortunately,
that it is hard to predict performance on new system architectures and
topologies, but also making conclusions about performance of new
workloads based on a limited set of benchmarks.
Most distributions will simply select the defaults, while targeting a
large variety of different workloads and system architectures. As such,
the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
only recommended after careful evaluation.
For reference, on x86-64 the condition in kfence_alloc() generates
exactly
2 instructions in the kmem_cache_alloc() fast-path:
| ...
| cmpl $0x0,0x1a8021c(%rip) # ffffffff82d560d0 <kfence_allocation_gate>
| je ffffffff812d6003 <kmem_cache_alloc+0x243>
| ...
which, given kfence_allocation_gate is infrequently modified, should be
well predicted by most CPUs.
Link: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07e8481d3c upstream.
Regardless of KFENCE mode (CONFIG_KFENCE_STATIC_KEYS: either using
static keys to gate allocations, or using a simple dynamic branch),
always use a static branch to avoid the dynamic branch in kfence_alloc()
if KFENCE was disabled at boot.
For CONFIG_KFENCE_STATIC_KEYS=n, this now avoids the dynamic branch if
KFENCE was disabled at boot.
To simplify, also unifies the location where kfence_allocation_gate is
read-checked to just be inline in kfence_alloc().
Link: https://lkml.kernel.org/r/20211019102524.2807208-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32e9f56a96 upstream.
When freeing txn buffers, binder_transaction_buffer_release()
attempts to detect whether the current context is the target by
comparing current->group_leader to proc->tsk. This is an unreliable
test. Instead explicitly pass an 'is_failure' boolean.
Detecting the sender was being used as a way to tell if the
transaction failed to be sent. When cleaning up after
failing to send a transaction, there is no need to close
the fds associated with a BINDER_TYPE_FDA object. Now
'is_failure' can be used to accurately detect this case.
Fixes: 44d8047f1d ("binder: use standard functions to allocate fds")
Cc: stable <stable@vger.kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211015233811.3532235-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52f8869337 upstream.
Since binder was integrated with selinux, it has passed
'struct task_struct' associated with the binder_proc
to represent the source and target of transactions.
The conversion of task to SID was then done in the hook
implementations. It turns out that there are race conditions
which can result in an incorrect security context being used.
Fix by using the 'struct cred' saved during binder_open and pass
it to the selinux subsystem.
Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables)
Fixes: 79af73079d ("Add security hooks to binder and implement the hooks for SELinux.")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29bc22ac5e upstream.
Save the 'struct cred' associated with a binder process
at initial open to avoid potential race conditions
when converting to an euid.
Set a transaction's sender_euid from the 'struct cred'
saved at binder_open() instead of looking up the euid
from the binder proc's 'struct task'. This ensures
the euid is associated with the security context that
of the task that opened binder.
Cc: stable@vger.kernel.org # 4.4+
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21b5fcdccb upstream.
musb_gadget_queue() adds the passed request to musb_ep::req_list. If the
endpoint is idle and it is the first request then it invokes
musb_queue_resume_work(). If the function returns an error then the
error is passed to the caller without any clean-up and the request
remains enqueued on the list. If the caller enqueues the request again
then the list corrupts.
Remove the request from the list on error.
Fixes: ea2f35c01d ("usb: musb: Fix sleeping function called from invalid context for hdrc glue")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Viraj Shah <viraj.shah@linutronix.de>
Link: https://lore.kernel.org/r/20211021093644.4734-1-viraj.shah@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0548b2690 upstream.
On 64-bit:
drivers/usb/gadget/udc/fsl_qe_udc.c: In function ‘qe_ep0_rx’:
drivers/usb/gadget/udc/fsl_qe_udc.c:842:13: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
842 | vaddr = (u32)phys_to_virt(in_be32(&bd->buf));
| ^
In file included from drivers/usb/gadget/udc/fsl_qe_udc.c:41:
drivers/usb/gadget/udc/fsl_qe_udc.c:843:28: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
843 | frame_set_data(pframe, (u8 *)vaddr);
| ^
The driver assumes physical and virtual addresses are 32-bit, hence it
cannot work on 64-bit platforms.
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20211027080849.3276289-1-geert@linux-m68k.org
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d5e7a28b1 upstream.
This is a new warning in clang top-of-tree (will be clang 14):
In file included from arch/x86/kvm/mmu/mmu.c:27:
arch/x86/kvm/mmu/spte.h:318:9: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
return __is_bad_mt_xwr(rsvd_check, spte) |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||
arch/x86/kvm/mmu/spte.h:318:9: note: cast one or both operands to int to silence this warning
The code is fine, but change it anyway to shut up this clever clogs
of a compiler.
Reported-by: torvic9@mailbox.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>