When reworking the vgic locking, the vgic distributor registration
got simplified, which was a very good cleanup. But just a tad too
radical, as we now register the *native* vgic only, ignoring the
GICv2-on-GICv3 that allows pre-historic VMs (or so I thought)
to run.
As it turns out, QEMU still defaults to GICv2 in some cases, and
this breaks Nathan's setup!
Fix it by propagating the *requested* vgic type rather than the
host's version.
Fixes: 59112e9c39 ("KVM: arm64: vgic: Fix a circular locking issue")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
link: https://lore.kernel.org/r/20230606221525.GA2269598@dev-arch.thelio-3990X
(cherry picked from commit 1caa71a7a6)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I3a74c9de0afd9a38f4ca8dd5d4ce27d1937a5705
vgic_its_create() changes the vgic state without holding the
config_lock, which triggers a lockdep warning in vgic_v4_init():
[ 358.667941] WARNING: CPU: 3 PID: 178 at arch/arm64/kvm/vgic/vgic-v4.c:245 vgic_v4_init+0x15c/0x7a8
...
[ 358.707410] vgic_v4_init+0x15c/0x7a8
[ 358.708550] vgic_its_create+0x37c/0x4a4
[ 358.709640] kvm_vm_ioctl+0x1518/0x2d80
[ 358.710688] __arm64_sys_ioctl+0x7ac/0x1ba8
[ 358.711960] invoke_syscall.constprop.0+0x70/0x1e0
[ 358.713245] do_el0_svc+0xe4/0x2d4
[ 358.714289] el0_svc+0x44/0x8c
[ 358.715329] el0t_64_sync_handler+0xf4/0x120
[ 358.716615] el0t_64_sync+0x190/0x194
Wrap the whole of vgic_its_create() with config_lock since, in addition
to calling vgic_v4_init(), it also modifies the global kvm->arch.vgic
state.
Fixes: f003277311 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-3-jean-philippe@linaro.org
(cherry picked from commit 9cf2f840c4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Id6319d5719181072b7202a814c71e9509c0ba865
Lockdep reports a circular lock dependency between the srcu and the
config_lock:
[ 262.179917] -> #1 (&kvm->srcu){.+.+}-{0:0}:
[ 262.182010] __synchronize_srcu+0xb0/0x224
[ 262.183422] synchronize_srcu_expedited+0x24/0x34
[ 262.184554] kvm_io_bus_register_dev+0x324/0x50c
[ 262.185650] vgic_register_redist_iodev+0x254/0x398
[ 262.186740] vgic_v3_set_redist_base+0x3b0/0x724
[ 262.188087] kvm_vgic_addr+0x364/0x600
[ 262.189189] vgic_set_common_attr+0x90/0x544
[ 262.190278] vgic_v3_set_attr+0x74/0x9c
[ 262.191432] kvm_device_ioctl+0x2a0/0x4e4
[ 262.192515] __arm64_sys_ioctl+0x7ac/0x1ba8
[ 262.193612] invoke_syscall.constprop.0+0x70/0x1e0
[ 262.195006] do_el0_svc+0xe4/0x2d4
[ 262.195929] el0_svc+0x44/0x8c
[ 262.196917] el0t_64_sync_handler+0xf4/0x120
[ 262.198238] el0t_64_sync+0x190/0x194
[ 262.199224]
[ 262.199224] -> #0 (&kvm->arch.config_lock){+.+.}-{3:3}:
[ 262.201094] __lock_acquire+0x2b70/0x626c
[ 262.202245] lock_acquire+0x454/0x778
[ 262.203132] __mutex_lock+0x190/0x8b4
[ 262.204023] mutex_lock_nested+0x24/0x30
[ 262.205100] vgic_mmio_write_v3_misc+0x5c/0x2a0
[ 262.206178] dispatch_mmio_write+0xd8/0x258
[ 262.207498] __kvm_io_bus_write+0x1e0/0x350
[ 262.208582] kvm_io_bus_write+0xe0/0x1cc
[ 262.209653] io_mem_abort+0x2ac/0x6d8
[ 262.210569] kvm_handle_guest_abort+0x9b8/0x1f88
[ 262.211937] handle_exit+0xc4/0x39c
[ 262.212971] kvm_arch_vcpu_ioctl_run+0x90c/0x1c04
[ 262.214154] kvm_vcpu_ioctl+0x450/0x12f8
[ 262.215233] __arm64_sys_ioctl+0x7ac/0x1ba8
[ 262.216402] invoke_syscall.constprop.0+0x70/0x1e0
[ 262.217774] do_el0_svc+0xe4/0x2d4
[ 262.218758] el0_svc+0x44/0x8c
[ 262.219941] el0t_64_sync_handler+0xf4/0x120
[ 262.221110] el0t_64_sync+0x190/0x194
Note that the current report, which can be triggered by the vgic_irq
kselftest, is a triple chain that includes slots_lock, but after
inverting the slots_lock/config_lock dependency, the actual problem
reported above remains.
In several places, the vgic code calls kvm_io_bus_register_dev(), which
synchronizes the srcu, while holding config_lock (#1). And the MMIO
handler takes the config_lock while holding the srcu read lock (#0).
Break dependency #1, by registering the distributor and redistributors
without holding config_lock. The ITS also uses kvm_io_bus_register_dev()
but already relies on slots_lock to serialize calls.
The distributor iodev is created on the first KVM_RUN call. Multiple
threads will race for vgic initialization, and only the first one will
see !vgic_ready() under the lock. To serialize those threads, rely on
slots_lock rather than config_lock.
Redistributors are created earlier, through KVM_DEV_ARM_VGIC_GRP_ADDR
ioctls and vCPU creation. Similarly, serialize the iodev creation with
slots_lock, and the rest with config_lock.
Fixes: f003277311 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-2-jean-philippe@linaro.org
(cherry picked from commit 59112e9c39)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Ib3b4846646f148af95746d786fc55b589b3217b6
commit f003277311 ("KVM: arm64: Use config_lock to protect vgic
state") was meant to rectify a longstanding lock ordering issue in KVM
where the kvm->lock is taken while holding vcpu->mutex. As it so
happens, the aforementioned commit introduced yet another locking issue
by acquiring the its_lock before acquiring the config lock.
This is obviously wrong, especially considering that the lock ordering
is well documented in vgic.c. Reshuffle the locks once more to take the
config_lock before the its_lock. While at it, sprinkle in the lockdep
hinting that has become popular as of late to keep lockdep apprised of
our ordering.
Cc: stable@vger.kernel.org
Fixes: f003277311 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230412062733.988229-1-oliver.upton@linux.dev
(cherry picked from commit 49e5d16b6f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: If3a7d338bbcc490a7545ace0a8c039bb5e1dcbf0
kvm->lock must be taken outside of the vcpu->mutex. Of course, the
locking documentation for KVM makes this abundantly clear. Nonetheless,
the locking order in KVM/arm64 has been wrong for quite a while; we
acquire the kvm->lock while holding the vcpu->mutex all over the shop.
All was seemingly fine until commit 42a90008f8 ("KVM: Ensure lockdep
knows about kvm->lock vs. vcpu->mutex ordering rule") caught us with our
pants down, leading to lockdep barfing:
======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc7+ #19 Not tainted
------------------------------------------------------
qemu-system-aar/859 is trying to acquire lock:
ffff5aa69269eba0 (&host_kvm->lock){+.+.}-{3:3}, at: kvm_reset_vcpu+0x34/0x274
but task is already holding lock:
ffff5aa68768c0b8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8c/0xba0
which lock already depends on the new lock.
Add a dedicated lock to serialize writes to VM-scoped configuration from
the context of a vCPU. Protect the register width flags with the new
lock, thus avoiding the need to grab the kvm->lock while holding
vcpu->mutex in kvm_reset_vcpu().
Cc: stable@vger.kernel.org
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/kvmarm/f6452cdd-65ff-34b8-bab0-5c06416da5f6@arm.com/
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-3-oliver.upton@linux.dev
(cherry picked from commit c43120afb5)
[willdeacon@: Fix context conflict with pKVM VM type check]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I26d65f63a5e56399ffc4d1f74f62e0c15b37eea1
KVM/arm64 had the lock ordering backwards on vcpu->mutex and kvm->lock
from the very beginning. One such example is the way vCPU resets are
handled: the kvm->lock is acquired while handling a guest CPU_ON PSCI
call.
Add a dedicated lock to serialize writes to kvm_vcpu_arch::{mp_state,
reset_state}. Promote all accessors of mp_state to {READ,WRITE}_ONCE()
as readers do not acquire the mp_state_lock. While at it, plug yet
another race by taking the mp_state_lock in the KVM_SET_MP_STATE ioctl
handler.
As changes to MP state are now guarded with a dedicated lock, drop the
kvm->lock acquisition from the PSCI CPU_ON path. Similarly, move the
reader of reset_state outside of the kvm->lock and instead protect it
with the mp_state_lock. Note that writes to reset_state::reset have been
demoted to regular stores as both readers and writers acquire the
mp_state_lock.
While the kvm->lock inversion still exists in kvm_reset_vcpu(), at least
now PSCI CPU_ON no longer depends on it for serializing vCPU reset.
Cc: stable@vger.kernel.org
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-2-oliver.upton@linux.dev
(cherry picked from commit 0acc7239c2)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Iaec5533c5d73195eb5006262e4dcd84454cf5ebe
There are various bits of VM-scoped data that can only be configured
before the first call to KVM_RUN, such as the hypercall bitmaps and
the PMU. As these fields are protected by the kvm->lock and accessed
while holding vcpu->mutex, this is yet another example of lock
inversion.
Change out the kvm->lock for kvm->arch.config_lock in all of these
instances. Opportunistically simplify the locking mechanics of the
PMU configuration by holding the config_lock for the entirety of
kvm_arm_pmu_v3_set_attr().
Note that this also addresses a couple of bugs. There is an unguarded
read of the PMU version in KVM_ARM_VCPU_PMU_V3_FILTER which could race
with KVM_ARM_VCPU_PMU_V3_SET_PMU. Additionally, until now writes to the
per-vCPU vPMU irq were not serialized VM-wide, meaning concurrent calls
to KVM_ARM_VCPU_PMU_V3_IRQ could lead to a false positive in
pmu_irq_is_valid().
Cc: stable@vger.kernel.org
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-4-oliver.upton@linux.dev
(cherry picked from commit 4bba7f7def)
[willdeacon@: Fixed context conflict with moved pkvm trap init]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Ibafb1b975b48c854ab981c93f74de1ab582c314d
Almost all of the vgic state is VM-scoped but accessed from the context
of a vCPU. These accesses were serialized on the kvm->lock which cannot
be nested within a vcpu->mutex critical section.
Move over the vgic state to using the config_lock. Tweak the lock
ordering where necessary to ensure that the config_lock is acquired
after the vcpu->mutex. Acquire the config_lock in kvm_vgic_create() to
avoid a race between the converted flows and GIC creation. Where
necessary, continue to acquire kvm->lock to avoid a race with vCPU
creation (i.e. flows that use lock_all_vcpus()).
Finally, promote the locking expectations in comments to lockdep
assertions and update the locking documentation for the config_lock as
well as vcpu->mutex.
Cc: stable@vger.kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-5-oliver.upton@linux.dev
(cherry picked from commit f003277311)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I20403cc5b0ba6baff6ca3dd3e8db6f337602821e
Currently, the unknown no-running-vcpu sites are reported when a
dirty page is tracked by mark_page_dirty_in_slot(). Until now, the
only known no-running-vcpu site is saving vgic/its tables through
KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on KVM device
"kvm-arm-vgic-its". Unfortunately, there are more unknown sites to
be handled and no-running-vcpu context will be allowed in these
sites: (1) KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} command
on KVM device "kvm-arm-vgic-its" to restore vgic/its tables. The
vgic3 LPI pending status could be restored. (2) Save vgic3 pending
table through KVM_DEV_ARM_{VGIC_GRP_CTRL, VGIC_SAVE_PENDING_TABLES}
command on KVM device "kvm-arm-vgic-v3".
In order to handle those unknown cases, we need a unified helper
vgic_write_guest_lock(). struct vgic_dist::save_its_tables_in_progress
is also renamed to struct vgic_dist::save_tables_in_progress.
No functional change intended.
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230126235451.469087-3-gshan@redhat.com
(cherry picked from commit a23eaf9368)
[willdeacon@: Drop missing dirty-ring hunks]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Ie0dbb02e4f0f360b7554030e67c80d20ac8c1ca3
struct xhci_vendor_ops can be change when bug or new features.
So, Add padding to struct xhci_vendor_opsin order to be able to handle
any future problems easier.
Bug: 156315379
Change-Id: I62fe5edeee9f5bcfe7834a82f3e35d11a54cf52f
Signed-off-by: JaeHun Jung <jh0801.jung@samsung.com>
This reverts commit bb732365f7.
Reason for revert: breaks when enabled via config
when building the target `//common-modules/virtual-device:virtual_device_arm_dist`
```
In file included from common/drivers/android/android_debug_symbols.c:12:
common/arch/arm/include/asm/stacktrace.h:41:21: error: call to undeclared function 'in_entry_text'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
frame->ex_frame = in_entry_text(frame->pc);
^
In file included from common/drivers/android/android_debug_symbols.c:13:
common/arch/arm/include/asm/sections.h:14:20: error: static declaration of 'in_entry_text' follows non-static declaration
static inline bool in_entry_text(unsigned long addr)
^
common/arch/arm/include/asm/stacktrace.h:41:21: note: previous implicit declaration is here
frame->ex_frame = in_entry_text(frame->pc);
^
```
Change-Id: Id31003d4c9c60758f6038a63d40ffd7f8044cc9f
Signed-off-by: Matthias Maennich <maennich@google.com>
Introduce new API to expose symbols useful for debugging the GKI kernel.
Symbols exported from this driver would be difficult to maintain via the
traditional EXPORT_SYMBOL_GPL.
Bug: 199236943
Signed-off-by: Elliot Berman <eberman@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Yogesh Lal <ylal@codeaurora.org>
Bug: 287890135
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
[ delete some unused symbols and add _text/_end ]
Change-Id: I1cadb409289ca9ce36b0084efc9ac46f6bec6741
By default, thermal power throttle is always enable, but sometimes it
need to be disabled for a period of time, so add it to meet platform
thermal requirement.
Bug: 209386157
Signed-off-by: Jeson Gao <jeson.gao@unisoc.com>
Signed-off-by: Di Shen <di.shen@unisoc.com>
Change-Id: If9c53a9669eec8e2821d837cfa3c660a9cfbf934
(cherry picked from commit 64999249d5)
To implement the devfreq cooling device registration by
energy model, it should add devfreq_cooling_em_register
to symbol list.
1 function symbol(s) added
'struct thermal_cooling_device* devfreq_cooling_em_register(struct devfreq*, struct devfreq_cooling_power*)'
Bug: 288934529
Signed-off-by: Di Shen <di.shen@unisoc.com>
Change-Id: I168a5bf1130edd7e53f107deb5c606fc98a95953
[ Upstream commit 04c55383fa ]
In the event of a failure in tcf_change_indev(), u32_set_parms() will
immediately return without decrementing the recently incremented
reference counter. If this happens enough times, the counter will
rollover and the reference freed, leading to a double free which can be
used to do 'bad things'.
In order to prevent this, move the point of possible failure above the
point where the reference counter is incremented. Also save any
meaningful return values to be applied to the return data at the
appropriate point in time.
This issue was caught with KASAN.
Bug: 273251569
Fixes: 705c709126 ("net: sched: cls_u32: no need to call tcf_exts_change for newly allocated struct")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 07f9cc229b)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I95524bfda9a08a40b3d54515e528419dba18dc55
* CONFIG_WATCHDOG is disabled when compiling with
--kgdb option, hence the list of modules produced is
adjusted conditionally.
Bug: 270320056
Change-Id: I0eafb118836e6a31dc3b0392ab7d60b5597b9367
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
During XHCI resume, if there was a host controller error detected the
routine will attempt to re-initialize the XHCI HC, so that it can return
back to an operational state. If the XHCI host controller is being
removed, this sequence would be already handled within the XHCI halt path,
leading to a duplicate set of reg ops/calls. In addition, since the XHCI
bus is being removed, the overhead added in restarting the HCD is
unnecessary. Check for the XHC state before setting the reinit_xhc
parameter, which is responsible for triggering the restart.
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Message-ID: <20230531222719.14143-2-quic_wcheng@quicinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 285037166
(cherry picked from commit fb2ce17874https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing)
Change-Id: Iaaf20e855930b67b356e34286991411f74af2d60
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
When the kernel is built inside a sandbox container,
a forest of symlinks to the source files may be
created in the container. In this case, the generated
kheaders.tar.xz should follow these symlinks
to access the source files, instead of packing
the symlinks themselves.
Test: manual (add kheaders_data.tar.xz to the output,
then examine the contents)
Bug: 276339429
Fixes: b0acbba3f489 ("Revert "Revert "Revert "FROMLIST: kheaders: Follow symlinks to source files."""")
Link: https://lore.kernel.org/lkml/20230420010029.2702543-1-elsk@google.com/
Signed-off-by: Yifan Hong <elsk@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:28fa7afc424f3dc53358c0e9b080433d78f0cd54)
Merged-In: Ie4db22dfa13d05fdccb3ad8f4fae2fe3fead994e
Change-Id: Ie4db22dfa13d05fdccb3ad8f4fae2fe3fead994e
Generally DAMP is a best practice in Bazel, for this
specific case, it helps with:
* Better target discoverability and auto-completion.
* It's possible to use `select` for KGDB fixes later on
without encountering name expectations broken.
Bug: 256196368
Bug: 270320056
Change-Id: I300404a9b2b4b7c6569145a942ecb445d23e8e9a
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
* CONFIG_WATCHDOG is disabled when compiling with
--kgdb option, hence the list of modules produced is
adjusted conditionally on its value.
Bug: 270320056
Change-Id: I4db55fdf6b91a65209d2e0ae3bbb5f384c7eca22
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
The existing fuse-bpf freeing logic would free the fuse_file struct
immediately. However, this would break readahead. Move freeing logic
to the same place as done in classic fuse.
Bug: 286287652
Test: fuse_test passes, android boots, cts tests run
Change-Id: If13519f0e956a8da0dc98e7ac4aed2036070e969
Signed-off-by: Paul Lawrence <paullawrence@google.com>
By putting and nulling fuse_inode's bpf field in fuse_evict_inode, we
left a race condition - this inode can still be active. Do not put the
bpf program until we are doing the final free in fuse_free_inode. This
was the root cause of the reported bug.
The backing inode cannot be put in fuse_free_inode, since put_inode can
sleep and this is called from an RCU handler. But the backing inode
cannot be freed until an RCU interval, so move the put_inode to the same
location as in overlayfs, which is destroy_inode.
Remove a path in fuse_handle_bpf_prog whereby bpf can be nulled out.
When we want to be able to null/change the bpf_prog in the future, we
will have to use a mutex or maybe RCU to protect existing users. But
until this time, ban this path.
Bug: 284450048
Test: fuse_test passes, Pixel 6 passes basic tests
Change-Id: Ie6844242f279a5b202eb021eac5a2dd3d08bf09d
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Current usecases require more than 16 CMA areas. Hence increase the
number of CMA areas to 32.
Bug: 287582821
Change-Id: I50439ee2a3e16d62fdf6c77b99f4779f3af430d6
Signed-off-by: Pratyush Brahma <quic_pbrahma@quicinc.com>
Signed-off-by: Jaskaran Singh <quic_jasksing@quicinc.com>
They are controlled by kernel_images.modules_list, which is
set by define_common_kernels already.
The flags in build.configs has no effect.
Test: TH
Bug: 287697703
Signed-off-by: Yifan Hong <elsk@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:9bf4e4620ecc801c7eb824210595d9777b4a2ff8)
Merged-In: I1e322529476b4db67a1574393819900bdbd41311
Change-Id: I1e322529476b4db67a1574393819900bdbd41311
Commit "ANDROID: HID; Over-ride default maximum buffer size when using
UHID" provided a means for the UHID driver to offer an alternative
(smaller) report buffer size when dealing with user-space. The method
used was an Android-only solution designed to prevent the KMI ABI from
being broken (nb: the upstream solution was cleaner, but broke the ABI).
Since this solution involved consuming resources exported by a
subordinate driver, that driver would have to be enabled for the export
to take place. Since all of our default configs enable UHID, an issue
was not detected. However, for more specific kernel configs, where HID
is enabled, but UHID is not, this leads to compile-time undefined symbol
errors:
ld.lld: error: undefined symbol: uhid_hid_driver
This patch relies on the compiler to leave out unutilised sections of
the code if the associated resources are not available.
Bug: 260007429
Reported-by: Paul Lawrence <paullawrence@google.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I80b1aa7454c89d5c5e21f0268252ffb666efab97
[ Upstream commit 6326442278 ]
In r592_probe, dev->detect_timer was bound with r592_detect_timer.
In r592_irq function, the timer function will be invoked by mod_timer.
If we remove the module which will call hantro_release to make cleanup,
there may be a unfinished work. The possible sequence is as follows,
which will cause a typical UAF bug.
Fix it by canceling the work before cleanup in r592_remove.
CPU0 CPU1
|r592_detect_timer
r592_remove |
memstick_free_host|
put_device; |
kfree(host); |
|
| queue_work
| &host->media_checker //use
Bug: 287729043
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Link: https://lore.kernel.org/r/20230307164338.1246287-1-zyytlz.wz@163.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 9a342d4eb9)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Idb15f593287ebaeec294b3e276126306fa6743ba
commit 22ed903eee upstream.
syzbot detected a crash during log recovery:
XFS (loop0): Mounting V5 Filesystem bfdc47fc-10d8-4eed-a562-11a831b3f791
XFS (loop0): Torn write (CRC failure) detected at log block 0x180. Truncating head block from 0x200.
XFS (loop0): Starting recovery (logdev: internal)
==================================================================
BUG: KASAN: slab-out-of-bounds in xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813
Read of size 8 at addr ffff88807e89f258 by task syz-executor132/5074
CPU: 0 PID: 5074 Comm: syz-executor132 Not tainted 6.2.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106
print_address_description+0x74/0x340 mm/kasan/report.c:306
print_report+0x107/0x1f0 mm/kasan/report.c:417
kasan_report+0xcd/0x100 mm/kasan/report.c:517
xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813
xfs_btree_lookup+0x346/0x12c0 fs/xfs/libxfs/xfs_btree.c:1913
xfs_btree_simple_query_range+0xde/0x6a0 fs/xfs/libxfs/xfs_btree.c:4713
xfs_btree_query_range+0x2db/0x380 fs/xfs/libxfs/xfs_btree.c:4953
xfs_refcount_recover_cow_leftovers+0x2d1/0xa60 fs/xfs/libxfs/xfs_refcount.c:1946
xfs_reflink_recover_cow+0xab/0x1b0 fs/xfs/xfs_reflink.c:930
xlog_recover_finish+0x824/0x920 fs/xfs/xfs_log_recover.c:3493
xfs_log_mount_finish+0x1ec/0x3d0 fs/xfs/xfs_log.c:829
xfs_mountfs+0x146a/0x1ef0 fs/xfs/xfs_mount.c:933
xfs_fs_fill_super+0xf95/0x11f0 fs/xfs/xfs_super.c:1666
get_tree_bdev+0x400/0x620 fs/super.c:1282
vfs_get_tree+0x88/0x270 fs/super.c:1489
do_new_mount+0x289/0xad0 fs/namespace.c:3145
do_mount fs/namespace.c:3488 [inline]
__do_sys_mount fs/namespace.c:3697 [inline]
__se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f89fa3f4aca
Code: 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fffd5fb5ef8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00646975756f6e2c RCX: 00007f89fa3f4aca
RDX: 0000000020000100 RSI: 0000000020009640 RDI: 00007fffd5fb5f10
RBP: 00007fffd5fb5f10 R08: 00007fffd5fb5f50 R09: 000000000000970d
R10: 0000000000200800 R11: 0000000000000206 R12: 0000000000000004
R13: 0000555556c6b2c0 R14: 0000000000200800 R15: 00007fffd5fb5f50
</TASK>
The fuzzed image contains an AGF with an obviously garbage
agf_refcount_level value of 32, and a dirty log with a buffer log item
for that AGF. The ondisk AGF has a higher LSN than the recovered log
item. xlog_recover_buf_commit_pass2 reads the buffer, compares the
LSNs, and decides to skip replay because the ondisk buffer appears to be
newer.
Unfortunately, the ondisk buffer is corrupt, but recovery just read the
buffer with no buffer ops specified:
error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno,
buf_f->blf_len, buf_flags, &bp, NULL);
Skipping the buffer leaves its contents in memory unverified. This sets
us up for a kernel crash because xfs_refcount_recover_cow_leftovers
reads the buffer (which is still around in XBF_DONE state, so no read
verification) and creates a refcountbt cursor of height 32. This is
impossible so we run off the end of the cursor object and crash.
Fix this by invoking the verifier on all skipped buffers and aborting
log recovery if the ondisk buffer is corrupt. It might be smarter to
force replay the log item atop the buffer and then see if it'll pass the
write verifier (like ext4 does) but for now let's go with the
conservative option where we stop immediately.
Bug: 284409747
Link: https://syzkaller.appspot.com/bug?extid=7e9494b8b399902e994e
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reported-by: Danila Chernetsov <listdansp@mail.ru>
Link: https://lore.kernel.org/linux-xfs/20230601164439.15404-1-listdansp@mail.ru
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a2961463d7)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ie5e156221966323a9cb7cc261b4ed17593cfaabd
commit 25c150ac10 upstream.
Previously, capability was checked using capable(), which verified that the
caller of the ioctl system call had the required capability. In addition,
the result of the check would be stored in the HCI_SOCK_TRUSTED flag,
making it persistent for the socket.
However, malicious programs can abuse this approach by deliberately sharing
an HCI socket with a privileged task. The HCI socket will be marked as
trusted when the privileged task occasionally makes an ioctl call.
This problem can be solved by using sk_capable() to check capability, which
ensures that not only the current task but also the socket opener has the
specified capability, thus reducing the risk of privilege escalation
through the previously identified vulnerability.
Bug: 286456284
Cc: stable@vger.kernel.org
Fixes: f81f5b2db8 ("Bluetooth: Send control open and close messages for HCI raw sockets")
Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 47e6893a5b)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I9a4b20c7b1e9b4e6bbd6371264aec039770a52ff
mas_rebalance() is called to rebalance an insufficient node into a
single node or two sufficient nodes. The preallocation estimate is
always too many in this case as the height of the tree will never grow
and there is no possibility to have a three way split in this case, so
revise the node allocation count.
Change-Id: I04ba0674da381c06d4f8077f9f59d64b7d1a8312
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Link: https://lore.kernel.org/all/20230612203953.2093911-9-Liam.Howlett@oracle.com/
Bug: 274059236
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
mas_prealloc() may walk partially down the tree before finding that a
split or spanning store is needed. When the write occurs, relax the
logic on resetting the walk so that partial walks will not restart, but
walks that have gone too far (a store that affects beyond the current
node) should be restarted.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Link: https://lore.kernel.org/all/20230612203953.2093911-16-Liam.Howlett@oracle.com/
Bug: 274059236
Change-Id: I87dedebae085f067b08caeaf1bd19bb343ff305f
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Calculate the number of nodes based on the pending write action instead
of assuming the worst case.
This addresses a performance regression introduced in platforms that
have longer allocation timing.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Link: https://lore.kernel.org/all/20230612203953.2093911-15-Liam.Howlett@oracle.com/
[surenb: replace mas_wr_new_end with mas_wr_node_size]
Bug: 274059236
Change-Id: I8fc22bca45fa005acf767722034a260242a4da52
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
This is needed to get the module on the system_dlkm image.
Bug: 276339429
Change-Id: Ib8c19d0d23f27bc3872e8d387b20cef07327c600
Signed-off-by: Will McVicker <willmcvicker@google.com>