The mmap read lock is used during the shrinker's callback, which means
that using the alloc->vma pointer isn't safe, as it can race with munmap().
As of commit dd2283f260 ("mm: mmap: zap pages with read mmap_sem in
munmap") the mmap lock is downgraded after the vma has been isolated.
I was able to reproduce this issue by manually adding some delays and
triggering page reclaiming through the shrinker's debug sysfs. The
following KASAN report confirms the UAF:
==================================================================
BUG: KASAN: slab-use-after-free in zap_page_range_single+0x470/0x4b8
Read of size 8 at addr ffff356ed50e50f0 by task bash/478
CPU: 1 PID: 478 Comm: bash Not tainted 6.6.0-rc5-00055-g1c8b86a3799f-dirty #70
Hardware name: linux,dummy-virt (DT)
Call trace:
zap_page_range_single+0x470/0x4b8
binder_alloc_free_page+0x608/0xadc
__list_lru_walk_one+0x130/0x3b0
list_lru_walk_node+0xc4/0x22c
binder_shrink_scan+0x108/0x1dc
shrinker_debugfs_scan_write+0x2b4/0x500
full_proxy_write+0xd4/0x140
vfs_write+0x1ac/0x758
ksys_write+0xf0/0x1dc
__arm64_sys_write+0x6c/0x9c
Allocated by task 492:
kmem_cache_alloc+0x130/0x368
vm_area_alloc+0x2c/0x190
mmap_region+0x258/0x18bc
do_mmap+0x694/0xa60
vm_mmap_pgoff+0x170/0x29c
ksys_mmap_pgoff+0x290/0x3a0
__arm64_sys_mmap+0xcc/0x144
Freed by task 491:
kmem_cache_free+0x17c/0x3c8
vm_area_free_rcu_cb+0x74/0x98
rcu_core+0xa38/0x26d4
rcu_core_si+0x10/0x1c
__do_softirq+0x2fc/0xd24
Last potentially related work creation:
__call_rcu_common.constprop.0+0x6c/0xba0
call_rcu+0x10/0x1c
vm_area_free+0x18/0x24
remove_vma+0xe4/0x118
do_vmi_align_munmap.isra.0+0x718/0xb5c
do_vmi_munmap+0xdc/0x1fc
__vm_munmap+0x10c/0x278
__arm64_sys_munmap+0x58/0x7c
Fix this issue by performing a vma_lookup() instead, which will fail to
find a vma that was isolated before the mmap lock downgrade. Note that
this option performs better than upgrading to a mmap write lock, which
would increase contention. Besides, mmap_write_trylock() has recently
been removed anyway.
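The difference between a cached pointer and a fresh lookup can be
illustrated with a small userspace model. This is only a sketch, not the
driver code: AddressSpace, isolate() and lookup() are hypothetical
stand-ins for the mm's vma tree, the detach step in munmap(), and
vma_lookup().

```python
class Vma:
    def __init__(self, start, end):
        self.start, self.end = start, end
        self.freed = False  # stands in for the object going back to the slab


class AddressSpace:
    """Toy model of an mm: a list of vmas searched by address."""

    def __init__(self):
        self.vmas = []

    def insert(self, vma):
        self.vmas.append(vma)

    def isolate(self, vma):
        # munmap() detaches the vma under the write lock and frees it
        # after the downgrade; the model collapses both steps into one.
        self.vmas.remove(vma)
        vma.freed = True

    def lookup(self, addr):
        # Analogue of vma_lookup(): only vmas still in the tree are found.
        for vma in self.vmas:
            if vma.start <= addr < vma.end:
                return vma
        return None


mm = AddressSpace()
vma = Vma(0x1000, 0x5000)
mm.insert(vma)
cached = vma                    # analogue of the cached alloc->vma pointer

mm.isolate(vma)                 # munmap() wins the race

stale_is_freed = cached.freed   # the cached pointer now refers to freed memory
found = mm.lookup(0x2000)       # the lookup finds nothing, so reclaim can bail out
```

In the model the stale pointer still "works" because Python keeps the
object alive; in the kernel the same access is the use-after-free shown
in the KASAN report.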
Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Cc: stable@vger.kernel.org
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-3-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 304651042
(cherry picked from commit 3f489c2067c5824528212b0fc18b28d51332d906
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
char-misc-next)
Change-Id: I206096ab47666eaee1651a4e102a01e6b7b4e5fb
Signed-off-by: Carlos Llamas <cmllamas@google.com>
commit 4b7de801606e504e69689df71475d27e35336fb3 upstream.
Lee pointed out an issue found by syzkaller [0] that hits a BUG in the prog
array map poke update in the prog_array_map_poke_run function, due to an error
value returned from the bpf_arch_text_poke function.
There's a race window where bpf_arch_text_poke can fail due to missing
bpf program kallsym symbols, which is accounted for by the check for
-EINVAL in that BUG_ON call.
The problem is that in such a case we won't update the tail call jump,
which causes an imbalance for the next tail call update check, which will
fail with -EBUSY in bpf_arch_text_poke.
I'm hitting the following race during the program load:
  CPU 0                             CPU 1
  bpf_prog_load
    bpf_check
      do_misc_fixups
        prog_array_map_poke_track
                                    map_update_elem
                                      bpf_fd_array_map_update_elem
                                        prog_array_map_poke_run
                                          bpf_arch_text_poke returns -EINVAL
  bpf_prog_kallsyms_add
After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
poke update fails the expected jump instruction check in bpf_arch_text_poke
with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
A similar race exists on the program unload.
Fix this by moving the update to the bpf_arch_poke_desc_update function, which
makes sure we call __bpf_arch_text_poke, which skips the bpf address check.
Each architecture has a slightly different approach to looking up the bpf
address in bpf_arch_text_poke, so instead of splitting the function or adding
a new 'checkip' argument as in the previous version, it seems best to move the
whole map_poke_run update into arch-specific code.
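The imbalance described above can be reproduced with a toy model of a
patchable jump site. This is a sketch under simplifying assumptions, not
the JIT code: Text, text_poke() and the check_addr flag are hypothetical
names that only mimic the two checks (bpf address lookup and expected
instruction) mentioned in this message.

```python
import errno

NOP = "nop"


class Text:
    """Toy model of one patchable tail-call site."""

    def __init__(self):
        self.insn = NOP
        self.in_kallsyms = False  # becomes True once the prog is registered


def text_poke(text, old, new, check_addr=True):
    # Address check: fails while the program is not yet visible
    # as a bpf address.
    if check_addr and not text.in_kallsyms:
        return -errno.EINVAL
    # Expected-instruction check: the site must currently hold 'old'.
    if text.insn != old:
        return -errno.EBUSY
    text.insn = new
    return 0


# Racy flow: the first update is skipped, so the site still holds a nop
# while the map believes "jmp A" was installed.
site = Text()
first = text_poke(site, NOP, "jmp A")        # -EINVAL, update lost
site.in_kallsyms = True
second = text_poke(site, "jmp A", "jmp B")   # -EBUSY, would hit the BUG_ON

# Flow with the address check skipped: both updates apply.
fixed = Text()
ok1 = text_poke(fixed, NOP, "jmp A", check_addr=False)
fixed.in_kallsyms = True
ok2 = text_poke(fixed, "jmp A", "jmp B", check_addr=False)
```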
[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
Bug: 309551558
Fixes: ebf7d1f508 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Cc: Lee Jones <lee@kernel.org>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 13578b4ea4)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I1291f0589e84f627ee44d07acb24196fab166c29
Under certain circumstances a SoC can reach a critical temperature limit
and is unable to stabilize the temperature around a temperature control.
The system may ask for a specific power budget but because of the OPP
density, we can only choose an OPP with a power budget lower than the
requested one and under-utilize the CPU, thus losing performance. In
other words, one OPP under-utilizes the CPU with a power less than the
requested power budget and the next OPP exceeds the power budget. The
cpu idle cooling can solve this problem.
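A rough illustration of the gap between two OPPs, and of how injecting
idle time can close it, is sketched below. The OPP power values, the
budget and the helper names are made up for the example, and idle power
is assumed to be zero unless stated otherwise.

```python
def pick_opp(opps_mw, budget_mw):
    """Highest OPP power that does not exceed the budget, or None."""
    fitting = [p for p in opps_mw if p <= budget_mw]
    return max(fitting) if fitting else None


def idle_ratio(opp_mw, budget_mw, idle_mw=0):
    """Fraction of time to idle so the average power at opp_mw meets the budget."""
    if opp_mw <= budget_mw:
        return 0.0
    return (opp_mw - budget_mw) / (opp_mw - idle_mw)


opps = [500, 900, 1500]    # hypothetical OPP power values in mW
budget = 1200              # hypothetical requested power budget in mW

lower = pick_opp(opps, budget)          # the OPP that under-utilizes the CPU
ratio = idle_ratio(1500, budget)        # idle share needed at the next OPP
average = 1500 * (1 - ratio)            # average power with idle injection
```

With these numbers the lower OPP leaves 300 mW of the budget unused,
while running the next OPP with about 20% idle time averages out to the
requested budget.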
Bug: 299411923
Signed-off-by: Aran Dalton <arda@allwinnertech.com>
Change-Id: I1c17b340617e88be075097dc47f30ce94be2a4d7
As we reserve only 1GB of memory for the MMIO region, don't prepopulate
the entire remaining address space with MMIO, as this is prone to failure.
Instead, let the MMIO regions be created lazily on the fault path and
keep only the RAM regions prepopulated.
Bug: 307805059
Test: Boot pKVM with CONFIG_ARM64_16K_PAGES
Change-Id: I6327f42eb17c6588335a1e04736393c9032114ab
Signed-off-by: Sebastian Ene <sebastianene@google.com>
From the pKVM point of view, unknown SMCs are simply forwarded; we can't
judge whether they are invalid or not. This was probably a typo following
a copy of the host_hcall event.
Bug: 299430621
Change-Id: Ieb53f985a5187a8b5a9feb4a95982b15cdc1b04a
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
(cherry picked from commit 717d1f8f91)
The structures that define hyp events must be packed so they match
their format definitions in the tracefs file
hyp/events/hyp/<event>/format.
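The effect of packing on a record layout can be seen with Python's struct
module. The event below (a u8 followed by a u64) is hypothetical and is
not one of the real hyp events; it only shows how padding changes the
size that a format description would have to match.

```python
import struct

# "=" uses standard sizes with no padding, like a packed C struct.
packed_size = struct.calcsize("=BQ")

# "@" uses the native ABI's alignment, like an unpacked C struct.
# The result depends on the platform, so it is not hard-coded here.
natural_size = struct.calcsize("@BQ")

# A packed record holding the values 1 and 2.
record = struct.pack("=BQ", 1, 2)
```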
Bug: 299430621
Change-Id: Ia7e1a686744d5c9c3f8a21881f03228c8acecade
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
When this pKVM module op was introduced, its documentation was
omitted.
Bug: 308373293
Change-Id: I9e471414e72a1ee04c132de4ed95d77e815ae8c9
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
The KOBJ_CHANGE uevent is sent before gadget unbind is actually
executed, so an inaccurate uevent is emitted at the wrong time
(the uevent would have the USB_UDC_DRIVER variable set while it would
soon be removed).
Move the KOBJ_CHANGE uevent to the end of the unbind function so that
the uevent is sent only after the change has been made.
Fixes: 2ccea03a8f ("usb: gadget: introduce UDC Class")
Cc: stable@vger.kernel.org
Signed-off-by: Roy Luo <royluo@google.com>
Link: https://lore.kernel.org/r/20231128221756.2591158-1-royluo@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 312543856
Change-Id: Ida7fa7e1cfae3d1b3f3348512a67fe91065f25af
(cherry picked from commit 73ea73affe8622bdf292de898da869d441da6a9d)
[royluo: resolved conflicts in drivers/usb/gadget/udc/core.c]
Signed-off-by: Roy Luo <royluo@google.com>
We found an issue in the Android OTA scenario where many BIOs have to do
FEC even though the data under dm-verity is 100% complete and not corrupted.
Android OTA has many dm-block layers, from upper to lower:
dm-verity
dm-snapshot
dm-origin & dm-cow
dm-linear
ufs
DM tables have to change 2 times during the Android OTA merging process.
When doing a table change, dm-snapshot will be suspended for a while.
During this interval, many readahead IOs are submitted to dm_verity
from the filesystem. Then the kverity works are busy doing FEC, which
costs too much time to finish the dm-verity IO. This causes needless
delay which feels like the system is hung.
After adding debugging it was found that each readahead IO needed
around 10s to finish when this situation occurred. This is due to IO
amplification:
dm-snapshot suspend
erofs_readahead     // 300+ io is submitted
	dm_submit_bio (dm_verity)
		dm_submit_bio (dm_snapshot)
		bio return EIO
		bio got nothing, it's empty
	verity_end_io
	verity_verify_io
	forloop range(0, io->n_blocks)    // each io->n_blocks ~= 20
		verity_fec_decode
		fec_decode_rsb
		fec_read_bufs
		forloop range(0, v->fec->rsn) // v->fec->rsn = 253
			new_read
			submit_bio (dm_snapshot)
		end loop
	end loop
dm-snapshot resume
Readahead BIOs get nothing while dm-snapshot is suspended, so all of
them will trigger verity's FEC.
Each readahead BIO needs to verify ~20 (io->n_blocks) blocks.
Each block needs to do FEC, and every block needs to do 253
(v->fec->rsn) reads.
So during the suspend interval (~200ms), 300 readahead BIOs trigger
~1518000 (300*20*253) IOs to dm-snapshot.
As readahead IO is not required by userspace, the best fix for this
issue is to pass readahead errors to the upper layer to handle.
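The amplification estimate above is plain multiplication and can be
checked as follows. The variable names are only labels for the figures
quoted in this message.

```python
readahead_bios = 300     # BIOs submitted while dm-snapshot is suspended
blocks_per_bio = 20      # approximate io->n_blocks per readahead BIO
reads_per_block = 253    # v->fec->rsn reads issued per block during FEC

# Extra reads caused by a single failed readahead BIO.
per_bio = blocks_per_bio * reads_per_block

# Total reads sent to dm-snapshot during one suspend interval.
total = readahead_bios * per_bio
```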
Cc: stable@vger.kernel.org
Fixes: a739ff3f54 ("dm verity: add support for forward error correction")
Bug: 316972624
Link: https://lore.kernel.org/dm-devel/b84fb49-bf63-3442-8c99-d565e134f2@redhat.com
Signed-off-by: Wu Bo <bo.wu@vivo.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Akilesh Kailash <akailash@google.com>
(cherry picked from commit 0193e3966ceeeef69e235975918b287ab093082b)
Change-Id: I73560e5660cebdc1997e1f9926cbb8888789eb46
commit 317eb9685095678f2c9f5a8189de698c5354316a upstream.
Otherwise set elements can be deactivated twice which will cause a crash.
Bug: 316310313
Reported-by: Xingyuan Mo <hdthky0@gmail.com>
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 189c2a8293)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I27fb6ee806642e23ca02700763a387341dd463e6
Bug: 292925770
Test: fuse_test run. The following steps on Android also now pass:
Create /data/123 and /data/media/0/Android/data/45 directories
Mount /data/123 directory to /data/media/0/Android/data/45 directory
Create 1.txt under the /data/123 directory
File 1.txt should appear in /storage/emulated/0/Android/data/45
Signed-off-by: Paul Lawrence <paullawrence@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:9323938705b42cb4dd863d5cf8022ba8f2282952)
Merged-In: I1fe27d743ca2981e624a9aa87d9ab6deb313aadc
Change-Id: I1fe27d743ca2981e624a9aa87d9ab6deb313aadc
Nothing fancy here. Keeping full history is not required.
`git checkout mainline/master -- scripts/checkpatch.pl`
This may need to be done periodically.
Bug: 316492624
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I4c90b50197ca7277c59e96bf332ecf795c4f3d12
Block mappings can be split as part of a page table update. When
prefaulting entries during the split, it is pointless to install
valid ptes which will later be modified by the same walk.
At the same time, push the check for pte_is_counted into the
prefault handler, where it logically belongs.
Bug: 308373293
Change-Id: If4599b2860aa62d82ce8db019a8410c2d883de71
Signed-off-by: Keir Fraser <keirf@google.com>
This allows protection attributes to be changed for a range of
pages via a single module API call.
The original API call modifying a single page is now implemented
as a shim on top of the new range-based call.
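The relationship between the two calls can be sketched as below. This is
a userspace model only: the dict-based table, PAGE_SIZE and the function
names are assumptions for illustration and do not reproduce the
hypervisor implementation.

```python
PAGE_SIZE = 4096  # assumed page size for the model


def mod_prot_range(table, addr, prot, nr_pages):
    """Set prot on nr_pages pages starting at addr; returns 0 on success."""
    if addr % PAGE_SIZE or nr_pages <= 0:
        return -22  # -EINVAL
    for i in range(nr_pages):
        table[addr + i * PAGE_SIZE] = prot
    return 0


def mod_prot(table, addr, prot):
    """Single-page call kept as a shim over the range-based call."""
    return mod_prot_range(table, addr, prot, 1)


table = {}
rc_one = mod_prot(table, 0x1000, "RW")            # one page
rc_many = mod_prot_range(table, 0x2000, "RX", 3)  # three pages in one call
```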
The ABI STG is also fixed up:
type 'struct pkvm_module_ops' changed
member 'union { int(* host_stage2_mod_prot_range)(u64, enum kvm_pgtable_prot, u64); struct { u64 android_kabi_reserved1; }; union { }; }' was added
member 'u64 android_kabi_reserved1' was removed
Bug: 308373293
Change-Id: I6fbb2e0b325aa972148f48746565dcc10d74edaf
Signed-off-by: Keir Fraser <keirf@google.com>
Modules can only relax permissions to RWX. This seems rather arbitrary.
Instead, allow any valid permissions to be set, as long as the page is
a pristine host page, or already module owned.
Bug: 308373293
Change-Id: I905786fad6543f47a00bd9b9f07e17dd660d457c
Signed-off-by: Keir Fraser <keirf@google.com>
Merge the relaxation and restriction paths so that both only need to adjust
permissions. This avoids un-map + re-map on the restriction path, and
avoids installing an annotated entry on the relaxation path (which
would cause a translation fault on first access by the host).
Bug: 308373293
Change-Id: I9c7a6ac149aad64b19a5ce7808334188475b27cc
Signed-off-by: Keir Fraser <keirf@google.com>
When splitting a block mapping, we install a table entry pointing to an
empty page and recreate the new entries lazily as we fault them in. For
page-tables with the KVM_PGTABLE_S2_IDMAP flag, this can result in
unnecessary translation faults.
When splitting a block for a page-table with KVM_PGTABLE_S2_IDMAP set,
pre-populate the newly allocated page-table page with contiguous ptes
based on the attributes of the block.
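As a rough sketch of what pre-populating means here, the model below
splits one block into per-page entries that inherit the block's
attributes. The 2 MiB block size, 4 KiB page size, address and attribute
string are assumptions chosen for the example, not values taken from the
code.

```python
def split_block(block_pa, block_size, page_size, attrs):
    """Return per-page (address, attrs) entries covering the whole block."""
    return [(block_pa + off, attrs) for off in range(0, block_size, page_size)]


# Hypothetical 2 MiB block at 1 GiB, split into 4 KiB pages.
entries = split_block(0x4000_0000, 2 << 20, 4096, "RWX")

count = len(entries)    # number of ptes in the new page-table page
first = entries[0]      # first pte maps the start of the block
last = entries[-1]      # last pte maps the final page of the block
```

Because every entry is valid from the start, a later access anywhere in
the former block does not need to fault the mapping back in.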
Bug: 308373293
Change-Id: I0c53d048de913e193830caef93d75755270db709
Signed-off-by: Will Deacon <willdeacon@google.com>
Signed-off-by: Keir Fraser <keirf@google.com>
CONFIG_TASK_DELAY_ACCT cannot be enabled since `struct task_struct`
is KMI frozen. Instead, use vendor hooks to allow delay accounting
to be implemented in a vendor module.
Bug: 310129610
Bug: 314931189
Change-Id: If814d7834889fe162aba3dd97e935289127ca3ae
Signed-off-by: Dongyun Liu <dongyun.liu@transsion.com>
(cherry picked from commit bb57557246d39dba8a66df7f43983fe1ec71bff6)
A transaction complete work is allocated and queued for each
transaction. Under certain conditions the work->type might be marked as
BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT to notify userspace about
potential spamming threads or as BINDER_WORK_TRANSACTION_PENDING when
the target is currently frozen.
However, these work types are not being handled in binder_release_work()
so they will leak during a cleanup. This was reported by syzkaller with
the following kmemleak dump:
BUG: memory leak
unreferenced object 0xffff88810e2d6de0 (size 32):
comm "syz-executor338", pid 5046, jiffies 4294968230 (age 13.590s)
hex dump (first 32 bytes):
e0 6d 2d 0e 81 88 ff ff e0 6d 2d 0e 81 88 ff ff .m-......m-.....
04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff81573b75>] kmalloc_trace+0x25/0x90 mm/slab_common.c:1114
[<ffffffff83d41873>] kmalloc include/linux/slab.h:599 [inline]
[<ffffffff83d41873>] kzalloc include/linux/slab.h:720 [inline]
[<ffffffff83d41873>] binder_transaction+0x573/0x4050 drivers/android/binder.c:3152
[<ffffffff83d45a05>] binder_thread_write+0x6b5/0x1860 drivers/android/binder.c:4010
[<ffffffff83d486dc>] binder_ioctl_write_read drivers/android/binder.c:5066 [inline]
[<ffffffff83d486dc>] binder_ioctl+0x1b2c/0x3cf0 drivers/android/binder.c:5352
[<ffffffff816b25f2>] vfs_ioctl fs/ioctl.c:51 [inline]
[<ffffffff816b25f2>] __do_sys_ioctl fs/ioctl.c:871 [inline]
[<ffffffff816b25f2>] __se_sys_ioctl fs/ioctl.c:857 [inline]
[<ffffffff816b25f2>] __x64_sys_ioctl+0xf2/0x140 fs/ioctl.c:857
[<ffffffff84b30008>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff84b30008>] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
[<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fix the leaks by kfreeing these work types in binder_release_work() and
handle them as a BINDER_WORK_TRANSACTION_COMPLETE cleanup.
Cc: stable@vger.kernel.org
Fixes: 0567461a7a ("binder: return pending info for frozen async txns")
Fixes: a7dc1e6f99 ("binder: tell userspace to dump current backtrace when detected oneway spamming")
Reported-by: syzbot+7f10c1653e35933c0f1e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7f10c1653e35933c0f1e
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Change-Id: I8e1ee7af87ef5706544e4f320e9498b8f4855a6b
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20230922175138.230331-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1aa3aaf895)
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Kfence only needs its pool to be mapped at page granularity if it is
initialized early. The previous check was overly protective. In [1], Mark
suggested to "just map the KFENCE region a page granularity". So
decouple it from that check and do page granularity mapping for the kfence
pool only. Note that late init of the kfence pool still requires
page granularity mapping.
Page granularity mapping in theory costs more memory (2M per 1GB) on the
arm64 platform. For example, this is what I measured on QEMU (emulated 1GB
RAM) with gki_defconfig, also turning off rodata protection:
Before:
[root@liebao ]# cat /proc/meminfo
MemTotal: 999484 kB
After:
[root@liebao ]# cat /proc/meminfo
MemTotal: 1001480 kB
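The "2M per 1GB" figure and the measured difference can be cross-checked
with a short calculation. This sketch assumes 4 KiB pages and 8-byte
page-table entries, and counts only the last-level tables; those
parameters are assumptions, not values stated in this message.

```python
GIB = 1 << 30
PAGE_SIZE = 4096    # assumed 4 KiB pages
PTE_SIZE = 8        # assumed 8-byte page-table entries

# Last-level entries needed to map 1 GiB at page granularity.
ptes = GIB // PAGE_SIZE

# Memory consumed by those entries.
table_bytes = ptes * PTE_SIZE
table_mib = table_bytes / (1 << 20)

# Difference between the two MemTotal values quoted above, in kB.
measured_kb = 1001480 - 999484
```

The theoretical cost is 2 MiB per GiB, and the measured gain of 1996 kB
is in line with that.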
To implement this, also relocate the kfence pool allocation to before the
linear mapping is set up: arm64_kfence_alloc_pool allocates the physical
address, and __kfence_pool is set after the linear mapping is set up.
LINK: [1] https://lore.kernel.org/linux-arm-kernel/Y+IsdrvDNILA59UN@FVFF77S0Q05N/
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/1679066974-690-1-git-send-email-quic_zhenhuah@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
Bug: 284812202
Change-Id: I8e7c565d3f4d6349a028a6a060259d62cf5beee7
(cherry picked from commit bfa7965b33)
Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>
KFENCE requires linear map to be mapped at page granularity, so that it
is possible to protect/unprotect single pages, just like with
rodata_full and DEBUG_PAGEALLOC.
Instead of repeating
can_set_direct_map() || IS_ENABLED(CONFIG_KFENCE)
make can_set_direct_map() handle the KFENCE case.
This also prevents potential false positives in kernel_page_present()
that may return true for non-present page if CONFIG_KFENCE is enabled.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220921074841.382615-1-rppt@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit b9dd04a20f)
Bug: 284812202
Change-Id: Ie87b2184b5aac2c34a05dd9b832b937786e367ff
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
5.15.138 Dragonboard 845c because of recently added symbol,
rpmsg_register_device_override.
So add it to the symbol list and update the abi definitions.
Bug: 313495196
Change-Id: I3a3504b6d2061bfce0abe9801e2ecb210c337b9f
Signed-off-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
(cherry picked from commit 02ca2ae2af)
Signed-off-by: Lee Jones <joneslee@google.com>
In commit e70898ae1a ("rpmsg: Fix kfree() of static memory on setting
driver_override") a pointer was changed to const, which messes with the
CRC and ABI checks. As the code is fine if this is left as not-const,
just put it back to preserve the abi.
Bug: 161946584
Fixes: e70898ae1a ("rpmsg: Fix kfree() of static memory on setting driver_override")
Change-Id: I9a87b9cf412191d9872b48f1f876a81df6701de0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
(cherry picked from commit 4f2270e2bca1854ebe8be23a82f665eaa27ee831)
Signed-off-by: Lee Jones <joneslee@google.com>
In commit 389190b254 ("driver: platform: Add helper for safer setting
of driver_override"), a pointer was changed to const, which messes with
the CRC and ABI checks. As the code is fine if this is left as
not-const, just put it back to preserve the abi.
Bug: 161946584
Fixes: 389190b254 ("driver: platform: Add helper for safer setting of driver_override")
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ieb4a730a6a5767d31fbec2f1ba683617f5cda7a9
(cherry picked from commit 1202da82c0)
Signed-off-by: Lee Jones <joneslee@google.com>
commit bb17d110cb upstream.
driver_set_override() helper uses device_lock() so it should not be
called before rpmsg_register_device() (which calls device_register()).
Effect can be seen with CONFIG_DEBUG_MUTEXES:
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 3 PID: 57 at kernel/locking/mutex.c:582 __mutex_lock+0x1ec/0x430
...
Call trace:
__mutex_lock+0x1ec/0x430
mutex_lock_nested+0x44/0x50
driver_set_override+0x124/0x150
qcom_glink_native_probe+0x30c/0x3b0
glink_rpm_probe+0x274/0x350
platform_probe+0x6c/0xe0
really_probe+0x17c/0x3d0
__driver_probe_device+0x114/0x190
driver_probe_device+0x3c/0xf0
...
Refactor the rpmsg_register_device() function to use two-step device
registering (initialization + add) and call driver_set_override() at
the proper moment.
This moves the code around, so while at it also NULL-ify the
rpdev->driver_override in the error path to be sure it won't be kfree()'d
a second time.
Bug: 295334746
Fixes: 42cd402b8f ("rpmsg: Fix kfree() of static memory on setting driver_override")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20220429195946.1061725-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bfd4a664dd)
[Lee: Git was confused that the hunk being removed had changed]
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ic07d9ff669e88a50354ad8e978ae8e93316a2a5e
commit 6c2f421174 upstream.
Several core drivers and buses expect that driver_override is
dynamically allocated memory, so that they can kfree() it later.
However, this assumption is not documented, and there have been, and
still are, users setting it to a string literal. This leads to
kfree() of static memory during device release (e.g. in error paths or
during unbind):
kernel BUG at ../mm/slub.c:3960!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
...
(kfree) from [<c058da50>] (platform_device_release+0x88/0xb4)
(platform_device_release) from [<c0585be0>] (device_release+0x2c/0x90)
(device_release) from [<c0a69050>] (kobject_put+0xec/0x20c)
(kobject_put) from [<c0f2f120>] (exynos5_clk_probe+0x154/0x18c)
(exynos5_clk_probe) from [<c058de70>] (platform_drv_probe+0x6c/0xa4)
(platform_drv_probe) from [<c058b7ac>] (really_probe+0x280/0x414)
(really_probe) from [<c058baf4>] (driver_probe_device+0x78/0x1c4)
(driver_probe_device) from [<c0589854>] (bus_for_each_drv+0x74/0xb8)
(bus_for_each_drv) from [<c058b48c>] (__device_attach+0xd4/0x16c)
(__device_attach) from [<c058a638>] (bus_probe_device+0x88/0x90)
(bus_probe_device) from [<c05871fc>] (device_add+0x3dc/0x62c)
(device_add) from [<c075ff10>] (of_platform_device_create_pdata+0x94/0xbc)
(of_platform_device_create_pdata) from [<c07600ec>] (of_platform_bus_create+0x1a8/0x4fc)
(of_platform_bus_create) from [<c0760150>] (of_platform_bus_create+0x20c/0x4fc)
(of_platform_bus_create) from [<c07605f0>] (of_platform_populate+0x84/0x118)
(of_platform_populate) from [<c0f3c964>] (of_platform_default_populate_init+0xa0/0xb8)
(of_platform_default_populate_init) from [<c01031f8>] (do_one_initcall+0x8c/0x404)
Provide a helper which clearly documents the usage of driver_override.
This will allow the helper to be reused later and reduce the amount of
duplicated code.
Convert the platform driver to use the new helper and make the
driver_override field const char (it is not modified by the core).
Bug: 95334746
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 389190b254)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I1ef85ee1f3828e947adec4a45363d8434876aade
Fix some merging errors. Quells the following WARN() output:
Duplicate entry for capability 62
WARNING: CPU: 0 PID: 0 at arch/arm64/kernel/cpufeature.c:958 init_cpu_hwcaps_indirect_list_from_array+0x7c/0x98
Bug: 315739115
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I4e8798cc6395050a54222ec68432fae3c286dd87
CPU errata 2441007 (Cortex-A55) and 2441009 (Cortex-A510) are categorised
as "rare" by Arm and consequently the workaround is not intended to be
deployed in practice as the issue is not expected to occur in real-world
environments.
Given that the cost of the workaround, which issues additional broadcast
TLB invalidation requests, has been shown to impact kswapd significantly
on Pixel devices, disable the workaround following Arm's recommendation.
Bug: 306231846
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I39b6d9736cfa79827321151b45774f62c8d1a747
INFO: type 'struct fscrypt_info' changed
member 'u8 ci_data_unit_bits' was added
member 'u8 ci_data_units_per_block_bits' was added
type 'struct fscrypt_policy_v2' changed
member '__u8 log2_data_unit_size' was added
member changed from '__u8 __reserved[4]' to '__u8 __reserved[3]'
offset changed from 32 to 40
type changed from '__u8[4]' to '__u8[3]'
number of elements changed from 4 to 3
Bug: 299136786
Bug: 302588300
Change-Id: Idbbc2123961a41d395323c72cef67d94bdd17ab0
Signed-off-by: Eric Biggers <ebiggers@google.com>
Until now, fscrypt has always used the filesystem block size as the
granularity of file contents encryption. Two scenarios have come up
where a sub-block granularity of contents encryption would be useful:
1. Inline crypto hardware that only supports a crypto data unit size
that is less than the filesystem block size.
2. Support for direct I/O at a granularity less than the filesystem
block size, for example at the block device's logical block size in
order to match the traditional direct I/O alignment requirement.
(1) first came up with older eMMC inline crypto hardware that only
supports a crypto data unit size of 512 bytes. That specific case
ultimately went away because all systems with that hardware continued
using out of tree code and never actually upgraded to the upstream
inline crypto framework. But, now it's coming back in a new way: some
current UFS controllers only support a data unit size of 4096 bytes, and
there is a proposal to increase the filesystem block size to 16K.
(2) was discussed as a "nice to have" feature, though not essential,
when support for direct I/O on encrypted files was being upstreamed.
Still, the fact that this feature has come up several times does suggest
it would be wise to have available. Therefore, this patch implements it
by using one of the reserved bytes in fscrypt_policy_v2 to allow users
to select a sub-block data unit size. Supported data unit sizes are
powers of 2 between 512 and the filesystem block size, inclusively.
Support is implemented for both the FS-layer and inline crypto cases.
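The rule for supported data unit sizes can be written down as a short
check. This is a sketch of the rule as stated above, not the kernel's
validation code; the function names are hypothetical and the filesystem
block size is assumed to be a power of 2.

```python
def supported_data_unit_sizes(fs_block_size):
    """Powers of 2 from 512 up to and including the filesystem block size."""
    sizes = []
    size = 512
    while size <= fs_block_size:
        sizes.append(size)
        size *= 2
    return sizes


def is_supported_log2(log2_size, fs_block_size):
    """True if 2**log2_size lies between 512 and fs_block_size, inclusive."""
    return log2_size >= 9 and (1 << log2_size) <= fs_block_size


sizes_4k = supported_data_unit_sizes(4096)
sizes_16k = supported_data_unit_sizes(16384)
```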
This patch focuses on the basic support for sub-block data units. Some
things are out of scope for this patch but may be addressed later:
- Supporting sub-block data units in combination with
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64, in most cases. Unfortunately this
combination usually causes data unit indices to exceed 32 bits, and
thus fscrypt_supported_policy() correctly disallows it. The users who
potentially need this combination are using f2fs. To support it, f2fs
would need to provide an option to slightly reduce its max file size.
- Supporting sub-block data units in combination with
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32. This has the same problem
described above, but also it will need special code to make DUN
wraparound still happen on a FS block boundary.
- Supporting use case (2) mentioned above. The encrypted direct I/O
code will need to stop requiring and assuming FS block alignment.
This won't be hard, but it belongs in a separate patch.
- Supporting this feature on filesystems other than ext4 and f2fs.
(Filesystems declare support for it via their fscrypt_operations.)
On UBIFS, sub-block data units don't make sense because UBIFS encrypts
variable-length blocks as a result of compression. CephFS could
support it, but a bit more work would be needed to make the
fscrypt_*_block_inplace functions play nicely with sub-block data
units. I don't think there's a use case for this on CephFS anyway.
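To see why smaller data units push indices past 32 bits, consider the
calculation below. The 16 TiB maximum file size is a made-up figure used
only for illustration; it is not the limit of f2fs or of any other
filesystem.

```python
def index_bits(max_file_size, data_unit_size):
    """Bits needed to hold the largest data unit index in a file."""
    last_index = max_file_size // data_unit_size - 1
    return last_index.bit_length()


MAX_FILE_SIZE = 1 << 44     # hypothetical 16 TiB maximum file size

bits_4k = index_bits(MAX_FILE_SIZE, 4096)   # data unit == 4 KiB block
bits_512 = index_bits(MAX_FILE_SIZE, 512)   # 512-byte sub-block data unit
```

Under this assumption the index just fits in 32 bits with 4 KiB units,
but needs 35 bits with 512-byte units.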
Link: https://lore.kernel.org/r/20230925055451.59499-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Bug: 299136786
Bug: 302588300
(cherry picked from commit 5b11888471806edf699316d4dcb9b426caebbef2)
(Reworked this commit to not change struct fscrypt_operations and not
depend on other commits that changed struct fscrypt_operations. Also
resolved conflicts with the HW-wrapped key support. Also use pages
instead of folios, since older kernels don't have folios.)
Change-Id: Ic3dc56ef3f42d123f812e9037e2cc6f0b24bacc1
Signed-off-by: Eric Biggers <ebiggers@google.com>
Android has carried custom patches that disallow file-backed page
allocation from the CMA area, since it could cause CMA allocation
failure/slowness. However, compaction could still migrate
file-backed pages to the CMA area, which causes trouble for CMA allocation.
This patch checks whether there are file-backed migration source
pages in compaction. If there are, compaction allows only
MIGRATE_MOVABLE pageblocks, not MIGRATE_CMA ones, when selecting
migration target pages (i.e., free pages).
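The target selection rule can be summarised with a small decision
function. This is a simplified model of the rule described above, not the
compaction code: the function name is hypothetical and every pageblock
type other than MIGRATE_MOVABLE and MIGRATE_CMA is simply rejected.

```python
def suitable_target(pageblock_type, has_file_backed_source):
    """Decide whether free pages may be taken from this pageblock type."""
    if pageblock_type == "MIGRATE_MOVABLE":
        return True
    if pageblock_type == "MIGRATE_CMA":
        # CMA pageblocks are skipped when file-backed pages are being moved.
        return not has_file_backed_source
    return False


anon_to_cma = suitable_target("MIGRATE_CMA", has_file_backed_source=False)
file_to_cma = suitable_target("MIGRATE_CMA", has_file_backed_source=True)
file_to_movable = suitable_target("MIGRATE_MOVABLE", has_file_backed_source=True)
```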
[surenb: original patch reworked using compact_control_ext to avoid
breaking frozen ABI]
Bug: 207498240
Bug: 305594365
Change-Id: Ibf30eea6bf24aafa2a75a73cef6084b1c837bd06
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
[ Upstream commit 93995bf4af2c5a99e2a87f0cd5ce547d31eb7630 ]
The expired catchall element is not deactivated and removed in the GC sync
path. This path holds the mutex, so just call nft_setelem_data_deactivate()
and nft_setelem_catchall_remove() before queueing the GC work.
Bug: 310691882
Fixes: 4a9e12ea7e ("netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC")
Reported-by: lonial con <kongln9170@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 13e2d49647)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ic5d1d98fe5a749e759869f0789cbb77c4ab5e6c2
readpages will be triggered on the fuse fs in passthrough mode through
system calls like fadvise. If the daemon isn't aware of the file, this
will likely cause a hang.
For the moment, simply ignore fadvise in this situation.
Bug: 301201239
Test: fuse_test, atest ScopedStorageDeviceTest both pass
Signed-off-by: Paul Lawrence <paullawrence@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:ac9071df3ba6715219a16a44d4711f041b0c25de)
Merged-In: I524a84aeeb1b1593e51264fcc37f7cfa66757168
Change-Id: I524a84aeeb1b1593e51264fcc37f7cfa66757168
Adding the following symbols:
- __traceiter_android_vh_ptep_clear_flush_young
- __tracepoint_android_vh_ptep_clear_flush_young
Bug: 312692863
Change-Id: I5e6232bb1121d3d59ea57380c7950412b1652208
Signed-off-by: Martin Liu <liumartin@google.com>