[ Upstream commit a0864cf32044233e56247fa0eed3ac660f15db9e ]
The flash module has status bits to indicate different thermal
conditions which are called as OTSTx. For each OTSTx status,
there is a recommended total flash current for all channels to
prevent the flash module entering into higher thermal level.
For example, the total flash current should be limited to 1000mA/500mA
respectively when the HW reaches the OTST1/OTST2 thermal level.
Signed-off-by: Fenglin Wu <quic_fenglinw@quicinc.com>
Link: https://lore.kernel.org/r/20240705-qcom_flash_thermal_derating-v3-1-8e2e2783e3a6@quicinc.com
Signed-off-by: Lee Jones <lee@kernel.org>
Stable-dep-of: fab15f57360b ("leds: flash: leds-qcom-flash: Fix registry access after re-bind")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c5f1dde7f731e7bf2e7c169ca42cb4989fc2f8b9 ]
Instead of relying on RTNL, ets_dump() can use READ_ONCE()
annotations, paired with WRITE_ONCE() ones in ets_change().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 87c6efc5ce9c ("net/sched: ets: use old 'nbands' while purging unused classes")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 59b33fab4ca4d7dacc03367082777627e05d0323 ]
After commit 5c70eb5c593d ("net: better track kernel sockets lifetime"),
kernel sockets now use net_passive reference counting. However, commit
95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock after rmmod"")
restored the manual socket refcount manipulation without adapting to this
new mechanism, causing a memory leak.
The issue can be reproduced by[1]:
1. Creating a network namespace
2. Mounting and Unmounting CIFS within the namespace
3. Deleting the namespace
Some memory leaks may appear after a period of time following step 3.
unreferenced object 0xffff9951419f6b00 (size 256):
comm "ip", pid 447, jiffies 4294692389 (age 14.730s)
hex dump (first 32 bytes):
1b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 80 77 c2 44 51 99 ff ff .........w.DQ...
backtrace:
__kmem_cache_alloc_node+0x30e/0x3d0
__kmalloc+0x52/0x120
net_alloc_generic+0x1d/0x30
copy_net_ns+0x86/0x200
create_new_namespaces+0x117/0x300
unshare_nsproxy_namespaces+0x60/0xa0
ksys_unshare+0x148/0x360
__x64_sys_unshare+0x12/0x20
do_syscall_64+0x59/0x110
entry_SYSCALL_64_after_hwframe+0x78/0xe2
...
unreferenced object 0xffff9951442e7500 (size 32):
comm "mount.cifs", pid 475, jiffies 4294693782 (age 13.343s)
hex dump (first 32 bytes):
40 c5 38 46 51 99 ff ff 18 01 96 42 51 99 ff ff @.8FQ......BQ...
01 00 00 00 6f 00 c5 07 6f 00 d8 07 00 00 00 00 ....o...o.......
backtrace:
__kmem_cache_alloc_node+0x30e/0x3d0
kmalloc_trace+0x2a/0x90
ref_tracker_alloc+0x8e/0x1d0
sk_alloc+0x18c/0x1c0
inet_create+0xf1/0x370
__sock_create+0xd7/0x1e0
generic_ip_connect+0x1d4/0x5a0 [cifs]
cifs_get_tcp_session+0x5d0/0x8a0 [cifs]
cifs_mount_get_session+0x47/0x1b0 [cifs]
dfs_mount_share+0xfa/0xa10 [cifs]
cifs_mount+0x68/0x2b0 [cifs]
cifs_smb3_do_mount+0x10b/0x760 [cifs]
smb3_get_tree+0x112/0x2e0 [cifs]
vfs_get_tree+0x29/0xf0
path_mount+0x2d4/0xa00
__se_sys_mount+0x165/0x1d0
Root cause:
When creating kernel sockets, sk_alloc() calls net_passive_inc() for
sockets with sk_net_refcnt=0. The CIFS code manually converts kernel
sockets to user sockets by setting sk_net_refcnt=1, but doesn't call
the corresponding net_passive_dec(). This creates an imbalance in the
net_passive counter, which prevents the network namespace from being
destroyed when its last user reference is dropped. As a result, the
entire namespace and all its associated resources remain allocated.
Timeline of patches leading to this issue:
- commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
namespace.") in v6.12 fixed the original netns UAF by manually
managing socket refcounts
- commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock after
rmmod") in v6.13 attempted to use kernel sockets but introduced
TCP timer issues
- commit 5c70eb5c593d ("net: better track kernel sockets lifetime")
in v6.14-rc5 introduced the net_passive mechanism with
sk_net_refcnt_upgrade() for proper socket conversion
- commit 95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock
after rmmod"") in v6.15-rc3 reverted to manual refcount management
without adapting to the new net_passive changes
Fix this by using sk_net_refcnt_upgrade() which properly handles the
net_passive counter when converting kernel sockets to user sockets.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220343 [1]
Fixes: 95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock after rmmod"")
Cc: stable@vger.kernel.org
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
[ Upstream commit 8ec396d05d1b737c87311fb7311f753b02c2a6b1 ]
Patch series "mm: reinstate ability to map write-sealed memfd mappings
read-only".
In commit 158978945f31 ("mm: perform the mapping_map_writable() check
after call_mmap()") (and preceding changes in the same series) it became
possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only.
Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path
behaviour") unintentionally undid this logic by moving the
mapping_map_writable() check before the shmem_mmap() hook is invoked,
thereby regressing this change.
This series reworks how we both permit write-sealed mappings being mapped
read-only and disallow mprotect() from undoing the write-seal, fixing this
regression.
We also add a regression test to ensure that we do not accidentally
regress this in future.
Thanks to Julian Orth for reporting this regression.
This patch (of 2):
In commit 158978945f31 ("mm: perform the mapping_map_writable() check
after call_mmap()") (and preceding changes in the same series) it became
possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only.
This was previously unnecessarily disallowed, despite the man page
documentation indicating that it would be, thereby limiting the usefulness
of F_SEAL_WRITE logic.
We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE
seal (one which disallows future writes to the memfd) to also be used for
F_SEAL_WRITE.
For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a
read-only mapping to disallow mprotect() from overriding the seal - an
operation performed by seal_check_write(), invoked from shmem_mmap(), the
f_op->mmap() hook used by shmem mappings.
By extending this to F_SEAL_WRITE and critically - checking
mapping_map_writable() to determine if we may map the memfd AFTER we
invoke shmem_mmap() - the desired logic becomes possible. This is because
mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will
have cleared.
Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path
behaviour") unintentionally undid this logic by moving the
mapping_map_writable() check before the shmem_mmap() hook is invoked,
thereby regressing this change.
We reinstate this functionality by moving the check out of shmem_mmap()
and instead performing it in do_mmap() at the point at which VMA flags are
being determined, which seems in any case to be a more appropriate place
in which to make this determination.
In order to achieve this we rework memfd seal logic to allow us access to
this information using existing logic and eliminate the clearing of
VM_MAYWRITE from seal_check_write() which we are performing in do_mmap()
instead.
Link: https://lkml.kernel.org/r/99fc35d2c62bd2e05571cf60d9f8b843c56069e0.1732804776.git.lorenzo.stoakes@oracle.com
Fixes: 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Julian Orth <ju.orth@gmail.com>
Closes: https://lore.kernel.org/all/CAHijbEUMhvJTN9Xw1GmbM266FXXv=U7s4L_Jem5x3AaPZxrYpQ@mail.gmail.com/
Cc: Jann Horn <jannh@google.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
From: Lorenzo Stoakes <lstoakes@gmail.com>
[ Upstream commit e8e17ee90eaf650c855adb0a3e5e965fd6692ff1 ]
Patch series "permit write-sealed memfd read-only shared mappings", v4.
The man page for fcntl() describing memfd file seals states the following
about F_SEAL_WRITE:-
Furthermore, trying to create new shared, writable memory-mappings via
mmap(2) will also fail with EPERM.
With emphasis on 'writable'. In turns out in fact that currently the
kernel simply disallows all new shared memory mappings for a memfd with
F_SEAL_WRITE applied, rendering this documentation inaccurate.
This matters because users are therefore unable to obtain a shared mapping
to a memfd after write sealing altogether, which limits their usefulness.
This was reported in the discussion thread [1] originating from a bug
report [2].
This is a product of both using the struct address_space->i_mmap_writable
atomic counter to determine whether writing may be permitted, and the
kernel adjusting this counter when any VM_SHARED mapping is performed and
more generally implicitly assuming VM_SHARED implies writable.
It seems sensible that we should only update this mapping if VM_MAYWRITE
is specified, i.e. whether it is possible that this mapping could at any
point be written to.
If we do so then all we need to do to permit write seals to function as
documented is to clear VM_MAYWRITE when mapping read-only. It turns out
this functionality already exists for F_SEAL_FUTURE_WRITE - we can
therefore simply adapt this logic to do the same for F_SEAL_WRITE.
We then hit a chicken and egg situation in mmap_region() where the check
for VM_MAYWRITE occurs before we are able to clear this flag. To work
around this, perform this check after we invoke call_mmap(), with careful
consideration of error paths.
Thanks to Andy Lutomirski for the suggestion!
[1]:https://lore.kernel.org/all/20230324133646.16101dfa666f253c4715d965@linux-foundation.org/
[2]:https://bugzilla.kernel.org/show_bug.cgi?id=217238
This patch (of 3):
There is a general assumption that VMAs with the VM_SHARED flag set are
writable. If the VM_MAYWRITE flag is not set, then this is simply not the
case.
Update those checks which affect the struct address_space->i_mmap_writable
field to explicitly test for this by introducing
[vma_]is_shared_maywrite() helper functions.
This remains entirely conservative, as the lack of VM_MAYWRITE guarantees
that the VMA cannot be written to.
Link: https://lkml.kernel.org/r/cover.1697116581.git.lstoakes@gmail.com
Link: https://lkml.kernel.org/r/d978aefefa83ec42d18dfa964ad180dbcde34795.1697116581.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
[isaacmanjarres: resolved merge conflicts due to
due to refactoring that happened in upstream commit
5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour")]
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccab044697980c6c01ab51f43f48f13b8a3e5c33 upstream.
When skb_ext_add(skb, SKB_EXT_MPTCP) fails in mptcp_incoming_options(),
we used to return true, letting the segment proceed through the TCP
receive path without a DSS mapping. Such segments can leave inconsistent
mapping state and trigger a mid-stream fallback to TCP, which in testing
collapsed (by artificially forcing failures in skb_ext_add) throughput
to zero.
Return false instead so the TCP input path drops the skb (see
tcp_data_queue() and step-7 processing). This is the safer choice
under memory pressure: it preserves MPTCP correctness and provides
backpressure to the sender.
Control packets remain unaffected: ACK updates and DATA_FIN handling
happen before attempting the extension allocation, and tcp_reset()
continues to ignore the return value.
With this change, MPTCP continues to work at high throughput if we
artificially inject failures into skb_ext_add.
Fixes: 6787b7e350 ("mptcp: avoid processing packet if a subflow reset")
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-1-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7af76e9d18a9fd6f8611b3313c86c190f9b6a5a7 upstream.
Receiving HSR frame with insufficient space to hold HSR tag in the skb
can result in a crash (kernel BUG):
[ 45.390915] skbuff: skb_under_panic: text:ffffffff86f32cac len:26 put:14 head:ffff888042418000 data:ffff888042417ff4 tail:0xe end:0x180 dev:bridge_slave_1
[ 45.392559] ------------[ cut here ]------------
[ 45.392912] kernel BUG at net/core/skbuff.c:211!
[ 45.393276] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 45.393809] CPU: 1 UID: 0 PID: 2496 Comm: reproducer Not tainted 6.15.0 #12 PREEMPT(undef)
[ 45.394433] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 45.395273] RIP: 0010:skb_panic+0x15b/0x1d0
<snip registers, remove unreliable trace>
[ 45.402911] Call Trace:
[ 45.403105] <IRQ>
[ 45.404470] skb_push+0xcd/0xf0
[ 45.404726] br_dev_queue_push_xmit+0x7c/0x6c0
[ 45.406513] br_forward_finish+0x128/0x260
[ 45.408483] __br_forward+0x42d/0x590
[ 45.409464] maybe_deliver+0x2eb/0x420
[ 45.409763] br_flood+0x174/0x4a0
[ 45.410030] br_handle_frame_finish+0xc7c/0x1bc0
[ 45.411618] br_handle_frame+0xac3/0x1230
[ 45.413674] __netif_receive_skb_core.constprop.0+0x808/0x3df0
[ 45.422966] __netif_receive_skb_one_core+0xb4/0x1f0
[ 45.424478] __netif_receive_skb+0x22/0x170
[ 45.424806] process_backlog+0x242/0x6d0
[ 45.425116] __napi_poll+0xbb/0x630
[ 45.425394] net_rx_action+0x4d1/0xcc0
[ 45.427613] handle_softirqs+0x1a4/0x580
[ 45.427926] do_softirq+0x74/0x90
[ 45.428196] </IRQ>
This issue was found by syzkaller.
The panic happens in br_dev_queue_push_xmit() once it receives a
corrupted skb with ETH header already pushed in linear data. When it
attempts the skb_push() call, there's not enough headroom and
skb_push() panics.
The corrupted skb is put on the queue by HSR layer, which makes a
sequence of unintended transformations when it receives a specific
corrupted HSR frame (with incomplete TAG).
Fix it by dropping and consuming frames that are not long enough to
contain both ethernet and hsr headers.
Alternative fix would be to check for enough headroom before skb_push()
in br_dev_queue_push_xmit().
In the reproducer, this is injected via AF_PACKET, but I don't easily
see why it couldn't be sent over the wire from adjacent network.
Further Details:
In the reproducer, the following network interface chain is set up:
┌────────────────┐ ┌────────────────┐
│ veth0_to_hsr ├───┤ hsr_slave0 ┼───┐
└────────────────┘ └────────────────┘ │
│ ┌──────┐
├─┤ hsr0 ├───┐
│ └──────┘ │
┌────────────────┐ ┌────────────────┐ │ │┌────────┐
│ veth1_to_hsr ┼───┤ hsr_slave1 ├───┘ └┤ │
└────────────────┘ └────────────────┘ ┌┼ bridge │
││ │
│└────────┘
│
┌───────┐ │
│ ... ├──────┘
└───────┘
To trigger the events leading up to crash, reproducer sends a corrupted
HSR frame with incomplete TAG, via AF_PACKET socket on 'veth0_to_hsr'.
The first HSR-layer function to process this frame is
hsr_handle_frame(). It and then checks if the
protocol is ETH_P_PRP or ETH_P_HSR. If it is, it calls
skb_set_network_header(skb, ETH_HLEN + HSR_HLEN), without checking that
the skb is long enough. For the crashing frame it is not, and hence the
skb->network_header and skb->mac_len fields are set incorrectly,
pointing after the end of the linear buffer.
I will call this a BUG#1 and it is what is addressed by this patch. In
the crashing scenario before the fix, the skb continues to go down the
hsr path as follows.
hsr_handle_frame() then calls this sequence
hsr_forward_skb()
fill_frame_info()
hsr->proto_ops->fill_frame_info()
hsr_fill_frame_info()
hsr_fill_frame_info() contains a check that intends to check whether the
skb actually contains the HSR header. But the check relies on the
skb->mac_len field which was erroneously setup due to BUG#1, so the
check passes and the execution continues back in the hsr_forward_skb():
hsr_forward_skb()
hsr_forward_do()
hsr->proto_ops->get_untagged_frame()
hsr_get_untagged_frame()
create_stripped_skb_hsr()
In create_stripped_skb_hsr(), a copy of the skb is created and is
further corrupted by operation that attempts to strip the HSR tag in a
call to __pskb_copy().
The skb enters create_stripped_skb_hsr() with ethernet header pushed in
linear buffer. The skb_pull(skb_in, HSR_HLEN) thus pulls 6 bytes of
ethernet header into the headroom, creating skb_in with a headroom of
size 8. The subsequent __pskb_copy() then creates an skb with headroom
of just 2 and skb->len of just 12, this is how it looks after the copy:
gdb) p skb->len
$10 = 12
(gdb) p skb->data
$11 = (unsigned char *) 0xffff888041e45382 "\252\252\252\252\252!\210\373",
(gdb) p skb->head
$12 = (unsigned char *) 0xffff888041e45380 ""
It seems create_stripped_skb_hsr() assumes that ETH header is pulled
in the headroom when it's entered, because it just pulls HSR header on
top. But that is not the case in our code-path and we end up with the
corrupted skb instead. I will call this BUG#2
*I got confused here because it seems that under no conditions can
create_stripped_skb_hsr() work well, the assumption it makes is not true
during the processing of hsr frames - since the skb_push() in
hsr_handle_frame to skb_pull in hsr_deliver_master(). I wonder whether I
missed something here.*
Next, the execution arrives in hsr_deliver_master(). It calls
skb_pull(ETH_HLEN), which just returns NULL - the SKB does not have
enough space for the pull (as it only has 12 bytes in total at this
point).
*The skb_pull() here further suggests that ethernet header is meant
to be pushed through the whole hsr processing and
create_stripped_skb_hsr() should pull it before doing the HSR header
pull.*
hsr_deliver_master() then puts the corrupted skb on the queue, it is
then picked up from there by bridge frame handling layer and finally
lands in br_dev_queue_push_xmit where it panics.
Cc: stable@kernel.org
Fixes: 48b491a5cc ("net: hsr: fix mac_len checks")
Reported-by: syzbot+a81f2759d022496b40ab@syzkaller.appspotmail.com
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250819082842.94378-1-acsjakub@amazon.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4db9cd554883e051df1840d4d58d636043101034 upstream.
dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr
with the dce_clk_mgr, causing incorrect behaviour on DCE6.
Fix it by removing the extra dce_clk_mgr_construct.
Fixes: 62eab49faa ("drm/amd/display: hide VGH asic specific structs")
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bbddcbe36a686af03e91341b9bbfcca94bd45fb6)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3477c1b0972dc1c8a46f78e8fb1fa6966095b5ec upstream.
amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for
the primary plane. If a commit affects a CRTC but not its primary plane,
it would previously not trigger a refresh cycle or affect LFC, violating
current UAPI semantics.
Fixes e.g. atomic commits affecting only the cursor plane being limited
to the minimum refresh rate.
Don't do this for the legacy cursor ioctls though, it would break the
UAPI semantics for those.
Suggested-by: Xaver Hugl <xaver.hugl@kde.org>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc7bfba95966251b254cb970c21627124da3b7f4)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e58401a24e7b2d4ec619104e1a76590c1284a4c upstream.
Since KFD proc content was moved to kernel debugfs, we can't destroy KFD
debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior
to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens
when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but
kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line
debugfs_remove_recursive(entry->proc_dentry);
tries to remove /sys/kernel/debug/kfd/proc/<pid> while
/sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel
NULL pointer.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0333052d90683d88531558dcfdbf2525cc37c233)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bae62cc989fa99ac9cb564eb573aad916d1eb61 upstream.
Update the client id mapping so the correct clients
get printed when there is a mmhub page fault.
Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2a2681eda73b99a2c1ee8cdb006099ea5d0c2505)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3200144a2fa4209dc084a19941b9b203b43580f0 upstream.
Make sure the interrupt handler is initialized before the interrupt is
registered.
If the IRQ is registered before hfi_create(), it's possible that an
interrupt fires before the handler setup is complete, leading to a NULL
dereference.
This error condition has been observed during system boot on Rb3Gen2.
Fixes: af2c3834c8 ("[media] media: venus: adding core part and helper functions")
Cc: stable@vger.kernel.org
Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez@oss.qualcomm.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Tested-by: Dikshita Agarwal <quic_dikshita@quicinc.com> # RB5
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 640803003cd903cea73dc6a86bf6963e238e2b3f upstream.
Ensure the IRQ is disabled - and all pending handlers completed - before
dismantling the interrupt routing and clearing related pointers.
This prevents any possibility of the interrupt triggering after the
handler context has been invalidated.
Fixes: d96d3f30c0 ("[media] media: venus: hfi: add Venus HFI files")
Cc: stable@vger.kernel.org
Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez@oss.qualcomm.com>
Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Tested-by: Dikshita Agarwal <quic_dikshita@quicinc.com> # RB5
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49befc830daa743e051a65468c05c2ff9e8580e6 upstream.
Add a check to ensure that the packet size does not exceed the number of
available words after reading the packet header from shared memory. This
ensures that the size provided by the firmware is safe to process and
prevent potential out-of-bounds memory access.
Fixes: d96d3f30c0 ("[media] media: venus: hfi: add Venus HFI files")
Cc: stable@vger.kernel.org
Signed-off-by: Vedang Nagar <quic_vnagar@quicinc.com>
Co-developed-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69080ec3d0daba8a894025476c98ab16b5a505a4 upstream.
A call to media_device_init() requires media_device_cleanup() counterpart
to complete cleanup and release any allocated resources.
This has been done in the driver .remove() right from the beginning, but
error paths on .probe() shall also be fixed.
Fixes: a1d7c116fc ("media: camms: Add core files")
Cc: stable@vger.kernel.org
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c92c49fc688cfadacc47ae99b06a31237702e9e upstream.
Both the ACE and CSI driver are missing a mei_cldev_disable() call in
their remove() function.
This causes the mei_cl client to stay part of the mei_device->file_list
list even though its memory is freed by mei_cl_bus_dev_release() calling
kfree(cldev->cl).
This leads to a use-after-free when mei_vsc_remove() runs mei_stop()
which first removes all mei bus devices calling mei_ace_remove() and
mei_csi_remove() followed by mei_cl_bus_dev_release() and then calls
mei_cl_all_disconnect() which walks over mei_device->file_list dereferecing
the just freed cldev->cl.
And mei_vsc_remove() it self is run at shutdown because of the
platform_device_unregister(tp->pdev) in vsc_tp_shutdown()
When building a kernel with KASAN this leads to the following KASAN report:
[ 106.634504] ==================================================================
[ 106.634623] BUG: KASAN: slab-use-after-free in mei_cl_set_disconnected (drivers/misc/mei/client.c:783) mei
[ 106.634683] Read of size 4 at addr ffff88819cb62018 by task systemd-shutdow/1
[ 106.634729]
[ 106.634767] Tainted: [E]=UNSIGNED_MODULE
[ 106.634770] Hardware name: Dell Inc. XPS 16 9640/09CK4V, BIOS 1.12.0 02/10/2025
[ 106.634773] Call Trace:
[ 106.634777] <TASK>
...
[ 106.634871] kasan_report (mm/kasan/report.c:221 mm/kasan/report.c:636)
[ 106.634901] mei_cl_set_disconnected (drivers/misc/mei/client.c:783) mei
[ 106.634921] mei_cl_all_disconnect (drivers/misc/mei/client.c:2165 (discriminator 4)) mei
[ 106.634941] mei_reset (drivers/misc/mei/init.c:163) mei
...
[ 106.635042] mei_stop (drivers/misc/mei/init.c:348) mei
[ 106.635062] mei_vsc_remove (drivers/misc/mei/mei_dev.h:784 drivers/misc/mei/platform-vsc.c:393) mei_vsc
[ 106.635066] platform_remove (drivers/base/platform.c:1424)
Add the missing mei_cldev_disable() calls so that the mei_cl gets removed
from mei_device->file_list before it is freed to fix this.
Fixes: 78876f71b3 ("media: pci: intel: ivsc: Add ACE submodule")
Fixes: 29006e196a ("media: pci: intel: ivsc: Add CSI submodule")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hansg@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7af160aea26c7dc9e6734d19306128cce156ec40 upstream.
In the interrupt handler rain_interrupt(), the buffer full check on
rain->buf_len is performed before acquiring rain->buf_lock. This
creates a Time-of-Check to Time-of-Use (TOCTOU) race condition, as
rain->buf_len is concurrently accessed and modified in the work
handler rain_irq_work_handler() under the same lock.
Multiple interrupt invocations can race, with each reading buf_len
before it becomes full and then proceeding. This can lead to both
interrupts attempting to write to the buffer, incrementing buf_len
beyond its capacity (DATA_SIZE) and causing a buffer overflow.
Fix this bug by moving the spin_lock() to before the buffer full
check. This ensures that the check and the subsequent buffer modification
are performed atomically, preventing the race condition. An corresponding
spin_unlock() is added to the overflow path to correctly release the
lock.
This possible bug was found by an experimental static analysis tool
developed by our team.
Fixes: 0f314f6c2e ("[media] rainshadow-cec: new RainShadow Tech HDMI CEC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e40e0bb778907b2441bff68d73c3eb6b6cd319f upstream.
When an program is streaming (ffplay) and another program (qv4l2)
changes the TV standard from NTSC to PAL, the kernel crashes due to trying
to copy to unmapped memory.
Changing from NTSC to PAL increases the resolution in the usbtv struct,
but the video plane buffer isn't adjusted, so it overflows.
Fixes: 0e0fe3958f ("[media] usbtv: Add support for PAL video source")
Cc: stable@vger.kernel.org
Signed-off-by: Ludwig Disterhof <ludwig@disterhof.eu>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
[hverkuil: call vb2_is_busy instead of vb2_is_streaming]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a0400aca5fa7c6b8ba456c311a460e733571c88 upstream.
It's a common pattern in drivers to free the control handler's resources
and then return the handler's error code on drivers' error handling paths.
Alas, the v4l2_ctrl_handler_free() function also zeroes the error field,
effectively indicating successful return to the caller.
There's no apparent need to touch the error field while releasing the
control handler's resources and cleaning up stale pointers. Not touching
the handler's error field is a more certain way to address this problem
than changing all the users, in which case the pattern would be likely to
re-emerge in new drivers.
Do just that, don't touch the control handler's error field in
v4l2_ctrl_handler_free().
Fixes: 0996517cf8 ("V4L/DVB: v4l2: Add new control handling framework")
Cc: stable@vger.kernel.org
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Hans Verkuil <hverkuil@xs4all.nl>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01350185fe02ae3ea2c12d578e06af0d5186f33e upstream.
The desired clock frequency was correctly set to 400MHz in the device tree
but was lowered by the driver to 300MHz breaking 4K 60Hz content playback.
Fix the issue by removing the driver call to clk_set_rate(), which reduce
the amount of board specific code.
Fixes: 003afda97c ("media: verisilicon: Enable AV1 decoder on rk3588")
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e43442d4994c9e1e202c98129a87e330f7faaed upstream.
The pixel_array control size was calculated incorrectly:
the dimensions were swapped (dims[0] should be the height), and the
values should be the width or height divided by PIXEL_ARRAY_DIV
and rounded up. So don't use roundup, but use DIV_ROUND_UP instead.
This bug is harmless in the sense that nothing will break, except that
it consumes way too much memory for this control.
Fixes: 6bc7643d1b ("media: vivid: add pixel_array test control")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 020f602b068c9ce18d5056d02c8302199377d98d upstream.
Hynix hi556 support 8 test pattern modes:
hi556_test_pattern_menu[] = {
{
"Disabled",
"Solid Colour",
"100% Colour Bars",
"Fade To Grey Colour Bars",
"PN9",
"Gradient Horizontal",
"Gradient Vertical",
"Check Board",
"Slant Pattern",
}
The test pattern is set by a 8-bit register according to the
specification.
+--------+-------------------------------+
| BIT[0] | Solid color |
+--------+-------------------------------+
| BIT[1] | Color bar |
+--------+-------------------------------+
| BIT[2] | Fade to grey color bar |
+--------+-------------------------------+
| BIT[3] | PN9 |
+--------+-------------------------------+
| BIT[4] | Gradient horizontal |
+--------+-------------------------------+
| BIT[5] | Gradient vertical |
+--------+-------------------------------+
| BIT[6] | Check board |
+--------+-------------------------------+
| BIT[7] | Slant pattern |
+--------+-------------------------------+
Based on function above, current test pattern programming is wrong.
This patch fixes it by 'BIT(pattern - 1)'. If pattern is 0, driver
will disable the test pattern generation and set the pattern to 0.
Fixes: e62138403a ("media: hi556: Add support for Hi-556 sensor")
Cc: stable@vger.kernel.org
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aef89c0b2417da79cb2062a95476288f9f203ab0 upstream.
This sd_init() function reads the firmware. The firmware data holds a
series of records and the function reads each record and sends the data
to the device. The request_ihex_firmware() function
calls ihex_validate_fw() which ensures that the total length of all the
records won't read out of bounds of the fw->data[].
However, a potential issue is if there is a single very large
record (larger than PAGE_SIZE) and that would result in memory
corruption. Generally we trust the firmware, but it's always better to
double check.
Fixes: 49b61ec9b5 ("[media] gspca: Add new vicam subdriver")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb22f247f371bd206a88cf0e0c05d80b8b62fb26 upstream.
The following testcase exposed a problem with our read access checks
in get_user() and raw_copy_from_user():
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
int main(int argc, char **argv)
{
unsigned long page_size = sysconf(_SC_PAGESIZE);
char *p = malloc(3 * page_size);
char *p_aligned;
/* initialize memory region. If not initialized, write syscall below will correctly return EFAULT. */
if (1)
memset(p, 'X', 3 * page_size);
p_aligned = (char *) ((((uintptr_t) p) + (2*page_size - 1)) & ~(page_size - 1));
/* Drop PROT_READ protection. Kernel and userspace should fault when accessing that memory region */
mprotect(p_aligned, page_size, PROT_NONE);
/* the following write() should return EFAULT, since PROT_READ was dropped by previous mprotect() */
int ret = write(2, p_aligned, 1);
if (!ret || errno != EFAULT)
printf("\n FAILURE: write() did not returned expected EFAULT value\n");
return 0;
}
Because of the way _PAGE_READ is handled, kernel code never generates
a read access fault when it access a page as the kernel privilege level
is always less than PL1 in the PTE.
This patch reworks the comments in the make_insert_tlb macro to try
to make this clearer.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f6334f4ae9a4e962ba74b026e1d965dfdf8cbef8 upstream.
We use load and stbys,e instructions to trigger memory reference
interruptions without writing to memory. Because of the way read
access support is implemented, read access interruptions are only
triggered at privilege levels 2 and 3. The kernel and gateway
page execute at privilege level 0, so this code never triggers
a read access interruption. Thus, it is currently possible for
user code to execute a LWS compare and swap operation at an
address that is read protected at privilege level 3 (PRIV_USER).
Fix this by probing read access rights at privilege level 3 and
branching to lws_fault if access isn't allowed.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 89f686a0fb6e473a876a9a60a13aec67a62b9a7e upstream.
Because of the way read access support is implemented, read access
interruptions are only triggered at privilege levels 2 and 3. The
kernel executes at privilege level 0, so __get_user() never triggers
a read access interruption (code 26). Thus, it is currently possible
for user code to access a read protected address via a system call.
Fix this by probing read access rights at privilege level 3 (PRIV_USER)
and setting __gu_err to -EFAULT (-14) if access isn't allowed.
Note the cmpiclr instruction does a 32-bit compare because COND macro
doesn't work inside asm.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52ce9406a9625c4498c4eaa51e7a7ed9dcb9db16 upstream.
The local name used in cache.c conflicts the declaration in
include/asm-generic/tlb.h.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 802e55488bc2cc1ab6423b720255a785ccac42ce upstream.
When a PTE is changed, we need to flush the PTE. set_pte_at()
was lost in the folio update. PA-RISC version is the same as
the generic version.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91428ca9320edbab1211851d82429d33b9cd73ef upstream.
Because of the way the _PAGE_READ is handled in the parisc PTE, an
access interruption is not generated when the kernel reads from a
region where the _PAGE_READ is zero. The current code was written
assuming read access faults would also occur in the kernel.
This change adds user access checks to raw_copy_from_user(). The
prober_user() define checks whether user code has read access to
a virtual address. Note that page faults are not handled in the
exception support for the probe instruction. For this reason, we
precede the probe by a ldb access check.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6bcbce3359619d05bf387d4f5cc3af63668dbaa upstream.
After commit 13a4b7fb6260 ("pmdomain: core: Leave powered-on genpds on
until late_initcall_sync") was applied, the Tegra210 Jetson TX1 board
failed to boot. Looking into this issue, before this commit was applied,
if any of the Tegra power-domains were in 'on' state when the kernel
booted, they were being turned off by the genpd core before any driver
had chance to request them. This was purely by luck and a consequence of
the power-domains being turned off earlier during boot. After this
commit was applied, any power-domains in the 'on' state are kept on for
longer during boot and therefore, may never transitioned to the off
state before they are requested/used. The hang on the Tegra210 Jetson
TX1 is caused because devices in some power-domains are accessed without
the power-domain being turned off and on, indicating that the
power-domain is not in a completely on state.
>From reviewing the Tegra PMC driver code, if a power-domain is in the
'on' state there is no guarantee that all the necessary clocks
associated with the power-domain are on and even if they are they would
not have been requested via the clock framework and so could be turned
off later. Some power-domains also have a 'clamping' register that needs
to be configured as well. In short, if a power-domain is already 'on' it
is difficult to know if it has been configured correctly. Given that the
power-domains happened to be switched off during boot previously, to
ensure that they are in a good known state on boot, fix this by
switching off any power-domains that are on initially when registering
the power-domains with the genpd framework.
Note that commit 05cfb988a4 ("soc/tegra: pmc: Initialise resets
associated with a power partition") updated the
tegra_powergate_of_get_resets() function to pass the 'off' to ensure
that the resets for the power-domain are in the correct state on boot.
However, now that we may power off a domain on boot, if it is on, it is
better to move this logic into the tegra_powergate_add() function so
that there is a single place where we are handling the initial state of
the power-domain.
Fixes: a38045121b ("soc/tegra: pmc: Add generic PM domain support")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250731121832.213671-1-jonathanh@nvidia.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d98cf4632258720f18265a058e62fde120c0151 upstream.
Both jbd2_log_do_checkpoint() and jbd2_journal_shrink_checkpoint_list()
periodically release j_list_lock after processing a batch of buffers to
avoid long hold times on the j_list_lock. However, since both functions
contend for j_list_lock, the combined time spent waiting and processing
can be significant.
jbd2_journal_shrink_checkpoint_list() explicitly calls cond_resched() when
need_resched() is true to avoid softlockups during prolonged operations.
But jbd2_log_do_checkpoint() only exits its loop when need_resched() is
true, relying on potentially sleeping functions like __flush_batch() or
wait_on_buffer() to trigger rescheduling. If those functions do not sleep,
the kernel may hit a softlockup.
watchdog: BUG: soft lockup - CPU#3 stuck for 156s! [kworker/u129:2:373]
CPU: 3 PID: 373 Comm: kworker/u129:2 Kdump: loaded Not tainted 6.6.0+ #10
Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.27 06/13/2017
Workqueue: writeback wb_workfn (flush-7:2)
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : native_queued_spin_lock_slowpath+0x358/0x418
lr : jbd2_log_do_checkpoint+0x31c/0x438 [jbd2]
Call trace:
native_queued_spin_lock_slowpath+0x358/0x418
jbd2_log_do_checkpoint+0x31c/0x438 [jbd2]
__jbd2_log_wait_for_space+0xfc/0x2f8 [jbd2]
add_transaction_credits+0x3bc/0x418 [jbd2]
start_this_handle+0xf8/0x560 [jbd2]
jbd2__journal_start+0x118/0x228 [jbd2]
__ext4_journal_start_sb+0x110/0x188 [ext4]
ext4_do_writepages+0x3dc/0x740 [ext4]
ext4_writepages+0xa4/0x190 [ext4]
do_writepages+0x94/0x228
__writeback_single_inode+0x48/0x318
writeback_sb_inodes+0x204/0x590
__writeback_inodes_wb+0x54/0xf8
wb_writeback+0x2cc/0x3d8
wb_do_writeback+0x2e0/0x2f8
wb_workfn+0x80/0x2a8
process_one_work+0x178/0x3e8
worker_thread+0x234/0x3b8
kthread+0xf0/0x108
ret_from_fork+0x10/0x20
So explicitly call cond_resched() in jbd2_log_do_checkpoint() to avoid
softlockup.
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://patch.msgid.link/20250812063752.912130-1-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 77de19b6867f2740cdcb6c9c7e50d522b47847a4 upstream.
As Jiaming Zhang reported:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x1c1/0x2a0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0x17e/0x800 mm/kasan/report.c:480
kasan_report+0x147/0x180 mm/kasan/report.c:593
data_blkaddr fs/f2fs/f2fs.h:3053 [inline]
f2fs_data_blkaddr fs/f2fs/f2fs.h:3058 [inline]
f2fs_get_dnode_of_data+0x1a09/0x1c40 fs/f2fs/node.c:855
f2fs_reserve_block+0x53/0x310 fs/f2fs/data.c:1195
prepare_write_begin fs/f2fs/data.c:3395 [inline]
f2fs_write_begin+0xf39/0x2190 fs/f2fs/data.c:3594
generic_perform_write+0x2c7/0x910 mm/filemap.c:4112
f2fs_buffered_write_iter fs/f2fs/file.c:4988 [inline]
f2fs_file_write_iter+0x1ec8/0x2410 fs/f2fs/file.c:5216
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x546/0xa90 fs/read_write.c:686
ksys_write+0x149/0x250 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x3d0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The root cause is in the corrupted image, there is a dnode has the same
node id w/ its inode, so during f2fs_get_dnode_of_data(), it tries to
access block address in dnode at offset 934, however it parses the dnode
as inode node, so that get_dnode_addr() returns 360, then it tries to
access page address from 360 + 934 * 4 = 4096 w/ 4 bytes.
To fix this issue, let's add sanity check for node id of all direct nodes
during f2fs_get_dnode_of_data().
Cc: stable@kernel.org
Reported-by: Jiaming Zhang <r772577952@gmail.com>
Closes: https://groups.google.com/g/syzkaller/c/-ZnaaOOfO3M
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>