mirror of
https://github.com/hardkernel/linux.git
synced 2026-06-09 20:32:04 +09:00
4b117370d1d12cc00bb14ee2f400b1a91e32bb69
1218854 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
4b117370d1 |
LoongArch: BPF: Don't sign extend function return value
[ Upstream commit 5d47ec2e6f4c64e30e392cfe9532df98c9beb106 ]
The `cls_redirect` test triggers a kernel panic like:
# ./test_progs -t cls_redirect
Can't find bpf_testmod.ko kernel module: -2
WARNING! Selftests relying on bpf_testmod.ko will be skipped.
[ 30.938489] CPU 3 Unable to handle kernel paging request at virtual address fffffffffd814de0, era == ffff800002009fb8, ra == ffff800002009f9c
[ 30.939331] Oops[#1]:
[ 30.939513] CPU: 3 PID: 1260 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
[ 30.939732] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
[ 30.939901] pc ffff800002009fb8 ra ffff800002009f9c tp 9000000104da4000 sp 9000000104da7ab0
[ 30.940038] a0 fffffffffd814de0 a1 9000000104da7a68 a2 0000000000000000 a3 9000000104da7c10
[ 30.940183] a4 9000000104da7c14 a5 0000000000000002 a6 0000000000000021 a7 00005555904d7f90
[ 30.940321] t0 0000000000000110 t1 0000000000000000 t2 fffffffffd814de0 t3 0004c4b400000000
[ 30.940456] t4 ffffffffffffffff t5 00000000c3f63600 t6 0000000000000000 t7 0000000000000000
[ 30.940590] t8 000000000006d803 u0 0000000000000020 s9 9000000104da7b10 s0 900000010504c200
[ 30.940727] s1 fffffffffd814de0 s2 900000010504c200 s3 9000000104da7c10 s4 9000000104da7ad0
[ 30.940866] s5 0000000000000000 s6 90000000030e65bc s7 9000000104da7b44 s8 90000000044f6fc0
[ 30.941015] ra: ffff800002009f9c bpf_prog_846803e5ae81417f_cls_redirect+0xa0/0x590
[ 30.941535] ERA: ffff800002009fb8 bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
[ 30.941696] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 30.942224] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 30.942330] EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[ 30.942453] ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
[ 30.942612] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
[ 30.942764] BADV: fffffffffd814de0
[ 30.942854] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
[ 30.942974] Modules linked in:
[ 30.943078] Process test_progs (pid: 1260, threadinfo=00000000ce303226, task=000000007d10bb76)
[ 30.943306] Stack : 900000010a064000 90000000044f6fc0 9000000104da7b48 0000000000000000
[ 30.943495] 0000000000000000 9000000104da7c14 9000000104da7c10 900000010504c200
[ 30.943626] 0000000000000001 ffff80001b88c000 9000000104da7b70 90000000030e6668
[ 30.943785] 0000000000000000 9000000104da7b58 ffff80001b88c048 9000000003d05000
[ 30.943936] 900000000303ac88 0000000000000000 0000000000000000 9000000104da7b70
[ 30.944091] 0000000000000000 0000000000000001 0000000731eeab00 0000000000000000
[ 30.944245] ffff80001b88c000 0000000000000000 0000000000000000 54b99959429f83b8
[ 30.944402] ffff80001b88c000 90000000044f6fc0 9000000101d70000 ffff80001b88c000
[ 30.944538] 000000000000005a 900000010504c200 900000010a064000 900000010a067000
[ 30.944697] 9000000104da7d88 0000000000000000 9000000003d05000 90000000030e794c
[ 30.944852] ...
[ 30.944924] Call Trace:
[ 30.945120] [<ffff800002009fb8>] bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
[ 30.945650] [<90000000030e6668>] bpf_test_run+0x1ec/0x2f8
[ 30.945958] [<90000000030e794c>] bpf_prog_test_run_skb+0x31c/0x684
[ 30.946065] [<90000000026d4f68>] __sys_bpf+0x678/0x2724
[ 30.946159] [<90000000026d7288>] sys_bpf+0x20/0x2c
[ 30.946253] [<90000000032dd224>] do_syscall+0x7c/0x94
[ 30.946343] [<9000000002541c5c>] handle_syscall+0xbc/0x158
[ 30.946492]
[ 30.946549] Code: 0015030e 5c0009c0 5001d000 <28c00304> 02c00484 29c00304 00150009 2a42d2e4 0280200d
[ 30.946793]
[ 30.946971] ---[ end trace 0000000000000000 ]---
[ 32.093225] Kernel panic - not syncing: Fatal exception in interrupt
[ 32.093526] Kernel relocated by 0x2320000
[ 32.093630] .text @ 0x9000000002520000
[ 32.093725] .data @ 0x9000000003400000
[ 32.093792] .bss @ 0x9000000004413200
[ 34.971998] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
This is because we sign-extend function return values. When subprog
mode is enabled, we have:
cls_redirect()
-> get_global_metrics() returns pcpu ptr 0xfffffefffc00b480
The pointer returned is later sign-extended to 0xfffffffffc00b480 at
`BPF_JMP | BPF_EXIT`. During the BPF prog run, this triggers an unhandled
page fault and a kernel panic.
Drop the unnecessary sign extension on return values, like other
architectures do.
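The corruption described above can be reproduced in plain C. This is a minimal user-space sketch, not the JIT code: `sign_extend_low32()` is a hypothetical helper that models a 64-bit return value being treated as a signed 32-bit quantity, and the input is the per-CPU pointer quoted in this commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): keep the low 32 bits of a
 * 64-bit value and sign-extend them back to 64 bits. This models a
 * pointer-sized return value being treated as a signed 32-bit int.
 * The narrowing cast relies on two's-complement wraparound.
 */
uint64_t sign_extend_low32(uint64_t value)
{
	return (uint64_t)(int64_t)(int32_t)(uint32_t)value;
}

/* Worked example using the per-CPU pointer quoted in the commit message. */
void sign_extend_low32_demo(void)
{
	uint64_t pcpu_ptr = 0xfffffefffc00b480ULL;

	/* Bits 32..63 are overwritten with copies of bit 31. */
	assert(sign_extend_low32(pcpu_ptr) == 0xfffffffffc00b480ULL);
	/* Small non-negative values are unaffected. */
	assert(sign_extend_low32(0x1234) == 0x1234);
}
```

Only values whose upper 32 bits are not all copies of bit 31 are damaged, which is consistent with the failure showing up only once a subprog returns a kernel pointer.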
With this change, we have:
# ./test_progs -t cls_redirect
Can't find bpf_testmod.ko kernel module: -2
WARNING! Selftests relying on bpf_testmod.ko will be skipped.
#51/1 cls_redirect/cls_redirect_inlined:OK
#51/2 cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
#51/3 cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
#51/4 cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
#51/5 cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
#51/6 cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
#51/7 cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
#51/8 cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
#51/9 cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
#51/10 cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
#51/11 cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
#51/12 cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
#51/13 cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
#51/14 cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
#51/15 cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
#51/16 cls_redirect/cls_redirect_subprogs:OK
#51/17 cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
#51/18 cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
#51/19 cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
#51/20 cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
#51/21 cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
#51/22 cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
#51/23 cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
#51/24 cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
#51/25 cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
#51/26 cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
#51/27 cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
#51/28 cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
#51/29 cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
#51/30 cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
#51/31 cls_redirect/cls_redirect_dynptr:OK
#51/32 cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
#51/33 cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
#51/34 cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
#51/35 cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
#51/36 cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
#51/37 cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
#51/38 cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
#51/39 cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
#51/40 cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
#51/41 cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
#51/42 cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
#51/43 cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
#51/44 cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
#51/45 cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
#51 cls_redirect:OK
Summary: 1/45 PASSED, 0 SKIPPED, 0 FAILED
Fixes:
|
||
|
|
3275410b13 |
LoongArch: BPF: Don't sign extend memory load operand
[ Upstream commit fe5757553bf9ebe45ae8ecab5922f6937c8d8dfc ]
The `cgrp_local_storage` test triggers a kernel panic like:
# ./test_progs -t cgrp_local_storage
Can't find bpf_testmod.ko kernel module: -2
WARNING! Selftests relying on bpf_testmod.ko will be skipped.
[ 550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00
[ 550.931781] Oops[#1]:
[ 550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
[ 550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
[ 550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0
[ 550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558
[ 550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118
[ 550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0
[ 550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020
[ 550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700
[ 550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8
[ 550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050
[ 550.933520] ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200
[ 550.933911] ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
[ 550.934105] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 550.934596] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 550.934712] EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[ 550.934836] ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
[ 550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
[ 550.935097] BADV: 0000000000000080
[ 550.935181] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
[ 550.935291] Modules linked in:
[ 550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55)
[ 550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001
[ 550.935844] 9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034
[ 550.935990] 0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09
[ 550.936175] 0000000000000001 0000000000000000 9000000108353ec0 0000000000000118
[ 550.936314] 9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000
[ 550.936479] 0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288
[ 550.936635] 00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003
[ 550.936779] 9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c
[ 550.936939] 0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0
[ 550.937083] ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558
[ 550.937224] ...
[ 550.937299] Call Trace:
[ 550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
[ 550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154
[ 550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200
[ 550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94
[ 550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158
[ 550.938477]
[ 550.938607] Code: 580009ae 50016000 262402e4 <28c20085> 14092084 03a00084 16000024 03240084 00150006
[ 550.938851]
[ 550.939021] ---[ end trace 0000000000000000 ]---
Further investigation shows that this panic is triggered by memory
load operations:
ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
BPF_LOCAL_STORAGE_GET_F_CREATE);
The expression `task->cgroups->dfl_cgrp` involves two memory loads.
Since the field offset fits in imm12 or imm14, we use ldd or ldptrd
instructions. But both instructions have the side effect of
sign-extending the imm operand. As a result, we get the wrong addresses
and a panic is inevitable.
Use the generic ldxd instruction to avoid this kind of issue.
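How a sign-extended immediate turns a valid offset into a wrong address can be sketched in user-space C. The helper names, the base address and the offsets below are made up for illustration; the actual JIT encoding is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: interpret the low `bits` bits of `field` as a
 * two's-complement signed immediate, the way a load instruction with a
 * signed offset field would.
 */
int64_t sign_extend_imm(uint32_t field, unsigned int bits)
{
	uint64_t mask, sign;

	assert(bits > 0 && bits < 32);
	mask = (1ULL << bits) - 1;
	sign = 1ULL << (bits - 1);
	/* (x ^ sign) - sign is the usual branch-free sign extension. */
	return (int64_t)(((uint64_t)field & mask) ^ sign) - (int64_t)sign;
}

/* Address a load would use if its immediate field is sign-extended. */
uint64_t effective_address(uint64_t base, uint32_t field, unsigned int bits)
{
	return base + (uint64_t)sign_extend_imm(field, bits);
}

void imm_sign_extension_demo(void)
{
	/* 0x7ff has the top bit of a 12-bit field clear: stays positive. */
	assert(sign_extend_imm(0x7ff, 12) == 2047);
	/* 0x800 fits in 12 bits but has the top bit set: becomes -2048. */
	assert(sign_extend_imm(0x800, 12) == -2048);
	/* Intended base + 0x800 = 0x1800, but the load sees base - 0x800. */
	assert(effective_address(0x1000, 0x800, 12) == 0x800);
}
```

A register-indexed load takes the full 64-bit offset from a register, so no immediate field is reinterpreted, which is the reason given above for switching to ldxd.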
With this change, we have:
# ./test_progs -t cgrp_local_storage
Can't find bpf_testmod.ko kernel module: -2
WARNING! Selftests relying on bpf_testmod.ko will be skipped.
test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
#48/1 cgrp_local_storage/tp_btf:OK
test_attach_cgroup:PASS:skel_open 0 nsec
test_attach_cgroup:PASS:prog_attach 0 nsec
test_attach_cgroup:PASS:prog_attach 0 nsec
libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
test_attach_cgroup:FAIL:prog_attach unexpected error: -524
#48/2 cgrp_local_storage/attach_cgroup:FAIL
test_recursion:PASS:skel_open_and_load 0 nsec
libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
libbpf: prog 'on_lookup': failed to auto-attach: -524
test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
#48/3 cgrp_local_storage/recursion:FAIL
#48/4 cgrp_local_storage/negative:OK
#48/5 cgrp_local_storage/cgroup_iter_sleepable:OK
test_yes_rcu_lock:PASS:skel_open 0 nsec
test_yes_rcu_lock:PASS:skel_load 0 nsec
libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
#48/6 cgrp_local_storage/yes_rcu_lock:FAIL
#48/7 cgrp_local_storage/no_rcu_lock:OK
#48 cgrp_local_storage:FAIL
All error logs:
test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
test_attach_cgroup:PASS:skel_open 0 nsec
test_attach_cgroup:PASS:prog_attach 0 nsec
test_attach_cgroup:PASS:prog_attach 0 nsec
libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
test_attach_cgroup:FAIL:prog_attach unexpected error: -524
#48/2 cgrp_local_storage/attach_cgroup:FAIL
test_recursion:PASS:skel_open_and_load 0 nsec
libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
libbpf: prog 'on_lookup': failed to auto-attach: -524
test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
#48/3 cgrp_local_storage/recursion:FAIL
test_yes_rcu_lock:PASS:skel_open 0 nsec
test_yes_rcu_lock:PASS:skel_load 0 nsec
libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
#48/6 cgrp_local_storage/yes_rcu_lock:FAIL
#48 cgrp_local_storage:FAIL
Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED
No more panics (the test still fails because of the lack of BPF
trampoline support, which I am actively working on).
Fixes:
|
||
|
|
0fdd1b8848 |
perf vendor events arm64: AmpereOne: Add missing DefaultMetricgroupName fields
[ Upstream commit 90fe70d4e23cb57253d2668a171d5695c332deb7 ]
AmpereOne metrics were missing DefaultMetricgroupName from metrics with
"Default" in the group name, causing perf to segfault. Add the missing
field to address the issue.
Fixes: 59faeaf80d02 ("perf vendor events arm64: Fix for AmpereOne metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20231201021550.1109196-2-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
f78fff4648 |
misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write
[ Upstream commit ee6236027218f8531916f1c5caa5dc330379f287 ]
Clang static analyzer complains that the value stored to 'rets' is never
read. Set 'buf_len = -EOVERFLOW' to make sure we can return '-EOVERFLOW'.
Fixes:
|
||
|
|
e2365ead01 |
misc: mei: client.c: return negative error code in mei_cl_write
[ Upstream commit 8f06aee8089cf42fd99a20184501bd1347ce61b9 ]
mei_msg_hdr_init() returns a negative error code, so rets should be
'PTR_ERR(mei_hdr)' rather than '-PTR_ERR(mei_hdr)'.
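The sign problem can be shown with user-space stand-ins for the kernel's ERR_PTR()/PTR_ERR() helpers. `err_ptr()` and `ptr_err()` below are simplified, hypothetical re-implementations for illustration only, and ENOMEM is an arbitrary example error.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for ERR_PTR(): encode a negative errno in a pointer. */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Simplified stand-in for PTR_ERR(): recover the encoded errno. */
long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

void ptr_err_sign_demo(void)
{
	/* What a failing header-init routine would hand back. */
	void *hdr = err_ptr(-ENOMEM);

	/* The decoded value is already negative, as callers expect. */
	assert(ptr_err(hdr) == -ENOMEM);
	/* Negating it again yields a positive number. */
	assert(-ptr_err(hdr) == ENOMEM);
}
```

In a function whose positive return values are meaningful, a positive errno like this would not be recognised as a failure by callers that test for `ret < 0`.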
Fixes:
|
||
|
|
c541d0edd8 |
coresight: ultrasoc-smb: Fix uninitialized before use buf_hw_base
[ Upstream commit 862c135bde8bc185e8aae2110374175e6a1b6ed5 ]
In smb_reset_buffer, the sdb->buf_hw_base variable is used before it has
been initialized; the initialization happens in smb_init_data_buffer.
And the SMB registers are set in smb_config_inport.
So move the call after smb_config_inport.
Fixes:
|
||
|
|
ab5091e1cc |
coresight: ultrasoc-smb: Config SMB buffer before register sink
[ Upstream commit 830a7f54db102c889a3fe1c0a225f369ac05f07f ]
The SMB driver registers the enable/disable sysfs interface in function
smb_register_sink(), however the buffer depends on the following
configuration to work well. So it is possible for a user to access a
buffer that has not been reset.
Move the config buffer operation to before register_sink().
Ignore the return value if smb_config_inport() fails. That only causes
disabling the hardware trace path to fail and should not affect removal
of the SMB driver. So we make smb_remove() return success.
Fixes:
|
||
|
|
ace850bd86 |
coresight: ultrasoc-smb: Fix sleep while close preempt in enable_smb
[ Upstream commit b8411287aef4a994eff0c68f5597910c4194dfe3 ]
When we enable the SMB via perf, the perf scheduler calls perf_ctx_lock()
to disable preemption in event_function_call(). But SMB::enable_smb() uses
a mutex to protect the critical section, which may sleep.
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 153023, name: perf
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffa2983f5c5f40>] copy_process+0xae8/0x2b48
softirqs last enabled at (0): [<ffffa2983f5c5f40>] copy_process+0xae8/0x2b48
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 2 PID: 153023 Comm: perf Kdump: loaded Tainted: G W O 6.5.0-rc4+ #1
Call trace:
...
__mutex_lock+0xbc/0xa70
mutex_lock_nested+0x34/0x48
smb_update_buffer+0x58/0x360 [ultrasoc_smb]
etm_event_stop+0x204/0x2d8 [coresight]
etm_event_del+0x1c/0x30 [coresight]
event_sched_out+0x17c/0x3b8
group_sched_out.part.0+0x5c/0x208
__perf_event_disable+0x15c/0x210
event_function+0xe0/0x230
remote_function+0xb4/0xe8
generic_exec_single+0x160/0x268
smp_call_function_single+0x20c/0x2a0
event_function_call+0x20c/0x220
_perf_event_disable+0x5c/0x90
perf_event_for_each_child+0x58/0xc0
_perf_ioctl+0x34c/0x1250
perf_ioctl+0x64/0x98
...
Use a spinlock instead of a mutex to serialize access to the driver data.
The function copy_to_user() may sleep, so it cannot be called in spinlock
context, and we can't simply swap the lock in smb_read(). But we can ensure
that only one user gets the SMB device fd by smb_open(), so remove the
locks from smb_read(); buffer synchronization is guaranteed by the user.
Fixes:
|
||
|
|
359d3fbcbc |
hwtracing: hisi_ptt: Add dummy callback pmu::read()
[ Upstream commit 55e0a2fb0cb5ab7c9c99c1ad4d3e6954de8b73a0 ]
When tracing is started with the perf option "-C $cpu" and immediately
stopped with SIGTERM or another signal, the perf core will invoke
pmu::read() even though the driver doesn't implement it. Add a dummy
pmu::read() to avoid any issues.
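The idea of a dummy callback can be sketched outside the kernel. All names below (`sketch_pmu`, `sketch_event`, `sketch_core_read()`) are hypothetical and only model a core that calls the hook without checking it; they are not the perf API.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_event {
	int count;
};

/* Hypothetical ops table with a single read hook. */
struct sketch_pmu {
	void (*read)(struct sketch_event *event);
};

/* Intentionally empty: the device has no counter value to fold in. */
void sketch_pmu_read_dummy(struct sketch_event *event)
{
	(void)event;
}

/* Models a core that invokes the hook unconditionally. */
void sketch_core_read(const struct sketch_pmu *pmu, struct sketch_event *event)
{
	/* With .read left NULL this call would jump through a NULL pointer. */
	pmu->read(event);
}

void dummy_callback_demo(void)
{
	struct sketch_pmu pmu = { .read = sketch_pmu_read_dummy };
	struct sketch_event event = { .count = 7 };

	assert(pmu.read != NULL);
	sketch_core_read(&pmu, &event);
	/* The dummy leaves the event untouched. */
	assert(event.count == 7);
}
```

Providing a no-op keeps the caller simple, at the cost of one empty function per driver that has nothing to read.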
Fixes:
|
||
|
|
2f6b1527db |
coresight: Fix crash when Perf and sysfs modes are used concurrently
[ Upstream commit 287e82cf69aa264a52bc37591bd0eb407e20f85c ]
Partially revert the change in commit
|
||
|
|
1b5d156c24 |
coresight: etm4x: Remove bogous __exit annotation for some functions
[ Upstream commit 348ddab81f7b0983d9fb158df910254f08d3f887 ]
etm4_platform_driver (which lives in ".data") contains a reference to
etm4_remove_platform_dev(). So the latter must not be marked with __exit,
which results in the function being discarded for a build with
CONFIG_CORESIGHT_SOURCE_ETM4X=y, which in turn makes the remove pointer
contain invalid data.
etm4x_amba_driver referencing etm4_remove_amba() has the same issue.
Drop the __exit annotations for the two affected functions and a third
one that is called by the other two.
For reasons I don't understand this isn't caught by building with
CONFIG_DEBUG_SECTION_MISMATCH=y.
Fixes:
|
||
|
|
b9cc170842 |
arm64: dts: mediatek: mt8186: Change gpu speedbin nvmem cell name
commit 59fa1e51ba54e1f513985a8177969b62973f7fd5 upstream.
MT8186's GPU speedbin value must be interpreted, or the value will not
be meaningful.
Use the correct "gpu-speedbin" nvmem cell name for the GPU speedbin to
allow triggering the cell info fixup handler, hence feeding the right
speedbin number to the users.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
b6eccbcb1b |
arm64: dts: mediatek: mt8186: fix clock names for power domains
commit 9adf7580f6d498a5839e02fa1d1535e934364602 upstream.
Clocks for each power domain are split into big categories: pd clocks
and subsys clocks.
According to the binding, all clocks which have a dash '-' in their name
are treated as subsys clocks, and must be placed at the end of the list.
The other clocks which are pd clocks must come first.
Fixed the naming and the placing of all clocks in the power domains.
For the avoidance of doubt, prefixed all subsys clocks with the 'subsys'
prefix. The binding does not enforce strict clock names, the driver
uses them in bulk, only making a difference for pd clocks vs subsys clocks.
The above problem appears to be trivial, however, it leads to incorrect
power up and power down sequences of the power domains, because some
clocks will be mistakenly taken for subsys clocks and vice versa.
One consequence is that if the DIS power domain powers down and back up
during the boot process, when it comes back up, there are still
transactions left on the bus, which makes the display inoperable.
Some of the clocks for the DIS power domain were wrongly using '_' instead
of '-', which again caused these clocks to be treated as pd clocks instead
of subsys clocks.
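The naming rule described above (a dash in the name means subsys clock, anything else is a pd clock) can be sketched as follows. This is an illustration only: `classify_clocks()` and the clock names in the demo are hypothetical and are not taken from the driver or the device tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct clk_split {
	size_t pd_clks;
	size_t subsys_clks;
};

/* Count clocks per category: a '-' anywhere in the name means subsys. */
struct clk_split classify_clocks(const char *const *names, size_t count)
{
	struct clk_split split = { 0, 0 };
	size_t i;

	for (i = 0; i < count; i++) {
		if (strchr(names[i], '-'))
			split.subsys_clks++;
		else
			split.pd_clks++;
	}
	return split;
}

void classify_clocks_demo(void)
{
	/* Made-up names; the last one uses '_' where '-' was intended. */
	const char *const names[] = { "disp", "subsys-smi", "subsys_rdma" };
	struct clk_split split = classify_clocks(names, 3);

	/* The misspelled entry is counted as a pd clock. */
	assert(split.pd_clks == 2);
	assert(split.subsys_clks == 1);
}
```

With a rule this simple, a single wrong separator character silently moves a clock into the other category, which matches the failure mode described for the DIS power domain.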
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
2e465268df |
arm64: dts: mediatek: mt8183-evb: Fix unit_address_vs_reg warning on ntc
commit 9dea1c724fc36643e83216c1f5a26613412150db upstream.
The NTC is defined as ntc@0 but it doesn't need any address at all.
Fix the unit_address_vs_reg warning by dropping the unit address: since
the node name has to be generic also fully rename it from ntc@0 to
thermal-sensor.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
bfff27fb5d |
arm64: dts: mediatek: mt8183: Move thermal-zones to the root node
commit 5a60d63439694590cd5ab1f998fc917ff7ba1c1d upstream.
The thermal zones are not a soc bus device: move it to the root
node to solve simple_bus_reg warnings.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
d97373c3b1 |
arm64: dts: mediatek: mt8183: Fix unit address for scp reserved memory
commit 19cba9a6c071db57888dc6b2ec1d9bf8996ea681 upstream.
The reserved memory for scp had the node name "scp_mem_region" and no
unit address: change the name to "memory@(address)".
This fixes a unit_address_vs_reg warning.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
9c4ae4801f |
arm64: dts: mediatek: mt8195: Fix PM suspend/resume with venc clocks
commit 61b94d54421a1f3670ddd5396ec70afe833e9405 upstream.
Before suspending the LARBs we're making sure that any operation is
done: this never happens because we are unexpectedly unclocking the
LARB20 before executing the suspend handler for the MediaTek Smart
Multimedia Interface (SMI) and the cause of this is incorrect clocks
on this LARB.
Fix this issue by changing the Local Arbiter 20 (used by the video
encoder secondary core) apb clock to CLK_VENC_CORE1_VENC; furthermore,
in order to make sure that both the PM resume and video encoder
operation are stable, add the CLK_VENC(_CORE1)_LARB clock to the VENC
(main core) and VENC_CORE1 power domains, as this IP cannot communicate
with the rest of the system (the AP) without local arbiter clocks being
operational.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
1253026694 |
arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names
commit 24165c5dad7ba7c7624d05575a5e0cc851396c71 upstream.
Fix a unit_address_vs_reg warning for the USB VBUS fixed regulators
by renaming the regulator nodes from regulator@{0,1} to regulator-usb-p0
and regulator-usb-p1.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
ac9a2f55bf |
arm64: dts: mediatek: cherry: Fix interrupt cells for MT6360 on I2C7
commit 5943b8f7449df9881b273db07bdde1e7120dccf0 upstream.
Change interrupt cells to 2 to suppress interrupts_property warning.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
d7646d79ea |
arm64: dts: mediatek: mt8183-kukui-jacuzzi: fix dsi unnecessary cells properties
commit 74543b303a9abfe4fa253d1fa215281baa05ff3a upstream.
dtbs_check throws a warning at the dsi node:
Warning (avoid_unnecessary_addr_size): /soc/dsi@14014000: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property
Other DTS have a panel child node with a reg, so the parent dtsi
must have the address-cells and size-cells, however this specific DT
has the panel removed, but not the cells, hence the warning above.
If panel is deleted then the cells must also be deleted since they are
tied together, as the child node in this DT does not have a reg.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
6a6df679ac |
arm64: dts: mediatek: mt7622: fix memory node warning check
commit 8e6ecbfd44b5542a7598c1c5fc9c6dcb5d367f2a upstream.
dtbs_check throws a warning at the memory node:
Warning (unit_address_vs_reg): /memory: node has a reg or ranges property, but no unit name
Fix by adding the address to the node name.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
90dc20c8c5 |
arm64: dts: mt7986: fix emmc hs400 mode without uboot initialization
commit 8dfe51c3f6ef31502fca3e2da8cd250ebbe4b8f2 upstream.
Eric reports errors on emmc with hs400 mode when booting linux on bpi-r3
without uboot [1]. Booting with uboot does not show this because clocks
seem to be initialized by uboot.
Fix this by adding assigned-clocks and assigned-clock-parents like it's
done in uboot [2].
[1] https://forum.banana-pi.org/t/bpi-r3-kernel-fails-setting-emmc-clock-to-416m-depends-on-u-boot/15170
[2] https://github.com/u-boot/u-boot/blob/master/arch/arm/dts/mt7986.dtsi#L287
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
287b1c41de |
arm64: dts: mt7986: define 3W max power to both SFP on BPI-R3
commit 6413cbc17f89b3a160f3a6f3fad1232b1678fe40 upstream.
All SFP power supplies are connected to the system VDD33 which is 3v3/8A.
Set 3W per SFP slot to allow SFPs that need more power than the
default 1W to work.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
5012eb0280 |
arm64: dts: mt7986: change cooling trips
commit 1fcda8ceb014aafd56f10b33e0077c93b5dd45d1 upstream.
Add Critical and hot trips for emergency system shutdown and limiting
system load.
Change passive trip to active to make sure fan is activated on the
lowest trip.
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
8e1e489cdb |
drm/i915: Skip some timing checks on BXT/GLK DSI transcoders
commit 20c2dbff342aec13bf93c2f6c951da198916a455 upstream.
Apparently some BXT/GLK systems have DSI panels whose timings
don't agree with the normal cpu transcoder hblank>=32 limitation.
This is perhaps fine as there are no specific hblank/etc. limits
listed for the BXT/GLK DSI transcoders.
Move those checks out from the global intel_mode_valid() into
connector specific .mode_valid() hooks, skipping BXT/GLK
DSI connectors. We'll leave the basic [hv]display/[hv]total
checks in intel_mode_valid() as those seem like sensible upper
limits regardless of the transcoder used.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9720
Fixes:
|
||
|
|
a0396af35c |
drm/i915/mst: Reject modes that require the bigjoiner
commit dd7eb65c493615fda7d459501c3d4a46e00ea5ba upstream.
We have no bigjoiner support in the MST code, so .mode_valid()
pretending otherwise is just going to result in black screens for
users. Reject any mode that needs the joiner.
Cc: stable@vger.kernel.org
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Fixes:
|
||
|
|
654748c6fc |
drm/i915/mst: Fix .mode_valid_ctx() return values
commit 7cf82b25dd91d7f330d9df2de868caca14289ba1 upstream.
.mode_valid_ctx() returns an errno, not the mode status. Fix
the code to do the right thing.
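The distinction between the hook's return value and the mode status can be sketched like this. The enum, the function and the clock limits are hypothetical stand-ins, not the DRM definitions; the sketch assumes the status is reported through an out parameter while the return value carries only 0 or a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a mode status enum. */
enum sketch_mode_status {
	SKETCH_MODE_OK = 0,
	SKETCH_MODE_CLOCK_HIGH = 1,
};

/*
 * Return value: 0 if the check itself ran, negative errno otherwise.
 * Verdict on the mode: written to *status, never returned directly.
 */
int sketch_mode_valid_ctx(int clock_khz, int max_clock_khz,
			  enum sketch_mode_status *status)
{
	if (!status)
		return -EINVAL;

	*status = clock_khz > max_clock_khz ? SKETCH_MODE_CLOCK_HIGH
					    : SKETCH_MODE_OK;
	return 0;
}

void mode_valid_ctx_demo(void)
{
	enum sketch_mode_status status = SKETCH_MODE_OK;

	/* A rejected mode is still a successful call. */
	assert(sketch_mode_valid_ctx(600000, 300000, &status) == 0);
	assert(status == SKETCH_MODE_CLOCK_HIGH);

	/* A failed call is reported as a negative errno. */
	assert(sketch_mode_valid_ctx(600000, 300000, NULL) == -EINVAL);
}
```

Returning a status value where an errno is expected would hand the caller a small positive number, which reads as neither success nor a proper error.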
Cc: stable@vger.kernel.org
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Fixes:
|
||
|
|
02650b3b98 |
drm/atomic-helpers: Invoke end_fb_access while owning plane state
commit e0f04e41e8eedd4e5a1275f2318df7e1841855f2 upstream.
Invoke drm_plane_helper_funcs.end_fb_access before
drm_atomic_helper_commit_hw_done(). The latter function hands over
ownership of the plane state to the following commit, which might
free it. Releasing resources in end_fb_access then operates on undefined
state. This bug has been observed with non-blocking commits when they
are being queued up quickly.
Here is an example stack trace from the bug report. The plane state has
been free'd already, so the pages for drm_gem_fb_vunmap() are gone.
Unable to handle kernel paging request at virtual address 0000000100000049
[...]
drm_gem_fb_vunmap+0x18/0x74
drm_gem_end_shadow_fb_access+0x1c/0x2c
drm_atomic_helper_cleanup_planes+0x58/0xd8
drm_atomic_helper_commit_tail+0x90/0xa0
commit_tail+0x15c/0x188
commit_work+0x14/0x20
Fix this by running end_fb_access immediately after updating all planes
in drm_atomic_helper_commit_planes(). The existing clean-up helper
drm_atomic_helper_cleanup_planes() now only handles cleanup_fb.
For aborted commits, roll back from drm_atomic_helper_prepare_planes()
in the new helper drm_atomic_helper_unprepare_planes(). This case is
different from regular cleanup, as we have to release the new state;
regular cleanup releases the old state. The new helper also invokes
cleanup_fb for all planes.
The changes mostly involve DRM's atomic helpers. Only two drivers, i915
and nouveau, implement their own commit function. Update them to invoke
drm_atomic_helper_unprepare_planes(). Drivers with custom commit_tail
function do not require changes.
v4:
* fix documentation (kernel test robot)
v3:
* add drm_atomic_helper_unprepare_planes() for rolling back
* use correct state for end_fb_access
v2:
* fix test in drm_atomic_helper_cleanup_planes()
Reported-by: Alyssa Ross <hi@alyssa.is>
Closes: https://lore.kernel.org/dri-devel/87leazm0ya.fsf@alyssa.is/
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Fixes:
|
||
|
|
4ce431c297 |
md/raid6: use valid sector values to determine if an I/O should wait on the reshape
commit c467e97f079f0019870c314996fae952cc768e82 upstream.
During a reshape of a RAID6 array, such as expanding by adding an additional
disk, I/Os to the region of the array which has not yet been reshaped can
stall indefinitely. This is caused by errors in the stripe_ahead_of_reshape
function that make md think the I/O is to a region actively undergoing
the reshape.
stripe_ahead_of_reshape fails to account for the q disk having a sector
value of 0. By not excluding the q disk from the for loop, raid6 will always
generate a min_sector value of 0, causing a return value which stalls.
The function's max_sector calculation also uses min() when it should use
max(), causing the max_sector value to always be 0. During a backwards
rebuild this can cause the opposite problem where it allows I/O to advance
when it should wait.
Fixing these errors will allow safe I/O to advance in a timely manner and
delay only I/O which is unsafe due to stripes in the middle of undergoing
the reshape.
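The two errors described above can be illustrated with a small user-space sketch that computes the sector range over the data disks only. The struct, the function name and the sector values are hypothetical; this is not the md code, and the parity slots are assumed to hold 0 as the message states for the q disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sector_range {
	uint64_t min_sector;
	uint64_t max_sector;
};

/* Range of sectors over data disks, skipping both parity slots. */
struct sector_range data_sector_range(const uint64_t *sectors, size_t disks,
				      size_t pd_idx, size_t qd_idx)
{
	struct sector_range range = { UINT64_MAX, 0 };
	size_t i;

	for (i = 0; i < disks; i++) {
		if (i == pd_idx || i == qd_idx)
			continue;
		if (sectors[i] < range.min_sector)
			range.min_sector = sectors[i];
		/* Must be a max: taking the min here would pin it at 0. */
		if (sectors[i] > range.max_sector)
			range.max_sector = sectors[i];
	}
	return range;
}

void data_sector_range_demo(void)
{
	/* Three data disks followed by p and q, whose slots hold 0. */
	const uint64_t sectors[] = { 1024, 1032, 1040, 0, 0 };
	struct sector_range ok = data_sector_range(sectors, 5, 3, 4);
	/* Index 5 matches no disk, so q is no longer skipped. */
	struct sector_range bad = data_sector_range(sectors, 5, 3, 5);

	assert(ok.min_sector == 1024 && ok.max_sector == 1040);
	/* q's 0 leaks into the minimum when it is not excluded. */
	assert(bad.min_sector == 0);
}
```

Starting the minimum at the largest representable value and the maximum at 0 only works if every comparison moves in the right direction, which is why the min()/max() mix-up went unnoticed until the range was actually used.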
Fixes:
|
||
|
|
aa581b37da |
powercap: DTPM: Fix missing cpufreq_cpu_put() calls
commit bdefd9913bdd453991ef756b6f7176e8ad80d786 upstream.
The policy returned by cpufreq_cpu_get() has to be released with the
help of cpufreq_cpu_put() to balance its kobject reference counter
properly.
Add the missing calls to cpufreq_cpu_put() in the code.
Fixes:
|
||
|
|
9e5d309674 |
mm/memory_hotplug: fix error handling in add_memory_resource()
commit f42ce5f087eb69e47294ababd2e7e6f88a82d308 upstream.
In add_memory_resource(), creation of memory block devices occurs after
successful call to arch_add_memory(). However, creation of memory block
devices could fail. In that case, arch_remove_memory() is called to
perform necessary cleanup.
Currently with or without altmap support, arch_remove_memory() is always
passed with altmap set to NULL during error handling. This leads to
freeing of struct pages using free_pages(), even though the allocation
might have been performed with altmap support via
altmap_alloc_block_buf().
Fix the error handling by passing altmap in arch_remove_memory(). This
ensures the following:
* When altmap is disabled, deallocation of the struct pages array occurs
via free_pages().
* When altmap is enabled, deallocation occurs via vmem_altmap_free().
Link: https://lkml.kernel.org/r/20231120145354.308999-3-sumanthk@linux.ibm.com
Fixes:
|
||
|
|
799f90c385 |
mm: fix oops when filemap_map_pmd() without prealloc_pte
commit 9aa1345d66b8132745ffb99b348b1492088da9e2 upstream.
syzbot reports oops in lockdep's __lock_acquire(), called from
__pte_offset_map_lock() called from filemap_map_pages(); or when I run the
repro, the oops comes in pmd_install(), called from filemap_map_pmd()
called from filemap_map_pages(), just before the __pte_offset_map_lock().
The problem is that filemap_map_pmd() has been assuming that when it finds
pmd_none(), a page table has already been prepared in prealloc_pte; and
indeed do_fault_around() has been careful to preallocate one there, when
it finds pmd_none(): but what if *pmd became none in between?
My 6.6 mods in mm/khugepaged.c, avoiding mmap_lock for write, have made it
easy for *pmd to be cleared while servicing a page fault; but even before
those, a huge *pmd might be zapped while a fault is serviced.
The difference in symptomatic stack traces comes from the "memory model"
in use: pmd_install() uses pmd_populate() uses page_to_pfn(): in some
models that is strict, and will oops on the NULL prealloc_pte; in other
models, it will construct a bogus value to be populated into *pmd, then
__pte_offset_map_lock() oops when trying to access split ptlock pointer
(or some other symptom in normal case of ptlock embedded not pointer).
Link: https://lore.kernel.org/linux-mm/20231115065506.19780-1-jose.pekkarinen@foxhound.fi/
Link: https://lkml.kernel.org/r/6ed0c50c-78ef-0719-b3c5-60c0c010431c@google.com
Fixes:
|
||
|
|
e0270ffad4 |
mm/memory_hotplug: add missing mem_hotplug_lock
commit 001002e73712cdf6b8d9a103648cda3040ad7647 upstream.
From Documentation/core-api/memory-hotplug.rst:
When adding/removing/onlining/offlining memory or adding/removing
heterogeneous/device memory, we should always hold the mem_hotplug_lock
in write mode to serialise memory hotplug (e.g. access to global/zone
variables).
mhp_(de)init_memmap_on_memory() functions can change zone stats and
struct page content, but they are currently called w/o the
mem_hotplug_lock.
When memory block is being offlined and when kmemleak goes through each
populated zone, the following theoretical race conditions could occur:
CPU 0: | CPU 1:
memory_offline() |
-> offline_pages() |
-> mem_hotplug_begin() |
... |
-> mem_hotplug_done() |
| kmemleak_scan()
| -> get_online_mems()
| ...
-> mhp_deinit_memmap_on_memory() |
[not protected by mem_hotplug_begin/done()]|
Marks memory section as offline, | Retrieves zone_start_pfn
poisons vmemmap struct pages and updates | and struct page members.
the zone related data |
| ...
| -> put_online_mems()
Fix this by ensuring mem_hotplug_lock is taken before performing
mhp_init_memmap_on_memory(). Also ensure that
mhp_deinit_memmap_on_memory() holds the lock.
online/offline_pages() are currently only called from
memory_block_online/offline(), so it is safe to move the locking there.
Link: https://lkml.kernel.org/r/20231120145354.308999-2-sumanthk@linux.ibm.com
Fixes:
|
||
|
|
83dd18e0b7 |
drivers/base/cpu: crash data showing should depends on KEXEC_CORE
commit 4e9e2e4c65136dfd32dd0afe555961433d1cf906 upstream. After commit |
||
|
|
512b420aaf |
hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write
commit 187da0f8250aa94bd96266096aef6f694e0b4cd2 upstream.
The routine __vma_private_lock tests for the existence of a reserve map
associated with a private hugetlb mapping. A pointer to the reserve map
is in vma->vm_private_data. __vma_private_lock was checking the pointer
for NULL. However, it is possible that the low bits of the pointer could
be used as flags. In such instances, vm_private_data is not NULL and not
a valid pointer. This results in the null-ptr-deref reported by syzbot:
general protection fault, probably for non-canonical address 0xdffffc000000001d:
0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef]
CPU: 0 PID: 5048 Comm: syz-executor139 Not tainted 6.6.0-rc7-syzkaller-00142-g88
8cf78c29e2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 1
0/09/2023
RIP: 0010:__lock_acquire+0x109/0x5de0 kernel/locking/lockdep.c:5004
...
Call Trace:
<TASK>
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
down_write+0x93/0x200 kernel/locking/rwsem.c:1573
hugetlb_vma_lock_write mm/hugetlb.c:300 [inline]
hugetlb_vma_lock_write+0xae/0x100 mm/hugetlb.c:291
__hugetlb_zap_begin+0x1e9/0x2b0 mm/hugetlb.c:5447
hugetlb_zap_begin include/linux/hugetlb.h:258 [inline]
unmap_vmas+0x2f4/0x470 mm/memory.c:1733
exit_mmap+0x1ad/0xa60 mm/mmap.c:3230
__mmput+0x12a/0x4d0 kernel/fork.c:1349
mmput+0x62/0x70 kernel/fork.c:1371
exit_mm kernel/exit.c:567 [inline]
do_exit+0x9ad/0x2a20 kernel/exit.c:861
__do_sys_exit kernel/exit.c:991 [inline]
__se_sys_exit kernel/exit.c:989 [inline]
__x64_sys_exit+0x42/0x50 kernel/exit.c:989
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Mask off low bit flags before checking for NULL pointer. In addition, the
reserve map only 'belongs' to the OWNER (parent in parent/child
relationships) so also check for the OWNER flag.
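The check described above can be sketched in plain C as follows. This is an illustration only: the flag names and `has_owned_resv_map()` are hypothetical, and the sketch assumes the two lowest bits of the stored value are used as flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits stored in the low bits of the pointer value. */
#define RESV_OWNER_FLAG    1UL
#define RESV_UNMAPPED_FLAG 2UL
#define RESV_FLAG_MASK     (RESV_OWNER_FLAG | RESV_UNMAPPED_FLAG)

/*
 * Return true only if the value holds a non-NULL pointer once the
 * flag bits are masked off, and the OWNER flag is set.
 * A plain "private_data != 0" test would wrongly accept a value
 * that consists of flag bits only.
 */
static bool has_owned_resv_map(uintptr_t private_data)
{
    uintptr_t ptr = private_data & ~(uintptr_t)RESV_FLAG_MASK;

    return ptr != 0 && (private_data & RESV_OWNER_FLAG) != 0;
}
```

A value holding only flag bits is non-zero yet is not a usable pointer, which is the case the naive NULL test misses.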
Link: https://lkml.kernel.org/r/20231114012033.259600-1-mike.kravetz@oracle.com
Reported-by: syzbot+6ada951e7c0f7bc8a71e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/00000000000078d1e00608d7878b@google.com/
Fixes:
|
||
|
|
b2c562a7a8 |
workqueue: Make sure that wq_unbound_cpumask is never empty
commit 4a6c5607d4502ccd1b15b57d57f17d12b6f257a7 upstream. During boot, depending on how the housekeeping and workqueue.unbound_cpus masks are set, wq_unbound_cpumask can end up empty. Since |
||
|
|
7409c28cab |
platform/surface: aggregator: fix recv_buf() return value
commit c8820c92caf0770bec976b01fa9e82bb993c5865 upstream.
The serdev recv_buf() callback is supposed to return the number of bytes
consumed, that is, an int between 0 and count.
Do not return a negative number on error: when
ssam_controller_receive_buf() returns ESHUTDOWN, just return 0, i.e. no
bytes consumed. This keeps exactly the same behavior as before.
This fixes a potential WARN in serdev-ttyport.c:ttyport_receive_buf().
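A minimal sketch of that return-value contract, under stated assumptions: `controller_receive()` and `recv_buf_cb()` are hypothetical stand-ins, not the driver's functions, and the shutdown condition is modelled by a plain flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical receiver: fails with -ESHUTDOWN once shut down. */
static int controller_receive(int shut_down, const unsigned char *buf, size_t n)
{
    (void)buf;
    if (shut_down)
        return -ESHUTDOWN;
    return (int)n; /* all bytes consumed */
}

/*
 * Callback in the style of recv_buf(): the result must lie between
 * 0 and n, so an error from the receiver is reported as
 * "0 bytes consumed" instead of a negative number.
 */
static int recv_buf_cb(int shut_down, const unsigned char *buf, size_t n)
{
    int ret = controller_receive(shut_down, buf, n);

    return ret < 0 ? 0 : ret;
}
```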
Fixes:
|
||
|
|
78c8fc3332 |
regmap: fix bogus error on regcache_sync success
commit fea88064445a59584460f7f67d102b6e5fc1ca1d upstream.
Since commit 0ec7731655de ("regmap: Ensure range selector registers
are updated after cache sync") opening pcm512x-based soundcards fails
with EINVAL and dmesg shows sync cache and pm_runtime_get errors:
[ 228.794676] pcm512x 1-004c: Failed to sync cache: -22
[ 228.794740] pcm512x 1-004c: ASoC: error at snd_soc_pcm_component_pm_runtime_get on pcm512x.1-004c: -22
This is caused by the cache check result leaking out into the
regcache_sync return value.
Fix this by making the check local-only, as the comment above the
regcache_read call states a non-zero return value means there's
nothing to do so the return value should not be altered.
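The pattern can be sketched outside the kernel like this. None of the names below are regmap functions: `cache_read()`, `sync_leaky()` and `sync_fixed()` are hypothetical, and the cache miss is assumed to yield -EINVAL to match the -22 in the log above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical cache lookup: 0 on a hit, -EINVAL when nothing is cached. */
static int cache_read(int cached, unsigned int *val)
{
    if (!cached)
        return -EINVAL;
    *val = 42;
    return 0;
}

/* Leaky variant: the lookup result overwrites the sync result. */
static int sync_leaky(int cached, unsigned int *selector_out)
{
    unsigned int val;
    int ret = 0; /* result of the (already successful) sync */

    ret = cache_read(cached, &val);
    if (ret == 0)
        *selector_out = val;
    return ret; /* a cache miss is reported as a sync failure */
}

/* Fixed variant: the lookup result stays local to the check. */
static int sync_fixed(int cached, unsigned int *selector_out)
{
    unsigned int val;
    int ret = 0;

    if (cache_read(cached, &val) == 0)
        *selector_out = val;
    return ret; /* unaffected by the cache miss */
}
```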
Fixes: 0ec7731655de ("regmap: Ensure range selector registers are updated after cache sync")
Cc: stable@vger.kernel.org
Signed-off-by: Matthias Reichl <hias@horus.com>
Link: https://lore.kernel.org/r/20231203222216.96547-1-hias@horus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
2e04cfdd3e |
r8169: fix rtl8125b PAUSE frames blasting when suspended
commit 4b0768b6556af56ee9b7cf4e68452a2b6289ae45 upstream.
When the FIFO reaches the near-full state, the device will issue a pause frame.
If the pause slot is enabled (set to 1), the device will then issue
a pause frame only once. But if the pause slot is disabled (set to 0), the device
will keep sending pause frames until the FIFO reaches the near-empty state.
When the pause slot is disabled and there is no one to handle received
packets, the device FIFO will reach the near-full state and keep sending
pause frames. That will impact the entire local area network.
This issue can be reproduced in Chromebox (not Chromebook) in
developer mode running a test image (and v5.10 kernel):
1) ping -f $CHROMEBOX (from workstation on same local network)
2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
3) ping $ROUTER (wait until ping fails from workstation)
It takes about 20-30 seconds after step 2 for the local network to
stop working.
Fix this issue by enabling pause slot to only send pause frame once
when FIFO reaches near full state.
Fixes:
|
||
|
|
865b71579d |
packet: Move reference count in packet_sock to atomic_long_t
commit db3fadacaf0c817b222090290d06ca2a338422d0 upstream.
In some potential instances the reference count on struct packet_sock
could be saturated and cause overflows which gets the kernel a bit
confused. To prevent this, move to a 64-bit atomic reference count on
64-bit architectures to prevent the possibility of this type to
overflow.
Because we can not handle saturation, using refcount_t is not possible
in this place. Maybe someday in the future if it changes it could be
used. Also, instead of using plain atomic64_t, use atomic_long_t instead.
32-bit machines tend to be memory-limited (i.e. anything that increases
a reference uses so much memory that you can't actually get to 2**32
references). 32-bit architectures also tend to have serious problems
with 64-bit atomics. Hence, atomic_long_t is the more natural solution.
Reported-by: "The UK's National Cyber Security Centre (NCSC)" <security@ncsc.gov.uk>
Co-developed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231201131021.19999-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
9a89aad086 |
nfp: flower: fix for take a mutex lock in soft irq context and rcu lock
commit 0ad722bd9ee3a9bdfca9613148645e4c9b7f26cf upstream.
The neighbour event callback calls the function nfp_tun_write_neigh, which
takes a mutex lock while running in soft irq context. Change it to use a
work queue to process the neighbour event.
Move the nfp_tun_write_neigh call out of the rcu_read_lock/unlock() range
in the functions nfp_tunnel_request_route_v4 and nfp_tunnel_request_route_v6.
Fixes:
|
||
|
|
3c0adff939 |
leds: trigger: netdev: fix RTNL handling to prevent potential deadlock
commit fe2b1226656afae56702d1d84c6900f6b67df297 upstream.
When working on LED support for r8169 I got the following lockdep
warning. Easiest way to prevent this scenario seems to be to take
the RTNL lock before the trigger_data lock in set_device_name().
======================================================
WARNING: possible circular locking dependency detected
6.7.0-rc2-next-20231124+ #2 Not tainted
------------------------------------------------------
bash/383 is trying to acquire lock:
ffff888103aa1c68 (&trigger_data->lock){+.+.}-{3:3}, at: netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
but task is already holding lock:
ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (rtnl_mutex){+.+.}-{3:3}:
__mutex_lock+0x9b/0xb50
mutex_lock_nested+0x16/0x20
rtnl_lock+0x12/0x20
set_device_name+0xa9/0x120 [ledtrig_netdev]
netdev_trig_activate+0x1a1/0x230 [ledtrig_netdev]
led_trigger_set+0x172/0x2c0
led_trigger_write+0xf1/0x140
sysfs_kf_bin_write+0x5d/0x80
kernfs_fop_write_iter+0x15d/0x210
vfs_write+0x1f0/0x510
ksys_write+0x6c/0xf0
__x64_sys_write+0x14/0x20
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6c/0x74
-> #0 (&trigger_data->lock){+.+.}-{3:3}:
__lock_acquire+0x1459/0x25a0
lock_acquire+0xc8/0x2d0
__mutex_lock+0x9b/0xb50
mutex_lock_nested+0x16/0x20
netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
call_netdevice_register_net_notifiers+0x5a/0x100
register_netdevice_notifier+0x85/0x120
netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
led_trigger_set+0x172/0x2c0
led_trigger_write+0xf1/0x140
sysfs_kf_bin_write+0x5d/0x80
kernfs_fop_write_iter+0x15d/0x210
vfs_write+0x1f0/0x510
ksys_write+0x6c/0xf0
__x64_sys_write+0x14/0x20
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6c/0x74
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(rtnl_mutex);
lock(&trigger_data->lock);
lock(rtnl_mutex);
lock(&trigger_data->lock);
*** DEADLOCK ***
8 locks held by bash/383:
#0: ffff888103ff33f0 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0
#1: ffff888103aa1e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210
#2: ffff8881036f1890 (kn->active#82){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210
#3: ffff888108e2c358 (&led_cdev->led_access){+.+.}-{3:3}, at: led_trigger_write+0x30/0x140
#4: ffffffff8cdd9e10 (triggers_list_lock){++++}-{3:3}, at: led_trigger_write+0x75/0x140
#5: ffff888108e2c270 (&led_cdev->trigger_lock){++++}-{3:3}, at: led_trigger_write+0xe3/0x140
#6: ffffffff8cdde3d0 (pernet_ops_rwsem){++++}-{3:3}, at: register_netdevice_notifier+0x1c/0x120
#7: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
stack backtrace:
CPU: 0 PID: 383 Comm: bash Not tainted 6.7.0-rc2-next-20231124+ #2
Hardware name: Default string Default string/Default string, BIOS ADLN.M6.SODIMM.ZB.CY.015 08/08/2023
Call Trace:
<TASK>
dump_stack_lvl+0x5c/0xd0
dump_stack+0x10/0x20
print_circular_bug+0x2dd/0x410
check_noncircular+0x131/0x150
__lock_acquire+0x1459/0x25a0
lock_acquire+0xc8/0x2d0
? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
__mutex_lock+0x9b/0xb50
? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
? __this_cpu_preempt_check+0x13/0x20
? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
? __cancel_work_timer+0x11c/0x1b0
? __mutex_lock+0x123/0xb50
mutex_lock_nested+0x16/0x20
? mutex_lock_nested+0x16/0x20
netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
call_netdevice_register_net_notifiers+0x5a/0x100
register_netdevice_notifier+0x85/0x120
netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
led_trigger_set+0x172/0x2c0
? preempt_count_add+0x49/0xc0
led_trigger_write+0xf1/0x140
sysfs_kf_bin_write+0x5d/0x80
kernfs_fop_write_iter+0x15d/0x210
vfs_write+0x1f0/0x510
ksys_write+0x6c/0xf0
__x64_sys_write+0x14/0x20
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6c/0x74
RIP: 0033:0x7f269055d034
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
RSP: 002b:00007ffddb7ef748 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f269055d034
RDX: 0000000000000007 RSI: 000055bf5f4af3c0 RDI: 0000000000000001
RBP: 000055bf5f4af3c0 R08: 0000000000000073 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000007
R13: 00007f26906325c0 R14: 00007f269062ff20 R15: 0000000000000000
</TASK>
Fixes:
|
||
|
|
7d97646474 |
tracing: Fix a possible race when disabling buffered events
commit c0591b1cccf708a47bc465c62436d669a4213323 upstream.
Function trace_buffered_event_disable() is responsible for freeing pages
backing buffered events and this process can run concurrently with
trace_event_buffer_lock_reserve().
The following race is currently possible:
* Function trace_buffered_event_disable() is called on CPU 0. It
increments trace_buffered_event_cnt on each CPU and waits via
synchronize_rcu() for each user of trace_buffered_event to complete.
* After synchronize_rcu() is finished, function
trace_buffered_event_disable() has the exclusive access to
trace_buffered_event. All counters trace_buffered_event_cnt are at 1
and all pointers trace_buffered_event are still valid.
* At this point, on a different CPU 1, the execution reaches
trace_event_buffer_lock_reserve(). The function calls
preempt_disable_notrace() and only now enters an RCU read-side
critical section. The function proceeds and reads a still valid
pointer from trace_buffered_event[CPU1] into the local variable
"entry". However, it doesn't yet read trace_buffered_event_cnt[CPU1]
which happens later.
* Function trace_buffered_event_disable() continues. It frees
trace_buffered_event[CPU1] and decrements
trace_buffered_event_cnt[CPU1] back to 0.
* Function trace_event_buffer_lock_reserve() continues. It reads and
increments trace_buffered_event_cnt[CPU1] from 0 to 1. This makes it
believe that it can use the "entry" that it already obtained but the
pointer is now invalid and any access results in a use-after-free.
Fix the problem by making a second synchronize_rcu() call after all
trace_buffered_event values are set to NULL. This waits on all potential
users in trace_event_buffer_lock_reserve() that still read a previous
pointer from trace_buffered_event.
Link: https://lore.kernel.org/all/20231127151248.7232-2-petr.pavlu@suse.com/
Link: https://lkml.kernel.org/r/20231205161736.19663-4-petr.pavlu@suse.com
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
fc9fa702db |
tracing: Fix incomplete locking when disabling buffered events
commit 7fed14f7ac9cf5e38c693836fe4a874720141845 upstream.
The following warning appears when using buffered events:
[ 203.556451] WARNING: CPU: 53 PID: 10220 at kernel/trace/ring_buffer.c:3912 ring_buffer_discard_commit+0x2eb/0x420
[...]
[ 203.670690] CPU: 53 PID: 10220 Comm: stress-ng-sysin Tainted: G E 6.7.0-rc2-default #4 56e6d0fcf5581e6e51eaaecbdaec2a2338c80f3a
[ 203.670704] Hardware name: Intel Corp. GROVEPORT/GROVEPORT, BIOS GVPRCRB1.86B.0016.D04.1705030402 05/03/2017
[ 203.670709] RIP: 0010:ring_buffer_discard_commit+0x2eb/0x420
[ 203.735721] Code: 4c 8b 4a 50 48 8b 42 48 49 39 c1 0f 84 b3 00 00 00 49 83 e8 01 75 b1 48 8b 42 10 f0 ff 40 08 0f 0b e9 fc fe ff ff f0 ff 47 08 <0f> 0b e9 77 fd ff ff 48 8b 42 10 f0 ff 40 08 0f 0b e9 f5 fe ff ff
[ 203.735734] RSP: 0018:ffffb4ae4f7b7d80 EFLAGS: 00010202
[ 203.735745] RAX: 0000000000000000 RBX: ffffb4ae4f7b7de0 RCX: ffff8ac10662c000
[ 203.735754] RDX: ffff8ac0c750be00 RSI: ffff8ac10662c000 RDI: ffff8ac0c004d400
[ 203.781832] RBP: ffff8ac0c039cea0 R08: 0000000000000000 R09: 0000000000000000
[ 203.781839] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 203.781842] R13: ffff8ac10662c000 R14: ffff8ac0c004d400 R15: ffff8ac10662c008
[ 203.781846] FS: 00007f4cd8a67740(0000) GS:ffff8ad798880000(0000) knlGS:0000000000000000
[ 203.781851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 203.781855] CR2: 0000559766a74028 CR3: 00000001804c4000 CR4: 00000000001506f0
[ 203.781862] Call Trace:
[ 203.781870] <TASK>
[ 203.851949] trace_event_buffer_commit+0x1ea/0x250
[ 203.851967] trace_event_raw_event_sys_enter+0x83/0xe0
[ 203.851983] syscall_trace_enter.isra.0+0x182/0x1a0
[ 203.851990] do_syscall_64+0x3a/0xe0
[ 203.852075] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 203.852090] RIP: 0033:0x7f4cd870fa77
[ 203.982920] Code: 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 43 0e 00 f7 d8 64 89 01 48
[ 203.982932] RSP: 002b:00007fff99717dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000089
[ 203.982942] RAX: ffffffffffffffda RBX: 0000558ea1d7b6f0 RCX: 00007f4cd870fa77
[ 203.982948] RDX: 0000000000000000 RSI: 00007fff99717de0 RDI: 0000558ea1d7b6f0
[ 203.982957] RBP: 00007fff99717de0 R08: 00007fff997180e0 R09: 00007fff997180e0
[ 203.982962] R10: 00007fff997180e0 R11: 0000000000000246 R12: 00007fff99717f40
[ 204.049239] R13: 00007fff99718590 R14: 0000558e9f2127a8 R15: 00007fff997180b0
[ 204.049256] </TASK>
For instance, it can be triggered by running these two commands in
parallel:
$ while true; do echo hist:key=id.syscall:val=hitcount > \
/sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger; done
$ stress-ng --sysinfo $(nproc)
The warning indicates that the current ring_buffer_per_cpu is not in the
committing state. It happens because the active ring_buffer_event
doesn't actually come from the ring_buffer_per_cpu but is allocated from
trace_buffered_event.
The bug is in function trace_buffered_event_disable() where the
following normally happens:
* The code invokes disable_trace_buffered_event() via
smp_call_function_many() and follows it by synchronize_rcu(). This
increments the per-CPU variable trace_buffered_event_cnt on each
target CPU and grants trace_buffered_event_disable() the exclusive
access to the per-CPU variable trace_buffered_event.
* Maintenance is performed on trace_buffered_event, all per-CPU event
buffers get freed.
* The code invokes enable_trace_buffered_event() via
smp_call_function_many(). This decrements trace_buffered_event_cnt and
releases the access to trace_buffered_event.
A problem is that smp_call_function_many() runs a given function on all
target CPUs except on the current one. The following can then occur:
* Task X executing trace_buffered_event_disable() runs on CPU 0.
* The control reaches synchronize_rcu() and the task gets rescheduled on
another CPU 1.
* The RCU synchronization finishes. At this point,
trace_buffered_event_disable() has the exclusive access to all
trace_buffered_event variables except trace_buffered_event[CPU0]
because trace_buffered_event_cnt[CPU0] is never incremented and if the
buffer is currently unused, remains set to 0.
* A different task Y is scheduled on CPU 0 and hits a trace event. The
code in trace_event_buffer_lock_reserve() sees that
trace_buffered_event_cnt[CPU0] is set to 0 and decides to use the
buffer provided by trace_buffered_event[CPU0].
* Task X continues its execution in trace_buffered_event_disable(). The
code incorrectly frees the event buffer pointed to by
trace_buffered_event[CPU0] and resets the variable to NULL.
* Task Y writes event data to the now freed buffer and later detects the
created inconsistency.
The issue is observable since commit
|
||
|
|
0486a1f9d9 |
tracing: Disable snapshot buffer when stopping instance tracers
commit b538bf7d0ec11ca49f536dfda742a5f6db90a798 upstream.
It used to be that only the top level instance had a snapshot buffer (for
latency tracers like wakeup and irqsoff). Stopping a tracer in an
instance would not disable the snapshot buffer. This could have some
unintended consequences if the irqsoff tracer is enabled.
Consolidate the tracing_start/stop() with tracing_start/stop_tr() so that
all instances behave the same. The tracing_start/stop() functions will
just call their respective tracing_start/stop_tr() with the global_array
passed in.
Link: https://lkml.kernel.org/r/20231205220011.041220035@goodmis.org
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes:
|
||
|
|
12c48e88e5 |
tracing: Stop current tracer when resizing buffer
commit d78ab792705c7be1b91243b2544d1a79406a2ad7 upstream.
When the ring buffer is being resized, it can cause side effects to the
running tracer. For instance, there's a race with irqsoff tracer that
swaps individual per cpu buffers between the main buffer and the snapshot
buffer. The resize operation modifies the main buffer and then the
snapshot buffer. If a swap happens in between those two operations it will
break the tracer.
Simply stop the running tracer before resizing the buffers and enable it
again when finished.
Link: https://lkml.kernel.org/r/20231205220010.748996423@goodmis.org
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes:
|
||
|
|
1741e17c39 |
tracing: Always update snapshot buffer size
commit 7be76461f302ec05cbd62b90b2a05c64299ca01f upstream.
It used to be that only the top level instance had a snapshot buffer (for
latency tracers like wakeup and irqsoff). The update of the ring buffer
size would check if the instance was the top level and if so, it would
also update the snapshot buffer as it needs to be the same as the main
buffer.
Now that lower level instances also have a snapshot buffer, they too need
to update their snapshot buffer sizes when the main buffer is changed,
otherwise the following can be triggered:
# cd /sys/kernel/tracing
# echo 1500 > buffer_size_kb
# mkdir instances/foo
# echo irqsoff > instances/foo/current_tracer
# echo 1000 > instances/foo/buffer_size_kb
Produces:
WARNING: CPU: 2 PID: 856 at kernel/trace/trace.c:1938 update_max_tr_single.part.0+0x27d/0x320
Which is:
ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu);
if (ret == -EBUSY) {
[..]
}
WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY); <== here
That's because ring_buffer_swap_cpu() has:
int ret = -EINVAL;
[..]
/* At least make sure the two buffers are somewhat the same */
if (cpu_buffer_a->nr_pages != cpu_buffer_b->nr_pages)
goto out;
[..]
out:
return ret;
}
Instead, update all instances' snapshot buffer sizes when their main
buffer size is updated.
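The failure mode and the fix can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the tracing code: `struct buf`, `struct instance`, `swap_cpu()` and `resize_instance()` are hypothetical names, and a buffer is reduced to its page count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: a buffer is described only by its page count. */
struct buf {
    size_t nr_pages;
};

struct instance {
    struct buf main_buf;
    struct buf snapshot;
};

/* Swapping is refused when the two buffers differ in size. */
static int swap_cpu(struct buf *a, struct buf *b)
{
    struct buf tmp;

    if (a->nr_pages != b->nr_pages)
        return -EINVAL;
    tmp = *a;
    *a = *b;
    *b = tmp;
    return 0;
}

/*
 * Resize an instance: the snapshot buffer follows the main buffer
 * for every instance, not only for the top level one.
 */
static void resize_instance(struct instance *tr, size_t nr_pages)
{
    tr->main_buf.nr_pages = nr_pages;
    tr->snapshot.nr_pages = nr_pages;
}
```

Resizing only the main buffer leaves the two sizes out of step, so the next swap returns -EINVAL, which is the value that trips the WARN_ON_ONCE() quoted above.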
Link: https://lkml.kernel.org/r/20231205220010.454662151@goodmis.org
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes:
|
||
|
|
f8f32f9126 |
checkstack: fix printed address
commit ee34db3f271cea4d4252048617919c2caafe698b upstream. All addresses printed by checkstack have an extra incorrect 0 appended at the end. This was introduced with commit |
||
|
|
9ec2d92673 |
cgroup_freezer: cgroup_freezing: Check if not frozen
commit cff5f49d433fcd0063c8be7dd08fa5bf190c6c37 upstream.
__thaw_task() was recently updated to warn if the task being thawed was
part of a freezer cgroup that is still currently freezing:
void __thaw_task(struct task_struct *p)
{
...
if (WARN_ON_ONCE(freezing(p)))
goto unlock;
This has exposed a bug in cgroup1 freezing where when CGROUP_FROZEN is
asserted, the CGROUP_FREEZING bits are not also cleared at the same
time. Meaning, when a cgroup is marked FROZEN it continues to be marked
FREEZING as well. This causes the WARNING to trigger, because
cgroup_freezing() thinks the cgroup is still freezing.
There are two ways to fix this:
1. Whenever FROZEN is set, clear FREEZING for the cgroup and all
children cgroups.
2. Update cgroup_freezing() to also verify that FROZEN is not set.
This patch implements option (2), since it's smaller and more
straightforward.
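Option (2) amounts to one extra condition in the state check. The sketch below assumes hypothetical flag names and values (`SK_FREEZING_SELF`, `SK_FREEZING_PARENT`, `SK_FROZEN`); it only illustrates the logic and is not the cgroup freezer code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state bits; the values are chosen for illustration. */
enum {
    SK_FREEZING_SELF   = 1 << 1,
    SK_FREEZING_PARENT = 1 << 2,
    SK_FROZEN          = 1 << 3,
    SK_FREEZING        = SK_FREEZING_SELF | SK_FREEZING_PARENT,
};

/* Old check: still reports "freezing" after FROZEN has been set. */
static bool freezing_old(unsigned int state)
{
    return (state & SK_FREEZING) != 0;
}

/* Check with option (2): a frozen group is no longer freezing. */
static bool freezing_new(unsigned int state)
{
    return (state & SK_FREEZING) != 0 && (state & SK_FROZEN) == 0;
}
```

Because the FREEZING bits stay set once FROZEN is asserted, only the second check reports that a fully frozen group has stopped freezing.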
Signed-off-by: Tim Van Patten <timvp@google.com>
Tested-by: Mark Hasemeyer <markhas@chromium.org>
Fixes:
|
||
|
|
39f603a262 |
lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
commit 0263f92fadbb9d294d5971ac57743f882c93b2b3 upstream.
group_cpus_evenly() could be part of a storage driver's error handler,
such as the nvme driver's, which may run during CPU hotplug, in which the
storage queue has to drain its pending IOs because all CPUs associated
with the queue are offline and the queue is becoming inactive. And
handling IO needs the error handler to provide forward progress.
Then a deadlock is caused:
1) inside the CPU hotplug handler, the CPU hotplug lock is held, and blk-mq's
handler is waiting for inflight IO
2) the error handler is waiting for the CPU hotplug lock
3) inflight IO can't be completed in blk-mq's CPU hotplug handler
because error handling can't provide forward progress.
Solve the deadlock by not holding the CPU hotplug lock in
group_cpus_evenly(), in which two stage spreads are taken: 1) the 1st
stage is over all present CPUs; 2) the end stage is over all other CPUs.
It turns out the two stage spread just needs a consistent 'cpu_present_mask',
so remove the CPU hotplug lock by storing the mask in a local cache. This
doesn't change correctness, because all CPUs are still covered.
Link: https://lkml.kernel.org/r/20231120083559.285174-1-ming.lei@redhat.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Tested-by: Guangwu Zhang <guazhang@redhat.com>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|