linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-09 20:32:04 +09:00

Author	SHA1	Message	Date
Hengqi Chen	4b117370d1	LoongArch: BPF: Don't sign extend function return value [ Upstream commit 5d47ec2e6f4c64e30e392cfe9532df98c9beb106 ] The `cls_redirect` test triggers a kernel panic like: # ./test_progs -t cls_redirect Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. [ 30.938489] CPU 3 Unable to handle kernel paging request at virtual address fffffffffd814de0, era == ffff800002009fb8, ra == ffff800002009f9c [ 30.939331] Oops[#1]: [ 30.939513] CPU: 3 PID: 1260 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6 [ 30.939732] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 30.939901] pc ffff800002009fb8 ra ffff800002009f9c tp 9000000104da4000 sp 9000000104da7ab0 [ 30.940038] a0 fffffffffd814de0 a1 9000000104da7a68 a2 0000000000000000 a3 9000000104da7c10 [ 30.940183] a4 9000000104da7c14 a5 0000000000000002 a6 0000000000000021 a7 00005555904d7f90 [ 30.940321] t0 0000000000000110 t1 0000000000000000 t2 fffffffffd814de0 t3 0004c4b400000000 [ 30.940456] t4 ffffffffffffffff t5 00000000c3f63600 t6 0000000000000000 t7 0000000000000000 [ 30.940590] t8 000000000006d803 u0 0000000000000020 s9 9000000104da7b10 s0 900000010504c200 [ 30.940727] s1 fffffffffd814de0 s2 900000010504c200 s3 9000000104da7c10 s4 9000000104da7ad0 [ 30.940866] s5 0000000000000000 s6 90000000030e65bc s7 9000000104da7b44 s8 90000000044f6fc0 [ 30.941015] ra: ffff800002009f9c bpf_prog_846803e5ae81417f_cls_redirect+0xa0/0x590 [ 30.941535] ERA: ffff800002009fb8 bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590 [ 30.941696] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 30.942224] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 30.942330] EUEN: 00000003 (+FPE +SXE -ASXE -BTE) [ 30.942453] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) [ 30.942612] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 30.942764] BADV: fffffffffd814de0 [ 30.942854] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 30.942974] Modules linked in: [ 30.943078] Process test_progs (pid: 1260, threadinfo=00000000ce303226, task=000000007d10bb76) [ 30.943306] Stack : 900000010a064000 90000000044f6fc0 9000000104da7b48 0000000000000000 [ 30.943495] 0000000000000000 9000000104da7c14 9000000104da7c10 900000010504c200 [ 30.943626] 0000000000000001 ffff80001b88c000 9000000104da7b70 90000000030e6668 [ 30.943785] 0000000000000000 9000000104da7b58 ffff80001b88c048 9000000003d05000 [ 30.943936] 900000000303ac88 0000000000000000 0000000000000000 9000000104da7b70 [ 30.944091] 0000000000000000 0000000000000001 0000000731eeab00 0000000000000000 [ 30.944245] ffff80001b88c000 0000000000000000 0000000000000000 54b99959429f83b8 [ 30.944402] ffff80001b88c000 90000000044f6fc0 9000000101d70000 ffff80001b88c000 [ 30.944538] 000000000000005a 900000010504c200 900000010a064000 900000010a067000 [ 30.944697] 9000000104da7d88 0000000000000000 9000000003d05000 90000000030e794c [ 30.944852] ... [ 30.944924] Call Trace: [ 30.945120] [<ffff800002009fb8>] bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590 [ 30.945650] [<90000000030e6668>] bpf_test_run+0x1ec/0x2f8 [ 30.945958] [<90000000030e794c>] bpf_prog_test_run_skb+0x31c/0x684 [ 30.946065] [<90000000026d4f68>] __sys_bpf+0x678/0x2724 [ 30.946159] [<90000000026d7288>] sys_bpf+0x20/0x2c [ 30.946253] [<90000000032dd224>] do_syscall+0x7c/0x94 [ 30.946343] [<9000000002541c5c>] handle_syscall+0xbc/0x158 [ 30.946492] [ 30.946549] Code: 0015030e 5c0009c0 5001d000 <28c00304> 02c00484 29c00304 00150009 2a42d2e4 0280200d [ 30.946793] [ 30.946971] ---[ end trace 0000000000000000 ]--- [ 32.093225] Kernel panic - not syncing: Fatal exception in interrupt [ 32.093526] Kernel relocated by 0x2320000 [ 32.093630] .text @ 0x9000000002520000 [ 32.093725] .data @ 0x9000000003400000 [ 32.093792] .bss @ 0x9000000004413200 [ 34.971998] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- This is because we signed-extend function return values. When subprog mode is enabled, we have: cls_redirect() -> get_global_metrics() returns pcpu ptr 0xfffffefffc00b480 The pointer returned is later signed-extended to 0xfffffffffc00b480 at `BPF_JMP \| BPF_EXIT`. During BPF prog run, this triggers unhandled page fault and a kernel panic. Drop the unnecessary signed-extension on return values like other architectures do. With this change, we have: # ./test_progs -t cls_redirect Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. #51/1 cls_redirect/cls_redirect_inlined:OK #51/2 cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK #51/3 cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK #51/4 cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK #51/5 cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK #51/6 cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK #51/7 cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK #51/8 cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK #51/9 cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK #51/10 cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK #51/11 cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK #51/12 cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK #51/13 cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK #51/14 cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK #51/15 cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK #51/16 cls_redirect/cls_redirect_subprogs:OK #51/17 cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK #51/18 cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK #51/19 cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK #51/20 cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK #51/21 cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK #51/22 cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK #51/23 cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK #51/24 cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK #51/25 cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK #51/26 cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK #51/27 cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK #51/28 cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK #51/29 cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK #51/30 cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK #51/31 cls_redirect/cls_redirect_dynptr:OK #51/32 cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK #51/33 cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK #51/34 cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK #51/35 cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK #51/36 cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK #51/37 cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK #51/38 cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK #51/39 cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK #51/40 cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK #51/41 cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK #51/42 cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK #51/43 cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK #51/44 cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK #51/45 cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK #51 cls_redirect:OK Summary: 1/45 PASSED, 0 SKIPPED, 0 FAILED Fixes: `5dc615520c` ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:29 +01:00
Hengqi Chen	3275410b13	LoongArch: BPF: Don't sign extend memory load operand [ Upstream commit fe5757553bf9ebe45ae8ecab5922f6937c8d8dfc ] The `cgrp_local_storage` test triggers a kernel panic like: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. [ 550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00 [ 550.931781] Oops[#1]: [ 550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6 [ 550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0 [ 550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558 [ 550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118 [ 550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0 [ 550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020 [ 550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700 [ 550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8 [ 550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050 [ 550.933520] ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200 [ 550.933911] ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.934105] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 550.934596] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 550.934712] EUEN: 00000003 (+FPE +SXE -ASXE -BTE) [ 550.934836] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) [ 550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 550.935097] BADV: 0000000000000080 [ 550.935181] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 550.935291] Modules linked in: [ 550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55) [ 550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001 [ 550.935844] 9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034 [ 550.935990] 0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09 [ 550.936175] 0000000000000001 0000000000000000 9000000108353ec0 0000000000000118 [ 550.936314] 9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000 [ 550.936479] 0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288 [ 550.936635] 00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003 [ 550.936779] 9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c [ 550.936939] 0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0 [ 550.937083] ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558 [ 550.937224] ... [ 550.937299] Call Trace: [ 550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154 [ 550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200 [ 550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94 [ 550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158 [ 550.938477] [ 550.938607] Code: 580009ae 50016000 262402e4 <28c20085> 14092084 03a00084 16000024 03240084 00150006 [ 550.938851] [ 550.939021] ---[ end trace 0000000000000000 ]--- Further investigation shows that this panic is triggered by memory load operations: ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0, BPF_LOCAL_STORAGE_GET_F_CREATE); The expression `task->cgroups->dfl_cgrp` involves two memory load. Since the field offset fits in imm12 or imm14, we use ldd or ldptrd instructions. But both instructions have the side effect that it will signed-extended the imm operand. Finally, we got the wrong addresses and panics is inevitable. Use a generic ldxd instruction to avoid this kind of issues. With this change, we have: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec #48/1 cgrp_local_storage/tp_btf:OK test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 #48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) #48/3 cgrp_local_storage/recursion:FAIL #48/4 cgrp_local_storage/negative:OK #48/5 cgrp_local_storage/cgroup_iter_sleepable:OK test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) #48/6 cgrp_local_storage/yes_rcu_lock:FAIL #48/7 cgrp_local_storage/no_rcu_lock:OK #48 cgrp_local_storage:FAIL All error logs: test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 #48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) #48/3 cgrp_local_storage/recursion:FAIL test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) #48/6 cgrp_local_storage/yes_rcu_lock:FAIL #48 cgrp_local_storage:FAIL Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED No panics any more (The test still failed because lack of BPF trampoline which I am actively working on). Fixes: `5dc615520c` ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:29 +01:00
Ilkka Koskinen	0fdd1b8848	perf vendor events arm64: AmpereOne: Add missing DefaultMetricgroupName fields [ Upstream commit 90fe70d4e23cb57253d2668a171d5695c332deb7 ] AmpereOne metrics were missing DefaultMetricgroupName from metrics with "Default" in group name resulting perf to segfault. Add the missing field to address the issue. Fixes: 59faeaf80d02 ("perf vendor events arm64: Fix for AmpereOne metrics") Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231201021550.1109196-2-ilkka@os.amperecomputing.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:29 +01:00
Su Hui	f78fff4648	misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write [ Upstream commit ee6236027218f8531916f1c5caa5dc330379f287 ] Clang static analyzer complains that value stored to 'rets' is never read.Let 'buf_len = -EOVERFLOW' to make sure we can return '-EOVERFLOW'. Fixes: `8c8d964ce9` ("mei: move hbuf_depth from the mei device to the hw modules") Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20231120095523.178385-2-suhui@nfschina.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:29 +01:00
Su Hui	e2365ead01	misc: mei: client.c: return negative error code in mei_cl_write [ Upstream commit 8f06aee8089cf42fd99a20184501bd1347ce61b9 ] mei_msg_hdr_init() return negative error code, rets should be 'PTR_ERR(mei_hdr)' rather than '-PTR_ERR(mei_hdr)'. Fixes: `0cd7c01a60` ("mei: add support for mei extended header.") Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20231120095523.178385-1-suhui@nfschina.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:29 +01:00
Junhao He	c541d0edd8	coresight: ultrasoc-smb: Fix uninitialized before use buf_hw_base [ Upstream commit 862c135bde8bc185e8aae2110374175e6a1b6ed5 ] In smb_reset_buffer, the sdb->buf_hw_base variable is uninitialized before use, which initializes it in smb_init_data_buffer. And the SMB regiester are set in smb_config_inport. So move the call after smb_config_inport. Fixes: `06f5c2926a` ("drivers/coresight: Add UltraSoc System Memory Buffer driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20231114133346.30489-4-hejunhao3@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:29 +01:00
Junhao He	ab5091e1cc	coresight: ultrasoc-smb: Config SMB buffer before register sink [ Upstream commit 830a7f54db102c889a3fe1c0a225f369ac05f07f ] The SMB dirver register the enable/disable sysfs interface in function smb_register_sink(), however the buffer depends on the following configuration to work well. So it'll be possible for user to access an unreset one. Move the config buffer operation to before register_sink(). Ignore the return value, if smb_config_inport() fails. That will cause the hardwares disable trace path to fail, should not affect SMB driver remove. So we make smb_remove() return success, Fixes: `06f5c2926a` ("drivers/coresight: Add UltraSoc System Memory Buffer driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20231114133346.30489-3-hejunhao3@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:28 +01:00
Junhao He	ace850bd86	coresight: ultrasoc-smb: Fix sleep while close preempt in enable_smb [ Upstream commit b8411287aef4a994eff0c68f5597910c4194dfe3 ] When we to enable the SMB by perf, the perf sched will call perf_ctx_lock() to close system preempt in event_function_call(). But SMB::enable_smb() use mutex to lock the critical section, which may sleep. BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 153023, name: perf preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 INFO: lockdep is turned off. irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffffa2983f5c5f40>] copy_process+0xae8/0x2b48 softirqs last enabled at (0): [<ffffa2983f5c5f40>] copy_process+0xae8/0x2b48 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 2 PID: 153023 Comm: perf Kdump: loaded Tainted: G W O 6.5.0-rc4+ #1 Call trace: ... __mutex_lock+0xbc/0xa70 mutex_lock_nested+0x34/0x48 smb_update_buffer+0x58/0x360 [ultrasoc_smb] etm_event_stop+0x204/0x2d8 [coresight] etm_event_del+0x1c/0x30 [coresight] event_sched_out+0x17c/0x3b8 group_sched_out.part.0+0x5c/0x208 __perf_event_disable+0x15c/0x210 event_function+0xe0/0x230 remote_function+0xb4/0xe8 generic_exec_single+0x160/0x268 smp_call_function_single+0x20c/0x2a0 event_function_call+0x20c/0x220 _perf_event_disable+0x5c/0x90 perf_event_for_each_child+0x58/0xc0 _perf_ioctl+0x34c/0x1250 perf_ioctl+0x64/0x98 ... Use spinlock to replace mutex to control driver data access to one at a time. The function copy_to_user() may sleep, it cannot be in a spinlock context, so we can't simply replace it in smb_read(). But we can ensure that only one user gets the SMB device fd by smb_open(), so remove the locks from smb_read() and buffer synchronization is guaranteed by the user. Fixes: `06f5c2926a` ("drivers/coresight: Add UltraSoc System Memory Buffer driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20231114133346.30489-2-hejunhao3@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:28 +01:00
Junhao He	359d3fbcbc	hwtracing: hisi_ptt: Add dummy callback pmu::read() [ Upstream commit 55e0a2fb0cb5ab7c9c99c1ad4d3e6954de8b73a0 ] When start trace with perf option "-C $cpu" and immediately stop it with SIGTERM or others, the perf core will invoke pmu::read() while the driver doesn't implement it. Add a dummy pmu::read() to avoid any issues. Fixes: `ff0de066b4` ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device") Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20231010084731.30450-6-yangyicong@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:28 +01:00
James Clark	2f6b1527db	coresight: Fix crash when Perf and sysfs modes are used concurrently [ Upstream commit 287e82cf69aa264a52bc37591bd0eb407e20f85c ] Partially revert the change in commit `6148652807` ("coresight: Enable and disable helper devices adjacent to the path") which changed the bare call from source_ops(csdev)->enable() to coresight_enable_source() for Perf sessions. It was missed that coresight_enable_source() is specifically for the sysfs interface, rather than being a generic call. This interferes with the sysfs reference counting to cause the following crash: $ perf record -e cs_etm/@tmc_etr0/ -C 0 & $ echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink $ echo 1 > /sys/bus/coresight/devices/etm0/enable_source $ echo 0 > /sys/bus/coresight/devices/etm0/enable_source Unable to handle kernel NULL pointer dereference at virtual address 00000000000001d0 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP ... Call trace: etm4_disable+0x54/0x150 [coresight_etm4x] coresight_disable_source+0x6c/0x98 [coresight] coresight_disable+0x74/0x1c0 [coresight] enable_source_store+0x88/0xa0 [coresight] dev_attr_store+0x20/0x40 sysfs_kf_write+0x4c/0x68 kernfs_fop_write_iter+0x120/0x1b8 vfs_write+0x2dc/0x3b0 ksys_write+0x70/0x108 __arm64_sys_write+0x24/0x38 invoke_syscall+0x50/0x128 el0_svc_common.constprop.0+0x104/0x130 do_el0_svc+0x40/0xb8 el0_svc+0x2c/0xb8 el0t_64_sync_handler+0xc0/0xc8 el0t_64_sync+0x1a4/0x1a8 Code: d53cd042 91002000 b9402a81 b8626800 (f940ead5) ---[ end trace 0000000000000000 ]--- This commit linked below also fixes the issue, but has unlocked updates to the mode which could potentially race. So until we come up with a more complete solution that takes all locking and interaction between both modes into account, just revert back to the old behavior for Perf. Reported-by: Junhao He <hejunhao3@huawei.com> Closes: https://lore.kernel.org/linux-arm-kernel/20230921132904.60996-1-hejunhao3@huawei.com/ Fixes: `6148652807` ("coresight: Enable and disable helper devices adjacent to the path") Tested-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20231006131452.646721-1-james.clark@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:28 +01:00
Uwe Kleine-König	1b5d156c24	coresight: etm4x: Remove bogous __exit annotation for some functions [ Upstream commit 348ddab81f7b0983d9fb158df910254f08d3f887 ] etm4_platform_driver (which lives in ".data" contains a reference to etm4_remove_platform_dev(). So the latter must not be marked with __exit which results in the function being discarded for a build with CONFIG_CORESIGHT_SOURCE_ETM4X=y which in turn makes the remove pointer contain invalid data. etm4x_amba_driver referencing etm4_remove_amba() has the same issue. Drop the __exit annotations for the two affected functions and a third one that is called by the other two. For reasons I don't understand this isn't catched by building with CONFIG_DEBUG_SECTION_MISMATCH=y. Fixes: `c23bc382ef` ("coresight: etm4x: Refactor probing routine") Fixes: `5214b56358` ("coresight: etm4x: Add support for sysreg only devices") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/all/20230929081540.yija47lsj35xtj4v@pengutronix.de/ Link: https://lore.kernel.org/r/20230929081637.2377335-1-u.kleine-koenig@pengutronix.de Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-13 18:45:28 +01:00
AngeloGioacchino Del Regno	b9cc170842	arm64: dts: mediatek: mt8186: Change gpu speedbin nvmem cell name commit 59fa1e51ba54e1f513985a8177969b62973f7fd5 upstream. MT8186's GPU speedbin value must be interpreted, or the value will not be meaningful. Use the correct "gpu-speedbin" nvmem cell name for the GPU speedbin to allow triggering the cell info fixup handler, hence feeding the right speedbin number to the users. Cc: stable@vger.kernel.org Fixes: `263d2fd02a` ("arm64: dts: mediatek: mt8186: Add GPU speed bin NVMEM cells") Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20231005151150.355536-1-angelogioacchino.delregno@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:28 +01:00
Eugen Hristev	b6eccbcb1b	arm64: dts: mediatek: mt8186: fix clock names for power domains commit 9adf7580f6d498a5839e02fa1d1535e934364602 upstream. Clocks for each power domain are split into big categories: pd clocks and subsys clocks. According to the binding, all clocks which have a dash '-' in their name are treated as subsys clocks, and must be placed at the end of the list. The other clocks which are pd clocks must come first. Fixed the naming and the placing of all clocks in the power domains. For the avoidance of doubt, prefixed all subsys clocks with the 'subsys' prefix. The binding does not enforce strict clock names, the driver uses them in bulk, only making a difference for pd clocks vs subsys clocks. The above problem appears to be trivial, however, it leads to incorrect power up and power down sequence of the power domains, because some clocks will be mistakenly taken for subsys clocks and viceversa. One consequence is the fact that if the DIS power domain goes power down and power back up during the boot process, when it comes back up, there are still transactions left on the bus which makes the display inoperable. Some of the clocks for the DIS power domain were wrongly using '_' instead of '-', which again made these clocks being treated as pd clocks instead of subsys clocks. Cc: stable@vger.kernel.org Fixes: `d9e43c1e7a` ("arm64: dts: mt8186: Add power domains controller") Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20231005103041.352478-1-eugen.hristev@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:28 +01:00
AngeloGioacchino Del Regno	2e465268df	arm64: dts: mediatek: mt8183-evb: Fix unit_address_vs_reg warning on ntc commit 9dea1c724fc36643e83216c1f5a26613412150db upstream. The NTC is defined as ntc@0 but it doesn't need any address at all. Fix the unit_address_vs_reg warning by dropping the unit address: since the node name has to be generic also fully rename it from ntc@0 to thermal-sensor. Cc: stable@vger.kernel.org Fixes: `ff9ea5c622` ("arm64: dts: mediatek: mt8183-evb: Add node for thermistor") Link: https://lore.kernel.org/r/20231025093816.44327-7-angelogioacchino.delregno@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:27 +01:00
AngeloGioacchino Del Regno	bfff27fb5d	arm64: dts: mediatek: mt8183: Move thermal-zones to the root node commit 5a60d63439694590cd5ab1f998fc917ff7ba1c1d upstream. The thermal zones are not a soc bus device: move it to the root node to solve simple_bus_reg warnings. Cc: stable@vger.kernel.org Fixes: `b325ce3978` ("arm64: dts: mt8183: add thermal zone node") Link: https://lore.kernel.org/r/20231025093816.44327-9-angelogioacchino.delregno@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:27 +01:00
AngeloGioacchino Del Regno	d97373c3b1	arm64: dts: mediatek: mt8183: Fix unit address for scp reserved memory commit 19cba9a6c071db57888dc6b2ec1d9bf8996ea681 upstream. The reserved memory for scp had node name "scp_mem_region" and also without unit-address: change the name to "memory@(address)". This fixes a unit_address_vs_reg warning. Cc: stable@vger.kernel.org Fixes: `1652dbf736` ("arm64: dts: mt8183: add scp node") Link: https://lore.kernel.org/r/20231025093816.44327-6-angelogioacchino.delregno@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:27 +01:00
AngeloGioacchino Del Regno	9c4ae4801f	arm64: dts: mediatek: mt8195: Fix PM suspend/resume with venc clocks commit 61b94d54421a1f3670ddd5396ec70afe833e9405 upstream. Before suspending the LARBs we're making sure that any operation is done: this never happens because we are unexpectedly unclocking the LARB20 before executing the suspend handler for the MediaTek Smart Multimedia Interface (SMI) and the cause of this is incorrect clocks on this LARB. Fix this issue by changing the Local Arbiter 20 (used by the video encoder secondary core) apb clock to CLK_VENC_CORE1_VENC; furthermore, in order to make sure that both the PM resume and video encoder operation is stable, add the CLK_VENC(_CORE1)_LARB clock to the VENC (main core) and VENC_CORE1 power domains, as this IP cannot communicate with the rest of the system (the AP) without local arbiter clocks being operational. Cc: stable@vger.kernel.org Fixes: `3b5838d1d8` ("arm64: dts: mt8195: Add iommu and smi nodes") Fixes: `2b515194bf` ("arm64: dts: mt8195: Add power domains controller") Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20230706095841.109315-1-angelogioacchino.delregno@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:27 +01:00
AngeloGioacchino Del Regno	1253026694	arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names commit 24165c5dad7ba7c7624d05575a5e0cc851396c71 upstream. Fix a unit_address_vs_reg warning for the USB VBUS fixed regulators by renaming the regulator nodes from regulator@{0,1} to regulator-usb-p0 and regulator-usb-p1. Cc: stable@vger.kernel.org Fixes: `c0891284a7` ("arm64: dts: mediatek: add USB3 DRD driver") Link: https://lore.kernel.org/r/20231025093816.44327-8-angelogioacchino.delregno@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:27 +01:00
AngeloGioacchino Del Regno	ac9a2f55bf	arm64: dts: mediatek: cherry: Fix interrupt cells for MT6360 on I2C7 commit 5943b8f7449df9881b273db07bdde1e7120dccf0 upstream. Change interrupt cells to 2 to suppress interrupts_property warning. Cc: stable@vger.kernel.org Fixes: `0de0fe950f` ("arm64: dts: mediatek: cherry: Enable MT6360 sub-pmic on I2C7") Link: https://lore.kernel.org/r/20231127132026.165027-1-angelogioacchino.delregno@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:27 +01:00
Eugen Hristev	d7646d79ea	arm64: dts: mediatek: mt8183-kukui-jacuzzi: fix dsi unnecessary cells properties commit 74543b303a9abfe4fa253d1fa215281baa05ff3a upstream. dtbs_check throws a warning at the dsi node: Warning (avoid_unnecessary_addr_size): /soc/dsi@14014000: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property Other DTS have a panel child node with a reg, so the parent dtsi must have the address-cells and size-cells, however this specific DT has the panel removed, but not the cells, hence the warning above. If panel is deleted then the cells must also be deleted since they are tied together, as the child node in this DT does not have a reg. Cc: stable@vger.kernel.org Fixes: `cabc71b08e` ("arm64: dts: mt8183: Add kukui-jacuzzi-damu board") Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20230814071053.5459-1-eugen.hristev@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:26 +01:00
Eugen Hristev	6a6df679ac	arm64: dts: mediatek: mt7622: fix memory node warning check commit 8e6ecbfd44b5542a7598c1c5fc9c6dcb5d367f2a upstream. dtbs_check throws a warning at the memory node: Warning (unit_address_vs_reg): /memory: node has a reg or ranges property, but no unit name fix by adding the address into the node name. Cc: stable@vger.kernel.org Fixes: `0b6286dd96` ("arm64: dts: mt7622: add bananapi BPI-R64 board") Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20230814065042.4973-1-eugen.hristev@collabora.com Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:26 +01:00
Eric Woudstra	90dc20c8c5	arm64: dts: mt7986: fix emmc hs400 mode without uboot initialization commit 8dfe51c3f6ef31502fca3e2da8cd250ebbe4b8f2 upstream. Eric reports errors on emmc with hs400 mode when booting linux on bpi-r3 without uboot [1]. Booting with uboot does not show this because clocks seem to be initialized by uboot. Fix this by adding assigned-clocks and assigned-clock-parents like it's done in uboot [2]. [1] https://forum.banana-pi.org/t/bpi-r3-kernel-fails-setting-emmc-clock-to-416m-depends-on-u-boot/15170 [2] https://github.com/u-boot/u-boot/blob/master/arch/arm/dts/mt7986.dtsi#L287 Cc: stable@vger.kernel.org Fixes: `513b49d19b` ("arm64: dts: mt7986: add mmc related device nodes") Signed-off-by: Eric Woudstra <ericwouds@gmail.com> Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20231025170832.78727-2-linux@fw-web.de Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:26 +01:00
Frank Wunderlich	287b1c41de	arm64: dts: mt7986: define 3W max power to both SFP on BPI-R3 commit 6413cbc17f89b3a160f3a6f3fad1232b1678fe40 upstream. All SFP power supplies are connected to the system VDD33 which is 3v3/8A. Set 3A per SFP slot to allow SFPs work which need more power than the default 1W. Cc: stable@vger.kernel.org Fixes: `8e01fb15b8` ("arm64: dts: mt7986: add Bananapi R3") Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20231025170832.78727-3-linux@fw-web.de Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:26 +01:00
Frank Wunderlich	5012eb0280	arm64: dts: mt7986: change cooling trips commit 1fcda8ceb014aafd56f10b33e0077c93b5dd45d1 upstream. Add Critical and hot trips for emergency system shutdown and limiting system load. Change passive trip to active to make sure fan is activated on the lowest trip. Cc: stable@vger.kernel.org Fixes: `1f5be05132` ("arm64: dts: mt7986: add thermal-zones") Fixes: `c26f779a22` ("arm64: dts: mt7986: add pwm-fan and cooling-maps to BPI-R3 dts") Suggested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20231025170832.78727-4-linux@fw-web.de Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:26 +01:00
Ville Syrjälä	8e1e489cdb	drm/i915: Skip some timing checks on BXT/GLK DSI transcoders commit 20c2dbff342aec13bf93c2f6c951da198916a455 upstream. Apparently some BXT/GLK systems have DSI panels whose timings don't agree with the normal cpu transcoder hblank>=32 limitation. This is perhaps fine as there are no specific hblank/etc. limits listed for the BXT/GLK DSI transcoders. Move those checks out from the global intel_mode_valid() into into connector specific .mode_valid() hooks, skipping BXT/GLK DSI connectors. We'll leave the basic [hv]display/[hv]total checks in intel_mode_valid() as those seem like sensible upper limits regardless of the transcoder used. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9720 Fixes: `8f4b1068e7` ("drm/i915: Check some transcoder timing minimum limits") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231127145028.4899-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit e0ef2daa8ca8ce4dbc2fd0959e383b753a87fd7d) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:26 +01:00
Ville Syrjälä	a0396af35c	drm/i915/mst: Reject modes that require the bigjoiner commit dd7eb65c493615fda7d459501c3d4a46e00ea5ba upstream. We have no bigjoiner support in the MST code, so .mode_valid() pretending otherwise is just going to result black screens for users. Reject any mode that needs the joiner. Cc: stable@vger.kernel.org Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Fixes: `d51f25eb47` ("drm/i915: Add DSC support to MST path") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231127145028.4899-3-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 9c058492b16f90bb772cb0dad567e8acc68e155d) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:26 +01:00
Ville Syrjälä	654748c6fc	drm/i915/mst: Fix .mode_valid_ctx() return values commit 7cf82b25dd91d7f330d9df2de868caca14289ba1 upstream. .mode_valid_ctx() returns an errno, not the mode status. Fix the code to do the right thing. Cc: stable@vger.kernel.org Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Fixes: `d51f25eb47` ("drm/i915: Add DSC support to MST path") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231127145028.4899-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit c1799032d2ef6616113b733428dfaa2199a5604b) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:25 +01:00
Thomas Zimmermann	02650b3b98	drm/atomic-helpers: Invoke end_fb_access while owning plane state commit e0f04e41e8eedd4e5a1275f2318df7e1841855f2 upstream. Invoke drm_plane_helper_funcs.end_fb_access before drm_atomic_helper_commit_hw_done(). The latter function hands over ownership of the plane state to the following commit, which might free it. Releasing resources in end_fb_access then operates on undefined state. This bug has been observed with non-blocking commits when they are being queued up quickly. Here is an example stack trace from the bug report. The plane state has been free'd already, so the pages for drm_gem_fb_vunmap() are gone. Unable to handle kernel paging request at virtual address 0000000100000049 [...] drm_gem_fb_vunmap+0x18/0x74 drm_gem_end_shadow_fb_access+0x1c/0x2c drm_atomic_helper_cleanup_planes+0x58/0xd8 drm_atomic_helper_commit_tail+0x90/0xa0 commit_tail+0x15c/0x188 commit_work+0x14/0x20 Fix this by running end_fb_access immediately after updating all planes in drm_atomic_helper_commit_planes(). The existing clean-up helper drm_atomic_helper_cleanup_planes() now only handles cleanup_fb. For aborted commits, roll back from drm_atomic_helper_prepare_planes() in the new helper drm_atomic_helper_unprepare_planes(). This case is different from regular cleanup, as we have to release the new state; regular cleanup releases the old state. The new helper also invokes cleanup_fb for all planes. The changes mostly involve DRM's atomic helpers. Only two drivers, i915 and nouveau, implement their own commit function. Update them to invoke drm_atomic_helper_unprepare_planes(). Drivers with custom commit_tail function do not require changes. v4: * fix documentation (kernel test robot) v3: * add drm_atomic_helper_unprepare_planes() for rolling back * use correct state for end_fb_access v2: * fix test in drm_atomic_helper_cleanup_planes() Reported-by: Alyssa Ross <hi@alyssa.is> Closes: https://lore.kernel.org/dri-devel/87leazm0ya.fsf@alyssa.is/ Suggested-by: Daniel Vetter <daniel@ffwll.ch> Fixes: `94d879eaf7` ("drm/atomic-helper: Add {begin,end}_fb_access to plane helpers") Tested-by: Alyssa Ross <hi@alyssa.is> Reviewed-by: Alyssa Ross <hi@alyssa.is> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: <stable@vger.kernel.org> # v6.2+ Link: https://patchwork.freedesktop.org/patch/msgid/20231204083247.22006-1-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:25 +01:00
David Jeffery	4ce431c297	md/raid6: use valid sector values to determine if an I/O should wait on the reshape commit c467e97f079f0019870c314996fae952cc768e82 upstream. During a reshape or a RAID6 array such as expanding by adding an additional disk, I/Os to the region of the array which have not yet been reshaped can stall indefinitely. This is from errors in the stripe_ahead_of_reshape function causing md to think the I/O is to a region in the actively undergoing the reshape. stripe_ahead_of_reshape fails to account for the q disk having a sector value of 0. By not excluding the q disk from the for loop, raid6 will always generate a min_sector value of 0, causing a return value which stalls. The function's max_sector calculation also uses min() when it should use max(), causing the max_sector value to always be 0. During a backwards rebuild this can cause the opposite problem where it allows I/O to advance when it should wait. Fixing these errors will allow safe I/O to advance in a timely manner and delay only I/O which is unsafe due to stripes in the middle of undergoing the reshape. Fixes: `486f605586` ("md/raid5: Check all disks in a stripe_head for reshape progress") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231128181233.6187-1-djeffery@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:25 +01:00
Lukasz Luba	aa581b37da	powercap: DTPM: Fix missing cpufreq_cpu_put() calls commit bdefd9913bdd453991ef756b6f7176e8ad80d786 upstream. The policy returned by cpufreq_cpu_get() has to be released with the help of cpufreq_cpu_put() to balance its kobject reference counter properly. Add the missing calls to cpufreq_cpu_put() in the code. Fixes: `0aea2e4ec2` ("powercap/dtpm_cpu: Reset per_cpu variable in the release function") Fixes: `0e8f68d7f0` ("powercap/drivers/dtpm: Add CPU energy model based support") Cc: v5.16+ <stable@vger.kernel.org> # v5.16+ Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:25 +01:00
Sumanth Korikkar	9e5d309674	mm/memory_hotplug: fix error handling in add_memory_resource() commit f42ce5f087eb69e47294ababd2e7e6f88a82d308 upstream. In add_memory_resource(), creation of memory block devices occurs after successful call to arch_add_memory(). However, creation of memory block devices could fail. In that case, arch_remove_memory() is called to perform necessary cleanup. Currently with or without altmap support, arch_remove_memory() is always passed with altmap set to NULL during error handling. This leads to freeing of struct pages using free_pages(), eventhough the allocation might have been performed with altmap support via altmap_alloc_block_buf(). Fix the error handling by passing altmap in arch_remove_memory(). This ensures the following: * When altmap is disabled, deallocation of the struct pages array occurs via free_pages(). * When altmap is enabled, deallocation occurs via vmem_altmap_free(). Link: https://lkml.kernel.org/r/20231120145354.308999-3-sumanthk@linux.ibm.com Fixes: `a08a2ae346` ("mm,memory_hotplug: allocate memmap from the added memory range") Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: kernel test robot <lkp@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:25 +01:00
Hugh Dickins	799f90c385	mm: fix oops when filemap_map_pmd() without prealloc_pte commit 9aa1345d66b8132745ffb99b348b1492088da9e2 upstream. syzbot reports oops in lockdep's __lock_acquire(), called from __pte_offset_map_lock() called from filemap_map_pages(); or when I run the repro, the oops comes in pmd_install(), called from filemap_map_pmd() called from filemap_map_pages(), just before the __pte_offset_map_lock(). The problem is that filemap_map_pmd() has been assuming that when it finds pmd_none(), a page table has already been prepared in prealloc_pte; and indeed do_fault_around() has been careful to preallocate one there, when it finds pmd_none(): but what if pmd became none in between? My 6.6 mods in mm/khugepaged.c, avoiding mmap_lock for write, have made it easy for pmd to be cleared while servicing a page fault; but even before those, a huge pmd might be zapped while a fault is serviced. The difference in symptomatic stack traces comes from the "memory model" in use: pmd_install() uses pmd_populate() uses page_to_pfn(): in some models that is strict, and will oops on the NULL prealloc_pte; in other models, it will construct a bogus value to be populated into pmd, then __pte_offset_map_lock() oops when trying to access split ptlock pointer (or some other symptom in normal case of ptlock embedded not pointer). Link: https://lore.kernel.org/linux-mm/20231115065506.19780-1-jose.pekkarinen@foxhound.fi/ Link: https://lkml.kernel.org/r/6ed0c50c-78ef-0719-b3c5-60c0c010431c@google.com Fixes: `f9ce0be71d` ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Hugh Dickins <hughd@google.com> Reported-and-tested-by: syzbot+89edd67979b52675ddec@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/0000000000005e44550608a0806c@google.com/ Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com>, Cc: José Pekkarinen <jose.pekkarinen@foxhound.fi> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> [5.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:25 +01:00
Sumanth Korikkar	e0270ffad4	mm/memory_hotplug: add missing mem_hotplug_lock commit 001002e73712cdf6b8d9a103648cda3040ad7647 upstream. From Documentation/core-api/memory-hotplug.rst: When adding/removing/onlining/offlining memory or adding/removing heterogeneous/device memory, we should always hold the mem_hotplug_lock in write mode to serialise memory hotplug (e.g. access to global/zone variables). mhp_(de)init_memmap_on_memory() functions can change zone stats and struct page content, but they are currently called w/o the mem_hotplug_lock. When memory block is being offlined and when kmemleak goes through each populated zone, the following theoretical race conditions could occur: CPU 0: \| CPU 1: memory_offline() \| -> offline_pages() \| -> mem_hotplug_begin() \| ... \| -> mem_hotplug_done() \| \| kmemleak_scan() \| -> get_online_mems() \| ... -> mhp_deinit_memmap_on_memory() \| [not protected by mem_hotplug_begin/done()]\| Marks memory section as offline, \| Retrieves zone_start_pfn poisons vmemmap struct pages and updates \| and struct page members. the zone related data \| \| ... \| -> put_online_mems() Fix this by ensuring mem_hotplug_lock is taken before performing mhp_init_memmap_on_memory(). Also ensure that mhp_deinit_memmap_on_memory() holds the lock. online/offline_pages() are currently only called from memory_block_online/offline(), so it is safe to move the locking there. Link: https://lkml.kernel.org/r/20231120145354.308999-2-sumanthk@linux.ibm.com Fixes: `a08a2ae346` ("mm,memory_hotplug: allocate memmap from the added memory range") Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: kernel test robot <lkp@intel.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:24 +01:00
Baoquan He	83dd18e0b7	drivers/base/cpu: crash data showing should depends on KEXEC_CORE commit 4e9e2e4c65136dfd32dd0afe555961433d1cf906 upstream. After commit `88a6f89944` ("crash: memory and CPU hotplug sysfs attributes"), on x86_64, if only below kernel configs related to kdump are set, compiling error are triggered. ---- CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y CONFIG_CRASH_HOTPLUG=y ------ ------------------------------------------------------ drivers/base/cpu.c: In function `crash_hotplug_show': drivers/base/cpu.c:309:40: error: implicit declaration of function `crash_hotplug_cpu_support'; did you mean `crash_hotplug_show'? [-Werror=implicit-function-declaration] 309 \| return sysfs_emit(buf, "%d\n", crash_hotplug_cpu_support()); \| ^~~~~~~~~~~~~~~~~~~~~~~~~ \| crash_hotplug_show cc1: some warnings being treated as errors ------------------------------------------------------ CONFIG_KEXEC is used to enable kexec_load interface, the crash_notes/crash_notes_size/crash_hotplug showing depends on CONFIG_KEXEC is incorrect. It should depend on KEXEC_CORE instead. Fix it now. Link: https://lkml.kernel.org/r/20231128055248.659808-1-bhe@redhat.com Fixes: `88a6f89944` ("crash: memory and CPU hotplug sysfs attributes") Signed-off-by: Baoquan He <bhe@redhat.com> Tested-by: Ignat Korchagin <ignat@cloudflare.com> [compile-time only] Tested-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Eric DeVolder <eric_devolder@yahoo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:24 +01:00
Mike Kravetz	512b420aaf	hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write commit 187da0f8250aa94bd96266096aef6f694e0b4cd2 upstream. The routine __vma_private_lock tests for the existence of a reserve map associated with a private hugetlb mapping. A pointer to the reserve map is in vma->vm_private_data. __vma_private_lock was checking the pointer for NULL. However, it is possible that the low bits of the pointer could be used as flags. In such instances, vm_private_data is not NULL and not a valid pointer. This results in the null-ptr-deref reported by syzbot: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef] CPU: 0 PID: 5048 Comm: syz-executor139 Not tainted 6.6.0-rc7-syzkaller-00142-g88 8cf78c29e2 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 1 0/09/2023 RIP: 0010:__lock_acquire+0x109/0x5de0 kernel/locking/lockdep.c:5004 ... Call Trace: <TASK> lock_acquire kernel/locking/lockdep.c:5753 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718 down_write+0x93/0x200 kernel/locking/rwsem.c:1573 hugetlb_vma_lock_write mm/hugetlb.c:300 [inline] hugetlb_vma_lock_write+0xae/0x100 mm/hugetlb.c:291 __hugetlb_zap_begin+0x1e9/0x2b0 mm/hugetlb.c:5447 hugetlb_zap_begin include/linux/hugetlb.h:258 [inline] unmap_vmas+0x2f4/0x470 mm/memory.c:1733 exit_mmap+0x1ad/0xa60 mm/mmap.c:3230 __mmput+0x12a/0x4d0 kernel/fork.c:1349 mmput+0x62/0x70 kernel/fork.c:1371 exit_mm kernel/exit.c:567 [inline] do_exit+0x9ad/0x2a20 kernel/exit.c:861 __do_sys_exit kernel/exit.c:991 [inline] __se_sys_exit kernel/exit.c:989 [inline] __x64_sys_exit+0x42/0x50 kernel/exit.c:989 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Mask off low bit flags before checking for NULL pointer. In addition, the reserve map only 'belongs' to the OWNER (parent in parent/child relationships) so also check for the OWNER flag. Link: https://lkml.kernel.org/r/20231114012033.259600-1-mike.kravetz@oracle.com Reported-by: syzbot+6ada951e7c0f7bc8a71e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/00000000000078d1e00608d7878b@google.com/ Fixes: `bf4916922c` ("hugetlbfs: extend hugetlb_vma_lock to private VMAs") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: Edward Adam Davis <eadavis@qq.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tom Rix <trix@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:24 +01:00
Tejun Heo	b2c562a7a8	workqueue: Make sure that wq_unbound_cpumask is never empty commit 4a6c5607d4502ccd1b15b57d57f17d12b6f257a7 upstream. During boot, depending on how the housekeeping and workqueue.unbound_cpus masks are set, wq_unbound_cpumask can end up empty. Since `8639ecebc9` ("workqueue: Implement non-strict affinity scope for unbound workqueues"), this may end up feeding -1 as a CPU number into scheduler leading to oopses. BUG: unable to handle page fault for address: ffffffff8305e9c0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page ... Call Trace: <TASK> select_idle_sibling+0x79/0xaf0 select_task_rq_fair+0x1cb/0x7b0 try_to_wake_up+0x29c/0x5c0 wake_up_process+0x19/0x20 kick_pool+0x5e/0xb0 __queue_work+0x119/0x430 queue_work_on+0x29/0x30 ... An empty wq_unbound_cpumask is a clear misconfiguration and already disallowed once system is booted up. Let's warn on and ignore unbound_cpumask restrictions which lead to no unbound cpus. While at it, also remove now unncessary empty check on wq_unbound_cpumask in wq_select_unbound_cpu(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-Tested-by: Yong He <alexyonghe@tencent.com> Link: http://lkml.kernel.org/r/20231120121623.119780-1-alexyonghe@tencent.com Fixes: `8639ecebc9` ("workqueue: Implement non-strict affinity scope for unbound workqueues") Cc: stable@vger.kernel.org # v6.6+ Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:24 +01:00
Francesco Dolcini	7409c28cab	platform/surface: aggregator: fix recv_buf() return value commit c8820c92caf0770bec976b01fa9e82bb993c5865 upstream. Serdev recv_buf() callback is supposed to return the amount of bytes consumed, therefore an int in between 0 and count. Do not return negative number in case of issue, when ssam_controller_receive_buf() returns ESHUTDOWN just returns 0, e.g. no bytes consumed, this keep the exact same behavior as it was before. This fixes a potential WARN in serdev-ttyport.c:ttyport_receive_buf(). Fixes: `c167b9c7e3` ("platform/surface: Add Surface Aggregator subsystem") Cc: stable@vger.kernel.org Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20231128194935.11350-1-francesco@dolcini.it Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:24 +01:00
Matthias Reichl	78c8fc3332	regmap: fix bogus error on regcache_sync success commit fea88064445a59584460f7f67d102b6e5fc1ca1d upstream. Since commit 0ec7731655de ("regmap: Ensure range selector registers are updated after cache sync") opening pcm512x based soundcards fail with EINVAL and dmesg shows sync cache and pm_runtime_get errors: [ 228.794676] pcm512x 1-004c: Failed to sync cache: -22 [ 228.794740] pcm512x 1-004c: ASoC: error at snd_soc_pcm_component_pm_runtime_get on pcm512x.1-004c: -22 This is caused by the cache check result leaking out into the regcache_sync return value. Fix this by making the check local-only, as the comment above the regcache_read call states a non-zero return value means there's nothing to do so the return value should not be altered. Fixes: 0ec7731655de ("regmap: Ensure range selector registers are updated after cache sync") Cc: stable@vger.kernel.org Signed-off-by: Matthias Reichl <hias@horus.com> Link: https://lore.kernel.org/r/20231203222216.96547-1-hias@horus.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:24 +01:00
ChunHao Lin	2e04cfdd3e	r8169: fix rtl8125b PAUSE frames blasting when suspended commit 4b0768b6556af56ee9b7cf4e68452a2b6289ae45 upstream. When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state. When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network. This issue can be reproduced in Chromebox (not Chromebook) in developer mode running a test image (and v5.10 kernel): 1) ping -f $CHROMEBOX (from workstation on same local network) 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX 3) ping $ROUTER (wait until ping fails from workstation) Takes about ~20-30 seconds after step 2 for the local network to stop working. Fix this issue by enabling pause slot to only send pause frame once when FIFO reaches near full state. Fixes: `f1bce4ad2f` ("r8169: add support for RTL8125") Reported-by: Grant Grundler <grundler@chromium.org> Tested-by: Grant Grundler <grundler@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin <hau@realtek.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/20231129155350.5843-1-hau@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:24 +01:00
Daniel Borkmann	865b71579d	packet: Move reference count in packet_sock to atomic_long_t commit db3fadacaf0c817b222090290d06ca2a338422d0 upstream. In some potential instances the reference count on struct packet_sock could be saturated and cause overflows which gets the kernel a bit confused. To prevent this, move to a 64-bit atomic reference count on 64-bit architectures to prevent the possibility of this type to overflow. Because we can not handle saturation, using refcount_t is not possible in this place. Maybe someday in the future if it changes it could be used. Also, instead of using plain atomic64_t, use atomic_long_t instead. 32-bit machines tend to be memory-limited (i.e. anything that increases a reference uses so much memory that you can't actually get to 2**32 references). 32-bit architectures also tend to have serious problems with 64-bit atomics. Hence, atomic_long_t is the more natural solution. Reported-by: "The UK's National Cyber Security Centre (NCSC)" <security@ncsc.gov.uk> Co-developed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@kernel.org Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231201131021.19999-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:23 +01:00
Hui Zhou	9a89aad086	nfp: flower: fix for take a mutex lock in soft irq context and rcu lock commit 0ad722bd9ee3a9bdfca9613148645e4c9b7f26cf upstream. The neighbour event callback call the function nfp_tun_write_neigh, this function will take a mutex lock and it is in soft irq context, change the work queue to process the neighbour event. Move the nfp_tun_write_neigh function out of range rcu_read_lock/unlock() in function nfp_tunnel_request_route_v4 and nfp_tunnel_request_route_v6. Fixes: `abc210952a` ("nfp: flower: tunnel neigh support bond offload") CC: stable@vger.kernel.org # 6.2+ Signed-off-by: Hui Zhou <hui.zhou@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:23 +01:00
Heiner Kallweit	3c0adff939	leds: trigger: netdev: fix RTNL handling to prevent potential deadlock commit fe2b1226656afae56702d1d84c6900f6b67df297 upstream. When working on LED support for r8169 I got the following lockdep warning. Easiest way to prevent this scenario seems to be to take the RTNL lock before the trigger_data lock in set_device_name(). ====================================================== WARNING: possible circular locking dependency detected 6.7.0-rc2-next-20231124+ #2 Not tainted ------------------------------------------------------ bash/383 is trying to acquire lock: ffff888103aa1c68 (&trigger_data->lock){+.+.}-{3:3}, at: netdev_trig_notify+0xec/0x190 [ledtrig_netdev] but task is already holding lock: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (rtnl_mutex){+.+.}-{3:3}: __mutex_lock+0x9b/0xb50 mutex_lock_nested+0x16/0x20 rtnl_lock+0x12/0x20 set_device_name+0xa9/0x120 [ledtrig_netdev] netdev_trig_activate+0x1a1/0x230 [ledtrig_netdev] led_trigger_set+0x172/0x2c0 led_trigger_write+0xf1/0x140 sysfs_kf_bin_write+0x5d/0x80 kernfs_fop_write_iter+0x15d/0x210 vfs_write+0x1f0/0x510 ksys_write+0x6c/0xf0 __x64_sys_write+0x14/0x20 do_syscall_64+0x3f/0xf0 entry_SYSCALL_64_after_hwframe+0x6c/0x74 -> #0 (&trigger_data->lock){+.+.}-{3:3}: __lock_acquire+0x1459/0x25a0 lock_acquire+0xc8/0x2d0 __mutex_lock+0x9b/0xb50 mutex_lock_nested+0x16/0x20 netdev_trig_notify+0xec/0x190 [ledtrig_netdev] call_netdevice_register_net_notifiers+0x5a/0x100 register_netdevice_notifier+0x85/0x120 netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev] led_trigger_set+0x172/0x2c0 led_trigger_write+0xf1/0x140 sysfs_kf_bin_write+0x5d/0x80 kernfs_fop_write_iter+0x15d/0x210 vfs_write+0x1f0/0x510 ksys_write+0x6c/0xf0 __x64_sys_write+0x14/0x20 do_syscall_64+0x3f/0xf0 entry_SYSCALL_64_after_hwframe+0x6c/0x74 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(&trigger_data->lock); lock(rtnl_mutex); lock(&trigger_data->lock); * DEADLOCK * 8 locks held by bash/383: #0: ffff888103ff33f0 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0 #1: ffff888103aa1e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210 #2: ffff8881036f1890 (kn->active#82){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210 #3: ffff888108e2c358 (&led_cdev->led_access){+.+.}-{3:3}, at: led_trigger_write+0x30/0x140 #4: ffffffff8cdd9e10 (triggers_list_lock){++++}-{3:3}, at: led_trigger_write+0x75/0x140 #5: ffff888108e2c270 (&led_cdev->trigger_lock){++++}-{3:3}, at: led_trigger_write+0xe3/0x140 #6: ffffffff8cdde3d0 (pernet_ops_rwsem){++++}-{3:3}, at: register_netdevice_notifier+0x1c/0x120 #7: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20 stack backtrace: CPU: 0 PID: 383 Comm: bash Not tainted 6.7.0-rc2-next-20231124+ #2 Hardware name: Default string Default string/Default string, BIOS ADLN.M6.SODIMM.ZB.CY.015 08/08/2023 Call Trace: <TASK> dump_stack_lvl+0x5c/0xd0 dump_stack+0x10/0x20 print_circular_bug+0x2dd/0x410 check_noncircular+0x131/0x150 __lock_acquire+0x1459/0x25a0 lock_acquire+0xc8/0x2d0 ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] __mutex_lock+0x9b/0xb50 ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] ? __this_cpu_preempt_check+0x13/0x20 ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] ? __cancel_work_timer+0x11c/0x1b0 ? __mutex_lock+0x123/0xb50 mutex_lock_nested+0x16/0x20 ? mutex_lock_nested+0x16/0x20 netdev_trig_notify+0xec/0x190 [ledtrig_netdev] call_netdevice_register_net_notifiers+0x5a/0x100 register_netdevice_notifier+0x85/0x120 netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev] led_trigger_set+0x172/0x2c0 ? preempt_count_add+0x49/0xc0 led_trigger_write+0xf1/0x140 sysfs_kf_bin_write+0x5d/0x80 kernfs_fop_write_iter+0x15d/0x210 vfs_write+0x1f0/0x510 ksys_write+0x6c/0xf0 __x64_sys_write+0x14/0x20 do_syscall_64+0x3f/0xf0 entry_SYSCALL_64_after_hwframe+0x6c/0x74 RIP: 0033:0x7f269055d034 Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48 RSP: 002b:00007ffddb7ef748 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f269055d034 RDX: 0000000000000007 RSI: 000055bf5f4af3c0 RDI: 0000000000000001 RBP: 000055bf5f4af3c0 R08: 0000000000000073 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000007 R13: 00007f26906325c0 R14: 00007f269062ff20 R15: 0000000000000000 </TASK> Fixes: `d5e01266e7` ("leds: trigger: netdev: add additional specific link speed mode") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/fb5c8294-2a10-4bf5-8f10-3d2b77d2757e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:23 +01:00
Petr Pavlu	7d97646474	tracing: Fix a possible race when disabling buffered events commit c0591b1cccf708a47bc465c62436d669a4213323 upstream. Function trace_buffered_event_disable() is responsible for freeing pages backing buffered events and this process can run concurrently with trace_event_buffer_lock_reserve(). The following race is currently possible: * Function trace_buffered_event_disable() is called on CPU 0. It increments trace_buffered_event_cnt on each CPU and waits via synchronize_rcu() for each user of trace_buffered_event to complete. * After synchronize_rcu() is finished, function trace_buffered_event_disable() has the exclusive access to trace_buffered_event. All counters trace_buffered_event_cnt are at 1 and all pointers trace_buffered_event are still valid. * At this point, on a different CPU 1, the execution reaches trace_event_buffer_lock_reserve(). The function calls preempt_disable_notrace() and only now enters an RCU read-side critical section. The function proceeds and reads a still valid pointer from trace_buffered_event[CPU1] into the local variable "entry". However, it doesn't yet read trace_buffered_event_cnt[CPU1] which happens later. * Function trace_buffered_event_disable() continues. It frees trace_buffered_event[CPU1] and decrements trace_buffered_event_cnt[CPU1] back to 0. * Function trace_event_buffer_lock_reserve() continues. It reads and increments trace_buffered_event_cnt[CPU1] from 0 to 1. This makes it believe that it can use the "entry" that it already obtained but the pointer is now invalid and any access results in a use-after-free. Fix the problem by making a second synchronize_rcu() call after all trace_buffered_event values are set to NULL. This waits on all potential users in trace_event_buffer_lock_reserve() that still read a previous pointer from trace_buffered_event. Link: https://lore.kernel.org/all/20231127151248.7232-2-petr.pavlu@suse.com/ Link: https://lkml.kernel.org/r/20231205161736.19663-4-petr.pavlu@suse.com Cc: stable@vger.kernel.org Fixes: `0fc1b09ff1` ("tracing: Use temp buffer when filtering events") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:23 +01:00
Petr Pavlu	fc9fa702db	tracing: Fix incomplete locking when disabling buffered events commit 7fed14f7ac9cf5e38c693836fe4a874720141845 upstream. The following warning appears when using buffered events: [ 203.556451] WARNING: CPU: 53 PID: 10220 at kernel/trace/ring_buffer.c:3912 ring_buffer_discard_commit+0x2eb/0x420 [...] [ 203.670690] CPU: 53 PID: 10220 Comm: stress-ng-sysin Tainted: G E 6.7.0-rc2-default #4 56e6d0fcf5581e6e51eaaecbdaec2a2338c80f3a [ 203.670704] Hardware name: Intel Corp. GROVEPORT/GROVEPORT, BIOS GVPRCRB1.86B.0016.D04.1705030402 05/03/2017 [ 203.670709] RIP: 0010:ring_buffer_discard_commit+0x2eb/0x420 [ 203.735721] Code: 4c 8b 4a 50 48 8b 42 48 49 39 c1 0f 84 b3 00 00 00 49 83 e8 01 75 b1 48 8b 42 10 f0 ff 40 08 0f 0b e9 fc fe ff ff f0 ff 47 08 <0f> 0b e9 77 fd ff ff 48 8b 42 10 f0 ff 40 08 0f 0b e9 f5 fe ff ff [ 203.735734] RSP: 0018:ffffb4ae4f7b7d80 EFLAGS: 00010202 [ 203.735745] RAX: 0000000000000000 RBX: ffffb4ae4f7b7de0 RCX: ffff8ac10662c000 [ 203.735754] RDX: ffff8ac0c750be00 RSI: ffff8ac10662c000 RDI: ffff8ac0c004d400 [ 203.781832] RBP: ffff8ac0c039cea0 R08: 0000000000000000 R09: 0000000000000000 [ 203.781839] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 203.781842] R13: ffff8ac10662c000 R14: ffff8ac0c004d400 R15: ffff8ac10662c008 [ 203.781846] FS: 00007f4cd8a67740(0000) GS:ffff8ad798880000(0000) knlGS:0000000000000000 [ 203.781851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 203.781855] CR2: 0000559766a74028 CR3: 00000001804c4000 CR4: 00000000001506f0 [ 203.781862] Call Trace: [ 203.781870] <TASK> [ 203.851949] trace_event_buffer_commit+0x1ea/0x250 [ 203.851967] trace_event_raw_event_sys_enter+0x83/0xe0 [ 203.851983] syscall_trace_enter.isra.0+0x182/0x1a0 [ 203.851990] do_syscall_64+0x3a/0xe0 [ 203.852075] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 203.852090] RIP: 0033:0x7f4cd870fa77 [ 203.982920] Code: 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 43 0e 00 f7 d8 64 89 01 48 [ 203.982932] RSP: 002b:00007fff99717dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000089 [ 203.982942] RAX: ffffffffffffffda RBX: 0000558ea1d7b6f0 RCX: 00007f4cd870fa77 [ 203.982948] RDX: 0000000000000000 RSI: 00007fff99717de0 RDI: 0000558ea1d7b6f0 [ 203.982957] RBP: 00007fff99717de0 R08: 00007fff997180e0 R09: 00007fff997180e0 [ 203.982962] R10: 00007fff997180e0 R11: 0000000000000246 R12: 00007fff99717f40 [ 204.049239] R13: 00007fff99718590 R14: 0000558e9f2127a8 R15: 00007fff997180b0 [ 204.049256] </TASK> For instance, it can be triggered by running these two commands in parallel: $ while true; do echo hist:key=id.syscall:val=hitcount > \ /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger; done $ stress-ng --sysinfo $(nproc) The warning indicates that the current ring_buffer_per_cpu is not in the committing state. It happens because the active ring_buffer_event doesn't actually come from the ring_buffer_per_cpu but is allocated from trace_buffered_event. The bug is in function trace_buffered_event_disable() where the following normally happens: * The code invokes disable_trace_buffered_event() via smp_call_function_many() and follows it by synchronize_rcu(). This increments the per-CPU variable trace_buffered_event_cnt on each target CPU and grants trace_buffered_event_disable() the exclusive access to the per-CPU variable trace_buffered_event. * Maintenance is performed on trace_buffered_event, all per-CPU event buffers get freed. * The code invokes enable_trace_buffered_event() via smp_call_function_many(). This decrements trace_buffered_event_cnt and releases the access to trace_buffered_event. A problem is that smp_call_function_many() runs a given function on all target CPUs except on the current one. The following can then occur: * Task X executing trace_buffered_event_disable() runs on CPU 0. * The control reaches synchronize_rcu() and the task gets rescheduled on another CPU 1. * The RCU synchronization finishes. At this point, trace_buffered_event_disable() has the exclusive access to all trace_buffered_event variables except trace_buffered_event[CPU0] because trace_buffered_event_cnt[CPU0] is never incremented and if the buffer is currently unused, remains set to 0. * A different task Y is scheduled on CPU 0 and hits a trace event. The code in trace_event_buffer_lock_reserve() sees that trace_buffered_event_cnt[CPU0] is set to 0 and decides the use the buffer provided by trace_buffered_event[CPU0]. * Task X continues its execution in trace_buffered_event_disable(). The code incorrectly frees the event buffer pointed by trace_buffered_event[CPU0] and resets the variable to NULL. * Task Y writes event data to the now freed buffer and later detects the created inconsistency. The issue is observable since commit `dea499781a` ("tracing: Fix warning in trace_buffered_event_disable()") which moved the call of trace_buffered_event_disable() in __ftrace_event_enable_disable() earlier, prior to invoking call->class->reg(.. TRACE_REG_UNREGISTER ..). The underlying problem in trace_buffered_event_disable() is however present since the original implementation in commit `0fc1b09ff1` ("tracing: Use temp buffer when filtering events"). Fix the problem by replacing the two smp_call_function_many() calls with on_each_cpu_mask() which invokes a given callback on all CPUs. Link: https://lore.kernel.org/all/20231127151248.7232-2-petr.pavlu@suse.com/ Link: https://lkml.kernel.org/r/20231205161736.19663-2-petr.pavlu@suse.com Cc: stable@vger.kernel.org Fixes: `0fc1b09ff1` ("tracing: Use temp buffer when filtering events") Fixes: `dea499781a` ("tracing: Fix warning in trace_buffered_event_disable()") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:23 +01:00
Steven Rostedt (Google)	0486a1f9d9	tracing: Disable snapshot buffer when stopping instance tracers commit b538bf7d0ec11ca49f536dfda742a5f6db90a798 upstream. It use to be that only the top level instance had a snapshot buffer (for latency tracers like wakeup and irqsoff). When stopping a tracer in an instance would not disable the snapshot buffer. This could have some unintended consequences if the irqsoff tracer is enabled. Consolidate the tracing_start/stop() with tracing_start/stop_tr() so that all instances behave the same. The tracing_start/stop() functions will just call their respective tracing_start/stop_tr() with the global_array passed in. Link: https://lkml.kernel.org/r/20231205220011.041220035@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: `6d9b3fa5e7` ("tracing: Move tracing_max_latency into trace_array") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:23 +01:00
Steven Rostedt (Google)	12c48e88e5	tracing: Stop current tracer when resizing buffer commit d78ab792705c7be1b91243b2544d1a79406a2ad7 upstream. When the ring buffer is being resized, it can cause side effects to the running tracer. For instance, there's a race with irqsoff tracer that swaps individual per cpu buffers between the main buffer and the snapshot buffer. The resize operation modifies the main buffer and then the snapshot buffer. If a swap happens in between those two operations it will break the tracer. Simply stop the running tracer before resizing the buffers and enable it again when finished. Link: https://lkml.kernel.org/r/20231205220010.748996423@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: `3928a8a2d9` ("ftrace: make work with new ring buffer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:23 +01:00
Steven Rostedt (Google)	1741e17c39	tracing: Always update snapshot buffer size commit 7be76461f302ec05cbd62b90b2a05c64299ca01f upstream. It use to be that only the top level instance had a snapshot buffer (for latency tracers like wakeup and irqsoff). The update of the ring buffer size would check if the instance was the top level and if so, it would also update the snapshot buffer as it needs to be the same as the main buffer. Now that lower level instances also has a snapshot buffer, they too need to update their snapshot buffer sizes when the main buffer is changed, otherwise the following can be triggered: # cd /sys/kernel/tracing # echo 1500 > buffer_size_kb # mkdir instances/foo # echo irqsoff > instances/foo/current_tracer # echo 1000 > instances/foo/buffer_size_kb Produces: WARNING: CPU: 2 PID: 856 at kernel/trace/trace.c:1938 update_max_tr_single.part.0+0x27d/0x320 Which is: ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu); if (ret == -EBUSY) { [..] } WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY); <== here That's because ring_buffer_swap_cpu() has: int ret = -EINVAL; [..] /* At least make sure the two buffers are somewhat the same */ if (cpu_buffer_a->nr_pages != cpu_buffer_b->nr_pages) goto out; [..] out: return ret; } Instead, update all instances' snapshot buffer sizes when their main buffer size is updated. Link: https://lkml.kernel.org/r/20231205220010.454662151@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: `6d9b3fa5e7` ("tracing: Move tracing_max_latency into trace_array") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:22 +01:00
Heiko Carstens	f8f32f9126	checkstack: fix printed address commit ee34db3f271cea4d4252048617919c2caafe698b upstream. All addresses printed by checkstack have an extra incorrect 0 appended at the end. This was introduced with commit `677f1410e0` ("scripts/checkstack.pl: don't display $dre as different entity"): since then the address is taken from the line which contains the function name, instead of the line which contains stack consumption. E.g. on s390: 0000000000100a30 <do_one_initcall>: ... 100a44: e3 f0 ff 70 ff 71 lay %r15,-144(%r15) So the used regex which matches spaces and hexadecimal numbers to extract an address now matches a different substring. Subsequently replacing spaces with 0 appends a zero at the and, instead of replacing leading spaces. Fix this by using the proper regex, and simplify the code a bit. Link: https://lkml.kernel.org/r/20231120183719.2188479-2-hca@linux.ibm.com Fixes: `677f1410e0` ("scripts/checkstack.pl: don't display $dre as different entity") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Cc: Maninder Singh <maninder1.s@samsung.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Vaneet Narang <v.narang@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:22 +01:00
Tim Van Patten	9ec2d92673	cgroup_freezer: cgroup_freezing: Check if not frozen commit cff5f49d433fcd0063c8be7dd08fa5bf190c6c37 upstream. __thaw_task() was recently updated to warn if the task being thawed was part of a freezer cgroup that is still currently freezing: void __thaw_task(struct task_struct *p) { ... if (WARN_ON_ONCE(freezing(p))) goto unlock; This has exposed a bug in cgroup1 freezing where when CGROUP_FROZEN is asserted, the CGROUP_FREEZING bits are not also cleared at the same time. Meaning, when a cgroup is marked FROZEN it continues to be marked FREEZING as well. This causes the WARNING to trigger, because cgroup_freezing() thinks the cgroup is still freezing. There are two ways to fix this: 1. Whenever FROZEN is set, clear FREEZING for the cgroup and all children cgroups. 2. Update cgroup_freezing() to also verify that FROZEN is not set. This patch implements option (2), since it's smaller and more straightforward. Signed-off-by: Tim Van Patten <timvp@google.com> Tested-by: Mark Hasemeyer <markhas@chromium.org> Fixes: `f5d39b0208` ("freezer,sched: Rewrite core freezer logic") Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:22 +01:00
Ming Lei	39f603a262	lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly commit 0263f92fadbb9d294d5971ac57743f882c93b2b3 upstream. group_cpus_evenly() could be part of storage driver's error handler, such as nvme driver, when may happen during CPU hotplug, in which storage queue has to drain its pending IOs because all CPUs associated with the queue are offline and the queue is becoming inactive. And handling IO needs error handler to provide forward progress. Then deadlock is caused: 1) inside CPU hotplug handler, CPU hotplug lock is held, and blk-mq's handler is waiting for inflight IO 2) error handler is waiting for CPU hotplug lock 3) inflight IO can't be completed in blk-mq's CPU hotplug handler because error handling can't provide forward progress. Solve the deadlock by not holding CPU hotplug lock in group_cpus_evenly(), in which two stage spreads are taken: 1) the 1st stage is over all present CPUs; 2) the end stage is over all other CPUs. Turns out the two stage spread just needs consistent 'cpu_present_mask', and remove the CPU hotplug lock by storing it into one local cache. This way doesn't change correctness, because all CPUs are still covered. Link: https://lkml.kernel.org/r/20231120083559.285174-1-ming.lei@redhat.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Reported-by: Yi Zhang <yi.zhang@redhat.com> Reported-by: Guangwu Zhang <guazhang@redhat.com> Tested-by: Guangwu Zhang <guazhang@redhat.com> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <kbusch@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:22 +01:00

1 2 3 4 5 ...

1218854 Commits