linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-06 10:58:48 +09:00

Author	SHA1	Message	Date
Halil Pasic	faf44487df	s390/ism: fix concurrency management in ism_cmd() [ Upstream commit 897e8601b9cff1d054cdd53047f568b0e1995726 ] The s390x ISM device data sheet clearly states that only one request-response sequence is allowable per ISM function at any point in time. Unfortunately as of today the s390/ism driver in Linux does not honor that requirement. This patch aims to rectify that. This problem was discovered based on Aliaksei's bug report which states that for certain workloads the ISM functions end up entering error state (with PEC 2 as seen from the logs) after a while and as a consequence connections handled by the respective function break, and for future connection requests the ISM device is not considered -- given it is in a dysfunctional state. During further debugging PEC 3A was observed as well. A kernel message like [ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a is a reliable indicator of the stated function entering error state with PEC 2. Let me also point out that a kernel message like [ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery is a reliable indicator that the ISM function won't be auto-recovered because the ISM driver currently lacks support for it. On a technical level, without this synchronization, commands (inputs to the FW) may be partially or fully overwritten (corrupted) by another CPU trying to issue commands on the same function. There is hard evidence that this can lead to DMB token values being used as DMB IOVAs, leading to PEC 2 PCI events indicating invalid DMA. But this is only one of the failure modes imaginable. In theory even completely losing one command and executing another one twice and then trying to interpret the outputs as if the command we intended to execute was actually executed and not the other one is also possible. Frankly, I don't feel confident about providing an exhaustive list of possible consequences. Fixes: `684b89bc39` ("s390/ism: add device driver for internal shared memory") Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com> Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Douglas Anderson	42e42d64d8	drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe() [ Upstream commit 15a7ca747d9538c2ad8b0c81dd4c1261e0736c82 ] As reported by the kernel test robot, a recent patch introduced an unnecessary semicolon. Remove it. Fixes: 55e8ff842051 ("drm/bridge: ti-sn65dsi86: Add HPD for DisplayPort connector type") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506301704.0SBj6ply-lkp@intel.com/ Reviewed-by: Devarsh Thakkar <devarsht@ti.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250714130631.1.I1cfae3222e344a3b3c770d079ee6b6f7f3b5d636@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Marc Kleine-Budde	cf81a60a97	can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode [ Upstream commit c1f3f9797c1f44a762e6f5f72520b2e520537b52 ] Andrei Lalaev reported a NULL pointer deref when a CAN device is restarted from Bus Off and the driver does not implement the struct can_priv::do_set_mode callback. There are 2 code path that call struct can_priv::do_set_mode: - directly by a manual restart from the user space, via can_changelink() - delayed automatic restart after bus off (deactivated by default) To prevent the NULL pointer deference, refuse a manual restart or configure the automatic restart delay in can_changelink() and report the error via extack to user space. As an additional safety measure let can_restart() return an error if can_priv::do_set_mode is not set instead of dereferencing it unchecked. Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com> Closes: https://lore.kernel.org/all/20250714175520.307467-1-andrey.lalaev@gmail.com Fixes: `39549eef35` ("can: CAN Network device driver and Netlink interface") Link: https://patch.msgid.link/20250718-fix-nullptr-deref-do_set_mode-v1-1-0b520097bb96@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Marc Kleine-Budde	359492c202	can: dev: can_restart(): move debug message and stats after successful restart [ Upstream commit f0e0c809c0be05fe865b9ac128ef3ee35c276021 ] Move the debug message "restarted" and the CAN restart stats_after_ the successful restart of the CAN device, because the restart may fail. While there update the error message from printing the error number to printing symbolic error names. Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-4-91b5c1fd922c@pengutronix.de Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> [mkl: mention stats in subject and description, too] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Stable-dep-of: c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Marc Kleine-Budde	71a2dc442e	can: dev: can_restart(): reverse logic to remove need for goto [ Upstream commit 8f3ec204d340af183fb2bb21b8e797ac2ed012b2 ] Reverse the logic in the if statement and eliminate the need for a goto to simplify code readability. Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-3-91b5c1fd922c@pengutronix.de Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Stable-dep-of: c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Xiang Mei	0ede6ce4fc	net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class [ Upstream commit cf074eca0065bc5142e6004ae236bb35a2687fdf ] might_sleep could be trigger in the atomic context in qfq_delete_class. qfq_destroy_class was moved into atomic context locked by sch_tree_lock to avoid a race condition bug on qfq_aggregate. However, might_sleep could be triggered by qfq_destroy_class, which introduced sleeping in atomic context (path: qfq_destroy_class->qdisc_put->__qdisc_destroy->lockdep_unregister_key ->might_sleep). Considering the race is on the qfq_aggregate objects, keeping qfq_rm_from_agg in the lock but moving the left part out can solve this issue. Fixes: 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on qfq_aggregate") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/4a04e0cc-a64b-44e7-9213-2880ed641d77@sabinyo.mountain Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/20250717230128.159766-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Kito Xu (veritas501)	e4f1564c5b	net: appletalk: Fix use-after-free in AARP proxy probe [ Upstream commit 6c4a92d07b0850342d3becf2e608f805e972467c ] The AARP proxy‐probe routine (aarp_proxy_probe_network) sends a probe, releases the aarp_lock, sleeps, then re-acquires the lock. During that window an expire timer thread (__aarp_expire_timer) can remove and kfree() the same entry, leading to a use-after-free. race condition: cpu 0 \| cpu 1 atalk_sendmsg() \| atif_proxy_probe_device() aarp_send_ddp() \| aarp_proxy_probe_network() mod_timer() \| lock(aarp_lock) // LOCK!! timeout around 200ms \| alloc(aarp_entry) and then call \| proxies[hash] = aarp_entry aarp_expire_timeout() \| aarp_send_probe() \| unlock(aarp_lock) // UNLOCK!! lock(aarp_lock) // LOCK!! \| msleep(100); __aarp_expire_timer(&proxies[ct]) \| free(aarp_entry) \| unlock(aarp_lock) // UNLOCK!! \| \| lock(aarp_lock) // LOCK!! \| UAF aarp_entry !! ================================================================== BUG: KASAN: slab-use-after-free in aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493 Read of size 4 at addr ffff8880123aa360 by task repro/13278 CPU: 3 UID: 0 PID: 13278 Comm: repro Not tainted 6.15.2 #3 PREEMPT(full) Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc1/0x630 mm/kasan/report.c:521 kasan_report+0xca/0x100 mm/kasan/report.c:634 aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493 atif_proxy_probe_device net/appletalk/ddp.c:332 [inline] atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857 atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818 sock_do_ioctl+0xdc/0x260 net/socket.c:1190 sock_ioctl+0x239/0x6a0 net/socket.c:1311 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:906 [inline] __se_sys_ioctl fs/ioctl.c:892 [inline] __x64_sys_ioctl+0x194/0x200 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcb/0x250 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> Allocated: aarp_alloc net/appletalk/aarp.c:382 [inline] aarp_proxy_probe_network+0xd8/0x630 net/appletalk/aarp.c:468 atif_proxy_probe_device net/appletalk/ddp.c:332 [inline] atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857 atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818 Freed: kfree+0x148/0x4d0 mm/slub.c:4841 __aarp_expire net/appletalk/aarp.c:90 [inline] __aarp_expire_timer net/appletalk/aarp.c:261 [inline] aarp_expire_timeout+0x480/0x6e0 net/appletalk/aarp.c:317 The buggy address belongs to the object at ffff8880123aa300 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 96 bytes inside of freed 192-byte region [ffff8880123aa300, ffff8880123aa3c0) Memory state around the buggy address: ffff8880123aa200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8880123aa280: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc >ffff8880123aa300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880123aa380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff8880123aa400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com> Link: https://patch.msgid.link/20250717012843.880423-1-hxzene@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Jamie Bainbridge	d989748e9d	i40e: When removing VF MAC filters, only check PF-set MAC [ Upstream commit 5a0df02999dbe838c3feed54b1d59e9445f68b89 ] When the PF is processing an Admin Queue message to delete a VF's MACs from the MAC filter, we currently check if the PF set the MAC and if the VF is trusted. This results in undesirable behaviour, where if a trusted VF with a PF-set MAC sets itself down (which sends an AQ message to delete the VF's MAC filters) then the VF MAC is erased from the interface. This results in the VF losing its PF-set MAC which should not happen. There is no need to check for trust at all, because an untrusted VF cannot change its own MAC. The only check needed is whether the PF set the MAC. If the PF set the MAC, then don't erase the MAC on link-down. Resolve this by changing the deletion check only for PF-set MAC. (the out-of-tree driver has also intentionally removed the check for VF trust here with OOT driver version 2.26.8, this changes the Linux kernel driver behaviour and comment to match the OOT driver behaviour) Fixes: ea2a1cfc3b201 ("i40e: Fix VF MAC filter removal") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Dennis Chen	0c399fd6ac	i40e: report VF tx_dropped with tx_errors instead of tx_discards [ Upstream commit 50b2af451597ca6eefe9d4543f8bbf8de8aa00e7 ] Currently the tx_dropped field in VF stats is not updated correctly when reading stats from the PF. This is because it reads from i40e_eth_stats.tx_discards which seems to be unused for per VSI stats, as it is not updated by i40e_update_eth_stats() and the corresponding register, GLV_TDPC, is not implemented[1]. Use i40e_eth_stats.tx_errors instead, which is actually updated by i40e_update_eth_stats() by reading from GLV_TEPC. To test, create a VF and try to send bad packets through it: $ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs $ cat test.py from scapy.all import * vlan_pkt = Ether(dst="ff:ff:ff:ff:ff:ff") / Dot1Q(vlan=999) / IP(dst="192.168.0.1") / ICMP() ttl_pkt = IP(dst="8.8.8.8", ttl=0) / ICMP() print("Send packet with bad VLAN tag") sendp(vlan_pkt, iface="enp2s0f0v0") print("Send packet with TTL=0") sendp(ttl_pkt, iface="enp2s0f0v0") $ ip -s link show dev enp2s0f0 16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 0 0 0 0 0 0 TX: bytes packets errors dropped carrier collsns 0 0 0 0 0 0 vf 0 link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off RX: bytes packets mcast bcast dropped 0 0 0 0 0 TX: bytes packets dropped 0 0 0 $ python test.py Send packet with bad VLAN tag . Sent 1 packets. Send packet with TTL=0 . Sent 1 packets. $ ip -s link show dev enp2s0f0 16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 0 0 0 0 0 0 TX: bytes packets errors dropped carrier collsns 0 0 0 0 0 0 vf 0 link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off RX: bytes packets mcast bcast dropped 0 0 0 0 0 TX: bytes packets dropped 0 0 0 A packet with non-existent VLAN tag and a packet with TTL = 0 are sent, but tx_dropped is not incremented. After patch: $ ip -s link show dev enp2s0f0 19: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 0 0 0 0 0 0 TX: bytes packets errors dropped carrier collsns 0 0 0 0 0 0 vf 0 link/ether 4a:b7:3d:37:f7:56 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off RX: bytes packets mcast bcast dropped 0 0 0 0 0 TX: bytes packets dropped 0 0 2 Fixes: `dc645daef9` ("i40e: implement VF stats NDO") Signed-off-by: Dennis Chen <dechen@redhat.com> Link: https://www.intel.com/content/www/us/en/content-details/596333/intel-ethernet-controller-x710-tm4-at2-carlsville-datasheet.html Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Yajun Deng	a05b388308	i40e: Add rx_missed_errors for buffer exhaustion [ Upstream commit 5337d294973331660e84e41836a54014de22e5b0 ] As the comment in struct rtnl_link_stats64, rx_dropped should not include packets dropped by the device due to buffer exhaustion. They are counted in rx_missed_errors, procfs folds those two counters together. Add rx_missed_errors for buffer exhaustion, rx_missed_errors corresponds to rx_discards, rx_dropped corresponds to rx_discards_other. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Stable-dep-of: 50b2af451597 ("i40e: report VF tx_dropped with tx_errors instead of tx_discards") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:29 +01:00
Shahar Shitrit	e4ca597d31	net/mlx5: E-Switch, Fix peer miss rules to use peer eswitch [ Upstream commit 5b4c56ad4da0aa00b258ab50b1f5775b7d3108c7 ] In the original design, it is assumed local and peer eswitches have the same number of vfs. However, in new firmware, local and peer eswitches can have different number of vfs configured by mlxconfig. In such configuration, it is incorrect to derive the number of vfs from the local device's eswitch. Fix this by updating the peer miss rules add and delete functions to use the peer device's eswitch and vf count instead of the local device's information, ensuring correct behavior regardless of vf configuration differences. Fixes: `ac004b8321` ("net/mlx5e: E-Switch, Add peer miss rules") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752753970-261832-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Chiara Meiohas	c3d8a80d95	net/mlx5: Fix memory leak in cmd_exec() [ Upstream commit 3afa3ae3db52e3c216d77bd5907a5a86833806cc ] If cmd_exec() is called with callback and mlx5_cmd_invoke() returns an error, resources allocated in cmd_exec() will not be freed. Fix the code to release the resources if mlx5_cmd_invoke() returns an error. Fixes: `f086470122` ("net/mlx5: cmdif, Return value improvements") Reported-by: Alex Tereshkin <atereshkin@nvidia.com> Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752753970-261832-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Eyal Birger	bfebdb8549	xfrm: interface: fix use-after-free after changing collect_md xfrm interface [ Upstream commit a90b2a1aaacbcf0f91d7e4868ad6c51c5dee814b ] collect_md property on xfrm interfaces can only be set on device creation, thus xfrmi_changelink() should fail when called on such interfaces. The check to enforce this was done only in the case where the xi was returned from xfrmi_locate() which doesn't look for the collect_md interface, and thus the validation was never reached. Calling changelink would thus errornously place the special interface xi in the xfrmi_net->xfrmi hash, but since it also exists in the xfrmi_net->collect_md_xfrmi pointer it would lead to a double free when the net namespace was taken down [1]. Change the check to use the xi from netdev_priv which is available earlier in the function to prevent changes in xfrm collect_md interfaces. [1] resulting oops: [ 8.516540] kernel BUG at net/core/dev.c:12029! [ 8.516552] Oops: invalid opcode: 0000 [#1] SMP NOPTI [ 8.516559] CPU: 0 UID: 0 PID: 12 Comm: kworker/u80:0 Not tainted 6.15.0-virtme #5 PREEMPT(voluntary) [ 8.516565] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 8.516569] Workqueue: netns cleanup_net [ 8.516579] RIP: 0010:unregister_netdevice_many_notify+0x101/0xab0 [ 8.516590] Code: 90 0f 0b 90 48 8b b0 78 01 00 00 48 8b 90 80 01 00 00 48 89 56 08 48 89 32 4c 89 80 78 01 00 00 48 89 b8 80 01 00 00 eb ac 90 <0f> 0b 48 8b 45 00 4c 8d a0 88 fe ff ff 48 39 c5 74 5c 41 80 bc 24 [ 8.516593] RSP: 0018:ffffa93b8006bd30 EFLAGS: 00010206 [ 8.516598] RAX: ffff98fe4226e000 RBX: ffffa93b8006bd58 RCX: ffffa93b8006bc60 [ 8.516601] RDX: 0000000000000004 RSI: 0000000000000000 RDI: dead000000000122 [ 8.516603] RBP: ffffa93b8006bdd8 R08: dead000000000100 R09: ffff98fe4133c100 [ 8.516605] R10: 0000000000000000 R11: 00000000000003d2 R12: ffffa93b8006be00 [ 8.516608] R13: ffffffff96c1a510 R14: ffffffff96c1a510 R15: ffffa93b8006be00 [ 8.516615] FS: 0000000000000000(0000) GS:ffff98fee73b7000(0000) knlGS:0000000000000000 [ 8.516619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.516622] CR2: 00007fcd2abd0700 CR3: 000000003aa40000 CR4: 0000000000752ef0 [ 8.516625] PKRU: 55555554 [ 8.516627] Call Trace: [ 8.516632] <TASK> [ 8.516635] ? rtnl_is_locked+0x15/0x20 [ 8.516641] ? unregister_netdevice_queue+0x29/0xf0 [ 8.516650] ops_undo_list+0x1f2/0x220 [ 8.516659] cleanup_net+0x1ad/0x2e0 [ 8.516664] process_one_work+0x160/0x380 [ 8.516673] worker_thread+0x2aa/0x3c0 [ 8.516679] ? __pfx_worker_thread+0x10/0x10 [ 8.516686] kthread+0xfb/0x200 [ 8.516690] ? __pfx_kthread+0x10/0x10 [ 8.516693] ? __pfx_kthread+0x10/0x10 [ 8.516697] ret_from_fork+0x82/0xf0 [ 8.516705] ? __pfx_kthread+0x10/0x10 [ 8.516709] ret_from_fork_asm+0x1a/0x30 [ 8.516718] </TASK> Fixes: `abc340b38b` ("xfrm: interface: support collect metadata mode") Reported-by: Lonial Con <kongln9170@gmail.com> Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Stefan Wahren	2e18442d22	staging: vchiq_arm: Make vchiq_shutdown never fail [ Upstream commit f2b8ebfb867011ddbefbdf7b04ad62626cbc2afd ] Most of the users of vchiq_shutdown ignore the return value, which is bad because this could lead to resource leaks. So instead of changing all calls to vchiq_shutdown, it's easier to make vchiq_shutdown never fail. Fixes: `71bad7f086` ("staging: add bcm2708 vchiq driver") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20250715161108.3411-4-wahrenst@gmx.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Torsten Hilbrich	6758f73f17	platform/x86: Fix initialization order for firmware_attributes_class [ Upstream commit 2bfe3ae1aa45f8b61cb0dc462114fd0c9636ad32 ] The think-lmi driver uses the firwmare_attributes_class. But this class is registered after think-lmi, causing the "think-lmi" directory in "/sys/class/firmware-attributes" to be missing when the driver is compiled as builtin. Fixes: 55922403807a ("platform/x86: think-lmi: Directly use firmware_attributes_class") Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Link: https://lore.kernel.org/r/7dce5f7f-c348-4350-ac53-d14a8e1e8034@secunet.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Nuno Das Neves	d72c97b616	x86/hyperv: Fix usage of cpu_online_mask to get valid cpu [ Upstream commit bb169f80ed5a156ec3405e0e49c6b8e9ae264718 ] Accessing cpu_online_mask here is problematic because the cpus read lock is not held in this context. However, cpu_online_mask isn't needed here since the effective affinity mask is guaranteed to be valid in this callback. So, just use cpumask_first() to get the cpu instead of ANDing it with cpus_online_mask unnecessarily. Fixes: `e39397d1fd` ("x86/hyperv: implement an MSI domain for root partition") Reported-by: Michael Kelley <mhklinux@outlook.com> Closes: https://lore.kernel.org/linux-hyperv/SN6PR02MB4157639630F8AD2D8FD8F52FD475A@SN6PR02MB4157.namprd02.prod.outlook.com/ Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Abdun Nihaal	0915f1856c	regmap: fix potential memory leak of regmap_bus [ Upstream commit c871c199accb39d0f4cb941ad0dccabfc21e9214 ] When __regmap_init() is called from __regmap_init_i2c() and __regmap_init_spi() (and their devm versions), the bus argument obtained from regmap_get_i2c_bus() and regmap_get_spi_bus(), may be allocated using kmemdup() to support quirks. In those cases, the bus->free_on_exit field is set to true. However, inside __regmap_init(), buf is not freed on any error path. This could lead to a memory leak of regmap_bus when __regmap_init() fails. Fix that by freeing bus on error path when free_on_exit is set. Fixes: `ea030ca688` ("regmap-i2c: Set regmap max raw r/w from quirks") Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com> Link: https://patch.msgid.link/20250626172823.18725-1-abdun.nihaal@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
David Lechner	881f3066fd	iio: adc: ad7949: use spi_is_bpw_supported() [ Upstream commit 7b86482632788acd48d7b9ee1867f5ad3a32ccbb ] Use spi_is_bpw_supported() instead of directly accessing spi->controller ->bits_per_word_mask. bits_per_word_mask may be 0, which implies that 8-bits-per-word is supported. spi_is_bpw_supported() takes this into account while spi_ctrl_mask == SPI_BPW_MASK(8) does not. Fixes: `0b2a740b42` ("iio: adc: ad7949: enable use with non 14/16-bit controllers") Closes: https://lore.kernel.org/linux-spi/c8b8a963-6cef-4c9b-bfef-dab2b7bd0b0f@sirena.org.uk/ Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://patch.msgid.link/20250611-iio-adc-ad7949-use-spi_is_bpw_supported-v1-1-c4e15bfd326e@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Xilin Wu	6a85c96e61	interconnect: qcom: sc7280: Add missing num_links to xm_pcie3_1 node [ Upstream commit 886a94f008dd1a1702ee66dd035c266f70fd9e90 ] This allows adding interconnect paths for PCIe 1 in device tree later. Fixes: `46bdcac533` ("interconnect: qcom: Add SC7280 interconnect provider driver") Signed-off-by: Xilin Wu <sophon@radxa.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250613-sc7280-icc-pcie1-fix-v1-1-0b09813e3b09@radxa.com Signed-off-by: Georgi Djakov <djakov@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Maor Gottlieb	fb93e41ecc	RDMA/core: Rate limit GID cache warning messages [ Upstream commit 333e4d79316c9ed5877d7aac8b8ed22efc74e96d ] The GID cache warning messages can flood the kernel log when there are multiple failed attempts to add GIDs. This can happen when creating many virtual interfaces without having enough space for their GIDs in the GID table. Change pr_warn to pr_warn_ratelimited to prevent log flooding while still maintaining visibility of the issue. Link: https://patch.msgid.link/r/fd45ed4a1078e743f498b234c3ae816610ba1b18.1750062357.git.leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Alessandro Carminati	233d3c54c9	regulator: core: fix NULL dereference on unbind due to stale coupling data [ Upstream commit ca46946a482238b0cdea459fb82fc837fb36260e ] Failing to reset coupling_desc.n_coupled after freeing coupled_rdevs can lead to NULL pointer dereference when regulators are accessed post-unbind. This can happen during runtime PM or other regulator operations that rely on coupling metadata. For example, on ridesx4, unbinding the 'reg-dummy' platform device triggers a panic in regulator_lock_recursive() due to stale coupling state. Ensure n_coupled is set to 0 to prevent access to invalid pointers. Signed-off-by: Alessandro Carminati <acarmina@redhat.com> Link: https://patch.msgid.link/20250626083809.314842-1-acarmina@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Laurent Vivier	0e8c65939b	virtio_ring: Fix error reporting in virtqueue_resize [ Upstream commit 45ebc7e6c125ce93d2ddf82cd5bea20121bb0258 ] The virtqueue_resize() function was not correctly propagating error codes from its internal resize helper functions, specifically virtqueue_resize_packet() and virtqueue_resize_split(). If these helpers returned an error, but the subsequent call to virtqueue_enable_after_reset() succeeded, the original error from the resize operation would be masked. Consequently, virtqueue_resize() could incorrectly report success to its caller despite an underlying resize failure. This change restores the original code behavior: if (vdev->config->enable_vq_after_reset(_vq)) return -EBUSY; return err; Fix: commit `ad48d53b5b` ("virtio_ring: separate the logic of reset/enable from virtqueue_resize") Cc: xuanzhuo@linux.alibaba.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250521092236.661410-2-lvivier@redhat.com Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-01 09:47:28 +01:00
Fabrice Gasnier	fa53beab47	Input: gpio-keys - fix a sleep while atomic with PREEMPT_RT commit f4a8f561d08e39f7833d4a278ebfb12a41eef15f upstream. When enabling PREEMPT_RT, the gpio_keys_irq_timer() callback runs in hard irq context, but the input_event() takes a spin_lock, which isn't allowed there as it is converted to a rt_spin_lock(). [ 4054.289999] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 [ 4054.290028] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0 ... [ 4054.290195] __might_resched+0x13c/0x1f4 [ 4054.290209] rt_spin_lock+0x54/0x11c [ 4054.290219] input_event+0x48/0x80 [ 4054.290230] gpio_keys_irq_timer+0x4c/0x78 [ 4054.290243] __hrtimer_run_queues+0x1a4/0x438 [ 4054.290257] hrtimer_interrupt+0xe4/0x240 [ 4054.290269] arch_timer_handler_phys+0x2c/0x44 [ 4054.290283] handle_percpu_devid_irq+0x8c/0x14c [ 4054.290297] handle_irq_desc+0x40/0x58 [ 4054.290307] generic_handle_domain_irq+0x1c/0x28 [ 4054.290316] gic_handle_irq+0x44/0xcc Considering the gpio_keys_irq_isr() can run in any context, e.g. it can be threaded, it seems there's no point in requesting the timer isr to run in hard irq context. Relax the hrtimer not to use the hard context. Fixes: `019002f20c` ("Input: gpio-keys - use hrtimer for release timer") Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Signed-off-by: Gatien Chevallier <gatien.chevallier@foss.st.com> Link: https://lore.kernel.org/r/20250528-gpio_keys_preempt_rt-v2-1-3fc55a9c3619@foss.st.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> [ adjusted context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-08-01 09:47:28 +01:00
Greg Kroah-Hartman	dbcb8d8e41	Linux 6.6.100 Link: https://lore.kernel.org/r/20250722134333.375479548@linuxfoundation.org Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Hardik Garg <hargar@linux.microsoft.com> Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:22 +02:00
Manuel Andreas	3ee59c38ae	KVM: x86/xen: Fix cleanup logic in emulation of Xen schedop poll hypercalls commit 5a53249d149f48b558368c5338b9921b76a12f8c upstream. kvm_xen_schedop_poll does a kmalloc_array() when a VM polls the host for more than one event channel potr (nr_ports > 1). After the kmalloc_array(), the error paths need to go through the "out" label, but the call to kvm_read_guest_virt() does not. Fixes: `92c58965e9` ("KVM: x86/xen: Use kvm_read_guest_virt() instead of open-coding it badly") Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Manuel Andreas <manuel.andreas@tum.de> [Adjusted commit message. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:22 +02:00
Michael C. Pratt	48e8791843	nvmem: layouts: u-boot-env: remove crc32 endianness conversion commit 2d7521aa26ec2dc8b877bb2d1f2611a2df49a3cf upstream. On 11 Oct 2022, it was reported that the crc32 verification of the u-boot environment failed only on big-endian systems for the u-boot-env nvmem layout driver with the following error. Invalid calculated CRC32: 0x88cd6f09 (expected: 0x096fcd88) This problem has been present since the driver was introduced, and before it was made into a layout driver. The suggested fix at the time was to use further endianness conversion macros in order to have both the stored and calculated crc32 values to compare always represented in the system's endianness. This was not accepted due to sparse warnings and some disagreement on how to handle the situation. Later on in a newer revision of the patch, it was proposed to use cpu_to_le32() for both values to compare instead of le32_to_cpu() and store the values as __le32 type to remove compilation errors. The necessity of this is based on the assumption that the use of crc32() requires endianness conversion because the algorithm uses little-endian, however, this does not prove to be the case and the issue is unrelated. Upon inspecting the current kernel code, there already is an existing use of le32_to_cpu() in this driver, which suggests there already is special handling for big-endian systems, however, it is big-endian systems that have the problem. This, being the only functional difference between architectures in the driver combined with the fact that the suggested fix was to use the exact same endianness conversion for the values brings up the possibility that it was not necessary to begin with, as the same endianness conversion for two values expected to be the same is expected to be equivalent to no conversion at all. After inspecting the u-boot environment of devices of both endianness and trying to remove the existing endianness conversion, the problem is resolved in an equivalent way as the other suggested fixes. Ultimately, it seems that u-boot is agnostic to endianness at least for the purpose of environment variables. In other words, u-boot reads and writes the stored crc32 value with the same endianness that the crc32 value is calculated with in whichever endianness a certain architecture runs on. Therefore, the u-boot-env driver does not need to convert endianness. Remove the usage of endianness macros in the u-boot-env driver, and change the type of local variables to maintain the same return type. If there is a special situation in the case of endianness, it would be a corner case and should be handled by a unique "compatible". Even though it is not necessary to use endianness conversion macros here, it may be useful to use them in the future for consistent error printing. Fixes: `d5542923f2` ("nvmem: add driver handling U-Boot environment variables") Reported-by: INAGAKI Hiroshi <musashino.open@gmail.com> Link: https://lore.kernel.org/all/20221011024928.1807-1-musashino.open@gmail.com Cc: stable@vger.kernel.org Signed-off-by: "Michael C. Pratt" <mcpratt@pm.me> Signed-off-by: Srinivas Kandagatla <srini@kernel.org> Link: https://lore.kernel.org/r/20250716144210.4804-1-srini@kernel.org [ applied changes to drivers/nvmem/u-boot-env.c before code was moved to drivers/nvmem/layouts/u-boot-env.c ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:22 +02:00
Johan Hovold	35542cbe66	i2c: omap: fix deprecated of_property_read_bool() use commit e66b0a8f048bc8590eb1047480f946898a3f80c9 upstream. Using of_property_read_bool() for non-boolean properties is deprecated and results in a warning during runtime since commit c141ecc3cecd ("of: Warn when of_property_read_bool() is used on non-boolean properties"). Fixes: b6ef830c60b6 ("i2c: omap: Add support for setting mux") Cc: Jayesh Choudhary <j-choudhary@ti.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com> Link: https://lore.kernel.org/r/20250415075230.16235-1-johan+linaro@kernel.org Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:22 +02:00
Shung-Hsi Yu	056b65a02e	Revert "selftests/bpf: dummy_st_ops should reject 0 for non-nullable params" This reverts commit `e7d193073a` which is commit 6a2d30d3c5bf9f088dcfd5f3746b04d84f2fab83 upstream. The dummy_st_ops/dummy_sleepable_reject_null test requires commit 980ca8ceeae6 ("bpf: check bpf_dummy_struct_ops program params for test runs"), which in turn depends on "Support PTR_MAYBE_NULL for struct_ops arguments" series (see link below), neither are backported to stable 6.6. Link: https://lore.kernel.org/all/20240209023750.1153905-1-thinker.li@gmail.com/ Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:22 +02:00
Shung-Hsi Yu	c148b72828	Revert "selftests/bpf: adjust dummy_st_ops_success to detect additional error" This reverts commit `264451a364` which is commit 3b3b84aacb4420226576c9732e7b539ca7b79633 upstream. The updated dummy_st_ops test requires commit 1479eaff1f16 ("bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable"), which in turn depends on "Support PTR_MAYBE_NULL for struct_ops arguments" series (see link below), neither are backported to stable 6.6. Without them the kernel simply panics from null pointer dereference half way through running BPF selftests. #68/1 deny_namespace/unpriv_userns_create_no_bpf:OK #68/2 deny_namespace/userns_create_bpf:OK #68 deny_namespace:OK [ 26.829153] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 26.831136] #PF: supervisor read access in kernel mode [ 26.832635] #PF: error_code(0x0000) - not-present page [ 26.833999] PGD 0 P4D 0 [ 26.834771] Oops: 0000 [#1] PREEMPT SMP PTI [ 26.835997] CPU: 2 PID: 119 Comm: test_progs Tainted: G OE 6.6.66-00003-gd80551078e71 #3 [ 26.838774] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 26.841152] RIP: 0010:bpf_prog_8ee9cbe7c9b5a50f_test_1+0x17/0x24 [ 26.842877] Code: 00 00 00 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 66 90 55 48 89 e5 f3 0f 1e fa 48 8b 7f 00 <8b> 47 00 be 5a 00 00 00 89 77 00 c9 c3 cc cc cc cc cc cc cc cc c0 [ 26.847953] RSP: 0018:ffff9e6b803b7d88 EFLAGS: 00010202 [ 26.849425] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 2845e103d7dffb60 [ 26.851483] RDX: 0000000000000000 RSI: 0000000084d09025 RDI: 0000000000000000 [ 26.853508] RBP: ffff9e6b803b7d88 R08: 0000000000000001 R09: 0000000000000000 [ 26.855670] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9754c0b5f700 [ 26.857824] R13: ffff9754c09cc800 R14: ffff9754c0b5f680 R15: ffff9754c0b5f760 [ 26.859741] FS: 00007f77dee12740(0000) GS:ffff9754fbc80000(0000) knlGS:0000000000000000 [ 26.862087] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 26.863705] CR2: 0000000000000000 CR3: 00000001020e6003 CR4: 0000000000170ee0 [ 26.865689] Call Trace: [ 26.866407] <TASK> [ 26.866982] ? __die+0x24/0x70 [ 26.867774] ? page_fault_oops+0x15b/0x450 [ 26.868882] ? search_bpf_extables+0xb0/0x160 [ 26.870076] ? fixup_exception+0x26/0x330 [ 26.871214] ? exc_page_fault+0x64/0x190 [ 26.872293] ? asm_exc_page_fault+0x26/0x30 [ 26.873352] ? bpf_prog_8ee9cbe7c9b5a50f_test_1+0x17/0x24 [ 26.874705] ? __bpf_prog_enter+0x3f/0xc0 [ 26.875718] ? bpf_struct_ops_test_run+0x1b8/0x2c0 [ 26.876942] ? __sys_bpf+0xc4e/0x2c30 [ 26.877898] ? __x64_sys_bpf+0x20/0x30 [ 26.878812] ? do_syscall_64+0x37/0x90 [ 26.879704] ? entry_SYSCALL_64_after_hwframe+0x78/0xe2 [ 26.880918] </TASK> [ 26.881409] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod(OE)] [ 26.883095] CR2: 0000000000000000 [ 26.883934] ---[ end trace 0000000000000000 ]--- [ 26.885099] RIP: 0010:bpf_prog_8ee9cbe7c9b5a50f_test_1+0x17/0x24 [ 26.886452] Code: 00 00 00 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 66 90 55 48 89 e5 f3 0f 1e fa 48 8b 7f 00 <8b> 47 00 be 5a 00 00 00 89 77 00 c9 c3 cc cc cc cc cc cc cc cc c0 [ 26.890379] RSP: 0018:ffff9e6b803b7d88 EFLAGS: 00010202 [ 26.891450] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 2845e103d7dffb60 [ 26.892779] RDX: 0000000000000000 RSI: 0000000084d09025 RDI: 0000000000000000 [ 26.894254] RBP: ffff9e6b803b7d88 R08: 0000000000000001 R09: 0000000000000000 [ 26.895630] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9754c0b5f700 [ 26.897008] R13: ffff9754c09cc800 R14: ffff9754c0b5f680 R15: ffff9754c0b5f760 [ 26.898337] FS: 00007f77dee12740(0000) GS:ffff9754fbc80000(0000) knlGS:0000000000000000 [ 26.899972] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 26.901076] CR2: 0000000000000000 CR3: 00000001020e6003 CR4: 0000000000170ee0 [ 26.902336] Kernel panic - not syncing: Fatal exception [ 26.903639] Kernel Offset: 0x36000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 26.905693] ---[ end Kernel panic - not syncing: Fatal exception ]--- Link: https://lore.kernel.org/all/20240209023750.1153905-1-thinker.li@gmail.com/ Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:22 +02:00
Arun Raghavan	b9e50a5169	ASoC: fsl_sai: Force a software reset when starting in consumer mode commit dc78f7e59169d3f0e6c3c95d23dc8e55e95741e2 upstream. On an imx8mm platform with an external clock provider, when running the receiver (arecord) and triggering an xrun with xrun_injection, we see a channel swap/offset. This happens sometimes when running only the receiver, but occurs reliably if a transmitter (aplay) is also concurrently running. It seems that the SAI loses track of frame sync during the trigger stop -> trigger start cycle that occurs during an xrun. Doing just a FIFO reset in this case does not suffice, and only a software reset seems to get it back on track. This looks like the same h/w bug that is already handled for the producer case, so we now do the reset unconditionally on config disable. Signed-off-by: Arun Raghavan <arun@asymptotic.io> Reported-by: Pieterjan Camerlynck <p.camerlynck@televic.com> Fixes: `3e3f8bd569` ("ASoC: fsl_sai: fix no frame clk in master mode") Cc: stable@vger.kernel.org Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://patch.msgid.link/20250626130858.163825-1-arun@arunraghavan.net Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Martin Blumenstingl	8f2852c1d7	regulator: pwm-regulator: Manage boot-on with disabled PWM channels commit b3cbdcc191819b75c04178142e2d0d4c668f68c0 upstream. Odroid-C1 uses a Monolithic Power Systems MP2161 controlled via PWM for the VDDEE voltage supply of the Meson8b SoC. Commit `6b9352f3f8` ("pwm: meson: modify and simplify calculation in meson_pwm_get_state") results in my Odroid-C1 crashing with memory corruption in many different places (seemingly at random). It turns out that this is due to a currently not supported corner case. The VDDEE regulator can generate between 860mV (duty cycle of ~91%) and 1140mV (duty cycle of 0%). We consider it to be enabled by the bootloader (which is why it has the regulator-boot-on flag in .dts) as well as being always-on (which is why it has the regulator-always-on flag in .dts) because the VDDEE voltage is generally required for the Meson8b SoC to work. The public S805 datasheet [0] states on page 17 (where "A5" refers to the Cortex-A5 CPU cores): [...] So if EE domains is shut off, A5 memory is also shut off. That does not matter. Before EE power domain is shut off, A5 should be shut off at first. It turns out that at least some bootloader versions are keeping the PWM output disabled. This is not a problem due to the specific design of the regulator: when the PWM output is disabled the output pin is pulled LOW, effectively achieving a 0% duty cycle (which in return means that VDDEE voltage is at 1140mV). The problem comes when the pwm-regulator driver tries to initialize the PWM output. To do so it reads the current state from the hardware, which is: period: 3666ns duty cycle: 3333ns (= ~91%) enabled: false Then those values are translated using the continuous voltage range to 860mV. Later, when the regulator is being enabled (either by the regulator core due to the always-on flag or first consumer - in this case the lima driver for the Mali-450 GPU) the pwm-regulator driver tries to keep the voltage (at 860mV) and just enable the PWM output. This is when things start to go wrong as the typical voltage used for VDDEE is 1100mV. Commit `6b9352f3f8` ("pwm: meson: modify and simplify calculation in meson_pwm_get_state") triggers above condition as before that change period and duty cycle were both at 0. Since the change to the pwm-meson driver is considered correct the solution is to be found in the pwm-regulator driver. Update the duty cycle during driver probe if the regulator is flagged as boot-on so that a call to pwm_regulator_enable() (by the regulator core during initialization of a regulator flagged with boot-on) without any preceding call to pwm_regulator_set_voltage() does not change the output voltage. [0] https://dn.odroid.com/S805/Datasheet/S805_Datasheet%20V0.8%2020150126.pdf Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://msgid.link/r/20240113224628.377993-4-martin.blumenstingl@googlemail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Martin Blumenstingl	cad3ec23e3	regulator: pwm-regulator: Calculate the output voltage for disabled PWMs commit 6a7d11efd6915d80a025f2a0be4ae09d797b91ec upstream. If a PWM output is disabled then it's voltage has to be calculated based on a zero duty cycle (for normal polarity) or duty cycle being equal to the PWM period (for inverted polarity). Add support for this to pwm_regulator_get_voltage(). Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://msgid.link/r/20240113224628.377993-3-martin.blumenstingl@googlemail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Christophe JAILLET	7e5ec0059e	i2c: omap: Handle omap_i2c_init() errors in omap_i2c_probe() commit a9503a2ecd95e23d7243bcde7138192de8c1c281 upstream. omap_i2c_init() can fail. Handle this error in omap_i2c_probe(). Fixes: `010d442c4a` ("i2c: New bus driver for TI OMAP boards") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: <stable@vger.kernel.org> # v2.6.19+ Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/565311abf9bafd7291ca82bcecb48c1fac1e727b.1751701715.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Christophe JAILLET	a7b84035ba	i2c: omap: Fix an error handling path in omap_i2c_probe() commit 666c23af755dccca8c25b5d5200ca28153c69a05 upstream. If an error occurs after calling mux_state_select(), mux_state_deselect() should be called as already done in the remove function. Fixes: b6ef830c60b6 ("i2c: omap: Add support for setting mux") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/998542981b6d2435c057dd8b9fe71743927babab.1749913149.git.christophe.jaillet@wanadoo.fr Stable-dep-of: a9503a2ecd95 ("i2c: omap: Handle omap_i2c_init() errors in omap_i2c_probe()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Jayesh Choudhary	caa86f8b6c	i2c: omap: Add support for setting mux commit b6ef830c60b6f4adfb72d0780b4363df3a1feb9c upstream. Some SoCs require muxes in the routing for SDA and SCL lines. Therefore, add support for setting the mux by reading the mux-states property from the dt-node. Signed-off-by: Jayesh Choudhary <j-choudhary@ti.com> Link: https://lore.kernel.org/r/20250318103622.29979-3-j-choudhary@ti.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Stable-dep-of: a9503a2ecd95 ("i2c: omap: Handle omap_i2c_init() errors in omap_i2c_probe()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Krishna Kurapati	6cfbff5f8d	usb: dwc3: qcom: Don't leave BCR asserted commit ef8abc0ba49ce717e6bc4124e88e59982671f3b5 upstream. Leaving the USB BCR asserted prevents the associated GDSC to turn on. This blocks any subsequent attempts of probing the device, e.g. after a probe deferral, with the following showing in the log: [ 1.332226] usb30_prim_gdsc status stuck at 'off' Leave the BCR deasserted when exiting the driver to avoid this issue. Cc: stable <stable@kernel.org> Fixes: `a4333c3a6b` ("usb: dwc3: Add Qualcomm DWC3 glue driver") Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Krishna Kurapati <krishna.kurapati@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250709132900.3408752-1-krishna.kurapati@oss.qualcomm.com [ adapted to individual clock management instead of bulk clock operations ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Mathias Nyman	824fa25c85	usb: hub: Don't try to recover devices lost during warm reset. commit 2521106fc732b0b75fd3555c689b1ed1d29d273c upstream. Hub driver warm-resets ports in SS.Inactive or Compliance mode to recover a possible connected device. The port reset code correctly detects if a connection is lost during reset, but hub driver port_event() fails to take this into account in some cases. port_event() ends up using stale values and assumes there is a connected device, and will try all means to recover it, including power-cycling the port. Details: This case was triggered when xHC host was suspended with DbC (Debug Capability) enabled and connected. DbC turns one xHC port into a simple usb debug device, allowing debugging a system with an A-to-A USB debug cable. xhci DbC code disables DbC when xHC is system suspended to D3, and enables it back during resume. We essentially end up with two hosts connected to each other during suspend, and, for a short while during resume, until DbC is enabled back. The suspended xHC host notices some activity on the roothub port, but can't train the link due to being suspended, so xHC hardware sets a CAS (Cold Attach Status) flag for this port to inform xhci host driver that the port needs to be warm reset once xHC resumes. CAS is xHCI specific, and not part of USB specification, so xhci driver tells usb core that the port has a connection and link is in compliance mode. Recovery from complinace mode is similar to CAS recovery. xhci CAS driver support that fakes a compliance mode connection was added in commit `8bea2bd37d` ("usb: Add support for root hub port status CAS") Once xHCI resumes and DbC is enabled back, all activity on the xHC roothub host side port disappears. The hub driver will anyway think port has a connection and link is in compliance mode, and hub driver will try to recover it. The port power-cycle during recovery seems to cause issues to the active DbC connection. Fix this by clearing connect_change flag if hub_port_reset() returns -ENOTCONN, thus avoiding the whole unnecessary port recovery and initialization attempt. Cc: stable@vger.kernel.org Fixes: `8bea2bd37d` ("usb: Add support for root hub port status CAS") Tested-by: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20250623133947.3144608-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Mathias Nyman	668c7b47a5	usb: hub: Fix flushing of delayed work used for post resume purposes commit 9bd9c8026341f75f25c53104eb7e656e357ca1a2 upstream. Delayed work that prevents USB3 hubs from runtime-suspending too early needed to be flushed in hub_quiesce() to resolve issues detected on QC SC8280XP CRD board during suspend resume testing. This flushing did however trigger new issues on Raspberry Pi 3B+, which doesn't have USB3 ports, and doesn't queue any post resume delayed work. The flushed 'hub->init_work' item is used for several purposes, and is originally initialized with a 'NULL' work function. The work function is also changed on the fly, which may contribute to the issue. Solve this by creating a dedicated delayed work item for post resume work, and flush that delayed work in hub_quiesce() Cc: stable <stable@kernel.org> Fixes: a49e1e2e785f ("usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/linux-usb/aF5rNp1l0LWITnEB@finisterre.sirena.org.uk Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> # SC8280XP CRD Tested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250627164348.3982628-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:21 +02:00
Mathias Nyman	71f5c98d29	usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm commit a49e1e2e785fb3621f2d748581881b23a364998a upstream. Delayed work to prevent USB3 hubs from runtime-suspending immediately after resume was added in commit 8f5b7e2bec1c ("usb: hub: fix detection of high tier USB3 devices behind suspended hubs"). This delayed work needs be flushed if system suspends, or hub needs to be quiesced for other reasons right after resume. Not flushing it triggered issues on QC SC8280XP CRD board during suspend/resume testing. Fix it by flushing the delayed resume work in hub_quiesce() The delayed work item that allow hub runtime suspend is also scheduled just before calling autopm get. Alan pointed out there is a small risk that work is run before autopm get, which would call autopm put before get, and mess up the runtime pm usage order. Swap the order of work sheduling and calling autopm get to solve this. Cc: stable <stable@kernel.org> Fixes: 8f5b7e2bec1c ("usb: hub: fix detection of high tier USB3 devices behind suspended hubs") Reported-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Closes: https://lore.kernel.org/linux-usb/acaaa928-832c-48ca-b0ea-d202d5cd3d6c@oss.qualcomm.com Reported-by: Alan Stern <stern@rowland.harvard.edu> Closes: https://lore.kernel.org/linux-usb/c73fbead-66d7-497a-8fa1-75ea4761090a@rowland.harvard.edu Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250626130102.3639861-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:20 +02:00
Mathias Nyman	15fea75a78	usb: hub: fix detection of high tier USB3 devices behind suspended hubs commit 8f5b7e2bec1c36578fdaa74a6951833541103e27 upstream. USB3 devices connected behind several external suspended hubs may not be detected when plugged in due to aggressive hub runtime pm suspend. The hub driver immediately runtime-suspends hubs if there are no active children or port activity. There is a delay between the wake signal causing hub resume, and driver visible port activity on the hub downstream facing ports. Most of the LFPS handshake, resume signaling and link training done on the downstream ports is not visible to the hub driver until completed, when device then will appear fully enabled and running on the port. This delay between wake signal and detectable port change is even more significant with chained suspended hubs where the wake signal will propagate upstream first. Suspended hubs will only start resuming downstream ports after upstream facing port resumes. The hub driver may resume a USB3 hub, read status of all ports, not yet see any activity, and runtime suspend back the hub before any port activity is visible. This exact case was seen when conncting USB3 devices to a suspended Thunderbolt dock. USB3 specification defines a 100ms tU3WakeupRetryDelay, indicating USB3 devices expect to be resumed within 100ms after signaling wake. if not then device will resend the wake signal. Give the USB3 hubs twice this time (200ms) to detect any port changes after resume, before allowing hub to runtime suspend again. Cc: stable <stable@kernel.org> Fixes: `2839f5bcfc` ("USB: Turn on auto-suspend for USB 3.0 hubs.") Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250611112441.2267883-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:20 +02:00
Mark Brown	d5024dc5e6	arm64: Filter out SME hwcaps when FEAT_SME isn't implemented commit a75ad2fc76a2ab70817c7eed3163b66ea84ca6ac upstream. We have a number of hwcaps for various SME subfeatures enumerated via ID_AA64SMFR0_EL1. Currently we advertise these without cross checking against the main SME feature, advertised in ID_AA64PFR1_EL1.SME which means that if the two are out of sync userspace can see a confusing situation where SME subfeatures are advertised without the base SME hwcap. This can be readily triggered by using the arm64.nosme override which only masks out ID_AA64PFR1_EL1.SME, and there have also been reports of VMMs which do the same thing. Fix this as we did previously for SVE in 064737920bdb ("arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented") by filtering out the SME subfeature hwcaps when FEAT_SME is not present. Fixes: `5e64b862c4` ("arm64/sme: Basic enumeration support") Reported-by: Yury Khrustalev <yury.khrustalev@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250620-arm64-sme-filter-hwcaps-v1-1-02b9d3c2d8ef@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:20 +02:00
Al Viro	dc6a664089	clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns commit c28f922c9dcee0e4876a2c095939d77fe7e15116 upstream. What we want is to verify there is that clone won't expose something hidden by a mount we wouldn't be able to undo. "Wouldn't be able to undo" may be a result of MNT_LOCKED on a child, but it may also come from lacking admin rights in the userns of the namespace mount belongs to. clone_private_mnt() checks the former, but not the latter. There's a number of rather confusing CAP_SYS_ADMIN checks in various userns during the mount, especially with the new mount API; they serve different purposes and in case of clone_private_mnt() they usually, but not always end up covering the missing check mentioned above. Reviewed-by: Christian Brauner <brauner@kernel.org> Reported-by: "Orlando, Noah" <Noah.Orlando@deshaw.com> Fixes: `427215d85e` ("ovl: prevent private clone if bind mount is not allowed") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ merge conflict resolution: clone_private_mount() was reworked in db04662e2f4f ("fs: allow detached mounts in clone_private_mount()"). Tweak the relevant ns_capable check so that it works on older kernels ] Signed-off-by: Noah Orlando <Noah.Orlando@deshaw.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:20 +02:00
Eric Dumazet	4cb17b11c8	ipv6: make addrconf_wq single threaded commit dfd2ee086a63c730022cb095576a8b3a5a752109 upstream. Both addrconf_verify_work() and addrconf_dad_work() acquire rtnl, there is no point trying to have one thread per cpu. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240201173031.3654257-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Brett A C Sheffield <bacs@librecast.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:20 +02:00
Aruna Ramakrishna	496efa228f	sched: Change nr_uninterruptible type to unsigned long commit 36569780b0d64de283f9d6c2195fd1a43e221ee8 upstream. The commit `e6fe3f422b` ("sched: Make multiple runqueue task counters 32-bit") changed nr_uninterruptible to an unsigned int. But the nr_uninterruptible values for each of the CPU runqueues can grow to large numbers, sometimes exceeding INT_MAX. This is valid, if, over time, a large number of tasks are migrated off of one CPU after going into an uninterruptible state. Only the sum of all nr_interruptible values across all CPUs yields the correct result, as explained in a comment in kernel/sched/loadavg.c. Change the type of nr_uninterruptible back to unsigned long to prevent overflows, and thus the miscalculation of load average. Fixes: `e6fe3f422b` ("sched: Make multiple runqueue task counters 32-bit") Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250709173328.606794-1-aruna.ramakrishna@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-24 08:53:20 +02:00
Chen Ridong	f371ad6471	Revert "cgroup_freezer: cgroup_freezing: Check if not frozen" [ Upstream commit 14a67b42cb6f3ab66f41603c062c5056d32ea7dd ] This reverts commit cff5f49d433fcd0063c8be7dd08fa5bf190c6c37. Commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") modified the cgroup_freezing() logic to verify that the FROZEN flag is not set, affecting the return value of the freezing() function, in order to address a warning in __thaw_task. A race condition exists that may allow tasks to escape being frozen. The following scenario demonstrates this issue: CPU 0 (get_signal path) CPU 1 (freezer.state reader) try_to_freeze read freezer.state __refrigerator freezer_read update_if_frozen WRITE_ONCE(current->__state, TASK_FROZEN); ... /* Task is now marked frozen / / frozen(task) == true / / Assuming other tasks are frozen / freezer->state \|= CGROUP_FROZEN; / freezing(current) returns false / / because cgroup is frozen (not freezing) / break out __set_current_state(TASK_RUNNING); / Bug: Task resumes running when it should remain frozen */ The existing !frozen(p) check in __thaw_task makes the WARN_ON_ONCE(freezing(p)) warning redundant. Removing this warning enables reverting the commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") to resolve the issue. The warning has been removed in the previous patch. This patch revert the commit cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") to complete the fix. Fixes: cff5f49d433f ("cgroup_freezer: cgroup_freezing: Check if not frozen") Reported-by: Zhong Jiawei<zhongjiawei1@huawei.com> Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-24 08:53:20 +02:00
David Howells	74bb4de32d	rxrpc: Fix transmission of an abort in response to an abort [ Upstream commit e9c0b96ec0a34fcacdf9365713578d83cecac34c ] Under some circumstances, such as when a server socket is closing, ABORT packets will be generated in response to incoming packets. Unfortunately, this also may include generating aborts in response to incoming aborts - which may cause a cycle. It appears this may be made possible by giving the client a multicast address. Fix this such that rxrpc_reject_packet() will refuse to generate aborts in response to aborts. Fixes: `248f219cb8` ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-24 08:53:20 +02:00
David Howells	7692bde890	rxrpc: Fix recv-recv race of completed call [ Upstream commit 962fb1f651c2cf2083e0c3ef53ba69e3b96d3fbc ] If a call receives an event (such as incoming data), the call gets placed on the socket's queue and a thread in recvmsg can be awakened to go and process it. Once the thread has picked up the call off of the queue, further events will cause it to be requeued, and once the socket lock is dropped (recvmsg uses call->user_mutex to allow the socket to be used in parallel), a second thread can come in and its recvmsg can pop the call off the socket queue again. In such a case, the first thread will be receiving stuff from the call and the second thread will be blocked on call->user_mutex. The first thread can, at this point, process both the event that it picked call for and the event that the second thread picked the call for and may see the call terminate - in which case the call will be "released", decoupling the call from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control message). The first thread will return okay, but then the second thread will wake up holding the user_mutex and, if it sees that the call has been released by the first thread, it will BUG thusly: kernel BUG at net/rxrpc/recvmsg.c:474! Fix this by just dequeuing the call and ignoring it if it is seen to be already released. We can't tell userspace about it anyway as the user call ID has become stale. Fixes: `248f219cb8` ("rxrpc: Rewrite the data and ack handling code") Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-24 08:53:20 +02:00
William Liu	7ff2d83ecf	net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree [ Upstream commit 0e1d5d9b5c5966e2e42e298670808590db5ed628 ] htb_lookup_leaf has a BUG_ON that can trigger with the following: tc qdisc del dev lo root tc qdisc add dev lo root handle 1: htb default 1 tc class add dev lo parent 1: classid 1:1 htb rate 64bit tc qdisc add dev lo parent 1:1 handle 2: netem tc qdisc add dev lo parent 2:1 handle 3: blackhole ping -I lo -c1 -W0.001 127.0.0.1 The root cause is the following: 1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on the selected leaf qdisc 2. netem_dequeue calls enqueue on the child qdisc 3. blackhole_enqueue drops the packet and returns a value that is not just NET_XMIT_SUCCESS 4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate -> htb_deactiviate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase 5. As this is the only class in the selected hprio rbtree, __rb_change_child in __rb_erase_augmented sets the rb_root pointer to NULL 6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL, which causes htb_dequeue_tree to call htb_lookup_leaf with the same hprio rbtree, and fail the BUG_ON The function graph for this scenario is shown here: 0) \| htb_enqueue() { 0) + 13.635 us \| netem_enqueue(); 0) 4.719 us \| htb_activate_prios(); 0) # 2249.199 us \| } 0) \| htb_dequeue() { 0) 2.355 us \| htb_lookup_leaf(); 0) \| netem_dequeue() { 0) + 11.061 us \| blackhole_enqueue(); 0) \| qdisc_tree_reduce_backlog() { 0) \| qdisc_lookup_rcu() { 0) 1.873 us \| qdisc_match_from_root(); 0) 6.292 us \| } 0) 1.894 us \| htb_search(); 0) \| htb_qlen_notify() { 0) 2.655 us \| htb_deactivate_prios(); 0) 6.933 us \| } 0) + 25.227 us \| } 0) 1.983 us \| blackhole_dequeue(); 0) + 86.553 us \| } 0) # 2932.761 us \| qdisc_warn_nonwc(); 0) \| htb_lookup_leaf() { 0) \| BUG_ON(); ------------------------------------------ The full original bug report can be seen here [1]. We can fix this just by returning NULL instead of the BUG_ON, as htb_dequeue_tree returns NULL when htb_lookup_leaf returns NULL. [1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4wmGvNk2ceHjKb8ORnzw0cGwfmVu42g9dL7XyJLf1NEzaztboTWcm0Ogxuojoeo=@willsroot.io/ Fixes: `512bb43eb5` ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.") Signed-off-by: William Liu <will@willsroot.io> Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io> Link: https://patch.msgid.link/20250717022816.221364-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-24 08:53:19 +02:00
Joseph Huang	7b0d423183	net: bridge: Do not offload IGMP/MLD messages [ Upstream commit 683dc24da8bf199bb7446e445ad7f801c79a550e ] Do not offload IGMP/MLD messages as it could lead to IGMP/MLD Reports being unintentionally flooded to Hosts. Instead, let the bridge decide where to send these IGMP/MLD messages. Consider the case where the local host is sending out reports in response to a remote querier like the following: mcast-listener-process (IP_ADD_MEMBERSHIP) \ br0 / \ swp1 swp2 \| \| QUERIER SOME-OTHER-HOST In the above setup, br0 will want to br_forward() reports for mcast-listener-process's group(s) via swp1 to QUERIER; but since the source hwdom is 0, the report is eligible for tx offloading, and is flooded by hardware to both swp1 and swp2, reaching SOME-OTHER-HOST as well. (Example and illustration provided by Tobias.) Fixes: `472111920f` ("net: bridge: switchdev: allow the TX data plane forwarding to be offloaded") Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716153551.1830255-1-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-24 08:53:19 +02:00
Dong Chenchen	bb515c4130	net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime [ Upstream commit 579d4f9ca9a9a605184a9b162355f6ba131f678d ] Assuming the "rx-vlan-filter" feature is enabled on a net device, the 8021q module will automatically add or remove VLAN 0 when the net device is put administratively up or down, respectively. There are a couple of problems with the above scheme. The first problem is a memory leak that can happen if the "rx-vlan-filter" feature is disabled while the device is running: # ip link add bond1 up type bond mode 0 # ethtool -K bond1 rx-vlan-filter off # ip link del dev bond1 When the device is put administratively down the "rx-vlan-filter" feature is disabled, so the 8021q module will not remove VLAN 0 and the memory will be leaked [1]. Another problem that can happen is that the kernel can automatically delete VLAN 0 when the device is put administratively down despite not adding it when the device was put administratively up since during that time the "rx-vlan-filter" feature was disabled. null-ptr-unref or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance if toggling filtering during runtime: $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ip link del vlan0 Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //vlan_info is null Fix both problems by noting in the VLAN info whether VLAN 0 was automatically added upon NETDEV_UP and based on that decide whether it should be deleted upon NETDEV_DOWN, regardless of the state of the "rx-vlan-filter" feature. [1] unreferenced object 0xffff8880068e3100 (size 256): comm "ip", pid 384, jiffies 4296130254 hex dump (first 32 bytes): 00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 81ce31fa): __kmalloc_cache_noprof+0x2b5/0x340 vlan_vid_add+0x434/0x940 vlan_device_event.cold+0x75/0xa8 notifier_call_chain+0xca/0x150 __dev_notify_flags+0xe3/0x250 rtnl_configure_link+0x193/0x260 rtnl_newlink_create+0x383/0x8e0 __rtnl_newlink+0x22c/0xa40 rtnl_newlink+0x627/0xb00 rtnetlink_rcv_msg+0x6fb/0xb70 netlink_rcv_skb+0x11f/0x350 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Fixes: `ad1afb0039` ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716034504.2285203-2-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-24 08:53:19 +02:00

1 2 3 4 5 ...

1235937 Commits