From 43e3118c677afb65027e34fb43f85b5284cd77a7 Mon Sep 17 00:00:00 2001 From: Lizhi Hou Date: Mon, 18 Aug 2025 08:22:21 -0700 Subject: [PATCH 01/76] of: dynamic: Fix memleak when of_pci_add_properties() failed [ Upstream commit c81f6ce16785cc07ae81f53deb07b662ed0bb3a5 ] When of_pci_add_properties() failed, of_changeset_destroy() is called to free the changeset. And of_changeset_destroy() puts device tree node in each entry but does not free property in the entry. This leads to memory leak in the failure case. In of_changeset_add_prop_helper(), add the property to the device tree node deadprops list. Thus, the property will also be freed along with device tree node. Fixes: b544fc2b8606 ("of: dynamic: Add interfaces for creating device node dynamically") Reported-by: Lorenzo Pieralisi Closes: https://lore.kernel.org/all/aJms+YT8TnpzpCY8@lpieralisi/ Tested-by: Lorenzo Pieralisi Signed-off-by: Lizhi Hou Link: https://lore.kernel.org/r/20250818152221.3685724-1-lizhi.hou@amd.com Signed-off-by: Rob Herring (Arm) Signed-off-by: Sasha Levin --- drivers/of/dynamic.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index 4d57a4e34105..7f78fba502ec 100644 --- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -939,6 +939,9 @@ static int of_changeset_add_prop_helper(struct of_changeset *ocs, kfree(new_pp); } + new_pp->next = np->deadprops; + np->deadprops = new_pp; + return ret; } From 6e59b8483e6ef0c560caafc4ba3836b8c71d4ed2 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Thu, 14 Aug 2025 19:27:21 -0700 Subject: [PATCH 02/76] pinctrl: STMFX: add missing HAS_IOMEM dependency [ Upstream commit a12946bef0407cf2db0899c83d42c47c00af3fbc ] When building on ARCH=um (which does not set HAS_IOMEM), kconfig reports an unmet dependency caused by PINCTRL_STMFX. It selects MFD_STMFX, which depends on HAS_IOMEM. To stop this warning, PINCTRL_STMFX should also depend on HAS_IOMEM. kconfig warning: WARNING: unmet direct dependencies detected for MFD_STMFX Depends on [n]: HAS_IOMEM [=n] && I2C [=y] && OF [=y] Selected by [y]: - PINCTRL_STMFX [=y] && PINCTRL [=y] && I2C [=y] && OF_GPIO [=y] Fixes: 1490d9f841b1 ("pinctrl: Add STMFX GPIO expander Pinctrl/GPIO driver") Signed-off-by: Randy Dunlap Link: https://lore.kernel.org/20250815022721.1650885-1-rdunlap@infradead.org Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/pinctrl/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig index 7dfb7190580e..ab3908a923e3 100644 --- a/drivers/pinctrl/Kconfig +++ b/drivers/pinctrl/Kconfig @@ -438,6 +438,7 @@ config PINCTRL_STMFX tristate "STMicroelectronics STMFX GPIO expander pinctrl driver" depends on I2C depends on OF_GPIO + depends on HAS_IOMEM select GENERIC_PINCONF select GPIOLIB_IRQCHIP select MFD_STMFX From e877b861dab9addd5e38b352c603c04e946639bb Mon Sep 17 00:00:00 2001 From: Aleksander Jan Bajkowski Date: Sun, 17 Aug 2025 14:49:06 +0200 Subject: [PATCH 03/76] mips: dts: lantiq: danube: add missing burst length property [ Upstream commit 7b28232921782aa38048249132899c337405eaa8 ] The upstream dts lacks the lantiq,{rx/tx}-burst-length property. Other issues were also fixed: arch/mips/boot/dts/lantiq/danube_easy50712.dtb: etop@e180000 (lantiq,etop-xway): 'interrupt-names' is a required property from schema $id: http://devicetree.org/schemas/net/lantiq,etop-xway.yaml# arch/mips/boot/dts/lantiq/danube_easy50712.dtb: etop@e180000 (lantiq,etop-xway): 'lantiq,tx-burst-length' is a required property from schema $id: http://devicetree.org/schemas/net/lantiq,etop-xway.yaml# arch/mips/boot/dts/lantiq/danube_easy50712.dtb: etop@e180000 (lantiq,etop-xway): 'lantiq,rx-burst-length' is a required property from schema $id: http://devicetree.org/schemas/net/lantiq,etop-xway.yaml# Fixes: 14d4e308e0aa ("net: lantiq: configure the burst length in ethernet drivers") Signed-off-by: Aleksander Jan Bajkowski Acked-by: Jakub Kicinski Signed-off-by: Sasha Levin --- arch/mips/boot/dts/lantiq/danube_easy50712.dts | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/mips/boot/dts/lantiq/danube_easy50712.dts b/arch/mips/boot/dts/lantiq/danube_easy50712.dts index 1ce20b7d05cb..d8b3cd69eda3 100644 --- a/arch/mips/boot/dts/lantiq/danube_easy50712.dts +++ b/arch/mips/boot/dts/lantiq/danube_easy50712.dts @@ -87,8 +87,11 @@ reg = <0xe180000 0x40000>; interrupt-parent = <&icu0>; interrupts = <73 78>; + interrupt-names = "tx", "rx"; phy-mode = "rmii"; mac-address = [ 00 11 22 33 44 55 ]; + lantiq,rx-burst-length = <4>; + lantiq,tx-burst-length = <4>; }; stp0: stp@e100bb0 { From d3be2b8cff6f7bd054174725010fef935d63c855 Mon Sep 17 00:00:00 2001 From: Aleksander Jan Bajkowski Date: Sun, 17 Aug 2025 14:49:07 +0200 Subject: [PATCH 04/76] mips: lantiq: xway: sysctrl: rename the etop node MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 8c431ea8f3f795c4b9cfa57a85bc4166b9cce0ac ] Bindig requires a node name matching ‘^ethernet@[0-9a-f]+$’. This patch changes the clock name from “etop” to “ethernet”. This fixes the following warning: arch/mips/boot/dts/lantiq/danube_easy50712.dtb: etop@e180000 (lantiq,etop-xway): $nodename:0: 'etop@e180000' does not match '^ethernet@[0-9a-f]+$' from schema $id: http://devicetree.org/schemas/net/lantiq,etop-xway.yaml# Fixes: dac0bad93741 ("dt-bindings: net: lantiq,etop-xway: Document Lantiq Xway ETOP bindings") Signed-off-by: Aleksander Jan Bajkowski Acked-by: Jakub Kicinski Signed-off-by: Sasha Levin --- arch/mips/boot/dts/lantiq/danube_easy50712.dts | 2 +- arch/mips/lantiq/xway/sysctrl.c | 10 +++++----- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/mips/boot/dts/lantiq/danube_easy50712.dts b/arch/mips/boot/dts/lantiq/danube_easy50712.dts index d8b3cd69eda3..c4d7aa5753b0 100644 --- a/arch/mips/boot/dts/lantiq/danube_easy50712.dts +++ b/arch/mips/boot/dts/lantiq/danube_easy50712.dts @@ -82,7 +82,7 @@ }; }; - etop@e180000 { + ethernet@e180000 { compatible = "lantiq,etop-xway"; reg = <0xe180000 0x40000>; interrupt-parent = <&icu0>; diff --git a/arch/mips/lantiq/xway/sysctrl.c b/arch/mips/lantiq/xway/sysctrl.c index 3ed078225222..4c72b59fdf98 100644 --- a/arch/mips/lantiq/xway/sysctrl.c +++ b/arch/mips/lantiq/xway/sysctrl.c @@ -478,7 +478,7 @@ void __init ltq_soc_init(void) ifccr = CGU_IFCCR_VR9; pcicr = CGU_PCICR_VR9; } else { - clkdev_add_pmu("1e180000.etop", NULL, 1, 0, PMU_PPE); + clkdev_add_pmu("1e180000.ethernet", NULL, 1, 0, PMU_PPE); } if (!of_machine_is_compatible("lantiq,ase")) @@ -512,9 +512,9 @@ void __init ltq_soc_init(void) CLOCK_133M, CLOCK_133M); clkdev_add_pmu("1e101000.usb", "otg", 1, 0, PMU_USB0); clkdev_add_pmu("1f203018.usb2-phy", "phy", 1, 0, PMU_USB0_P); - clkdev_add_pmu("1e180000.etop", "ppe", 1, 0, PMU_PPE); - clkdev_add_cgu("1e180000.etop", "ephycgu", CGU_EPHY); - clkdev_add_pmu("1e180000.etop", "ephy", 1, 0, PMU_EPHY); + clkdev_add_pmu("1e180000.ethernet", "ppe", 1, 0, PMU_PPE); + clkdev_add_cgu("1e180000.ethernet", "ephycgu", CGU_EPHY); + clkdev_add_pmu("1e180000.ethernet", "ephy", 1, 0, PMU_EPHY); clkdev_add_pmu("1e103000.sdio", NULL, 1, 0, PMU_ASE_SDIO); clkdev_add_pmu("1e116000.mei", "dfe", 1, 0, PMU_DFE); } else if (of_machine_is_compatible("lantiq,grx390")) { @@ -573,7 +573,7 @@ void __init ltq_soc_init(void) clkdev_add_pmu("1e101000.usb", "otg", 1, 0, PMU_USB0 | PMU_AHBM); clkdev_add_pmu("1f203034.usb2-phy", "phy", 1, 0, PMU_USB1_P); clkdev_add_pmu("1e106000.usb", "otg", 1, 0, PMU_USB1 | PMU_AHBM); - clkdev_add_pmu("1e180000.etop", "switch", 1, 0, PMU_SWITCH); + clkdev_add_pmu("1e180000.ethernet", "switch", 1, 0, PMU_SWITCH); clkdev_add_pmu("1e103000.sdio", NULL, 1, 0, PMU_SDIO); clkdev_add_pmu("1e103100.deu", NULL, 1, 0, PMU_DEU); clkdev_add_pmu("1e116000.mei", "dfe", 1, 0, PMU_DFE); From 749137b41e7064ee8b8b1f04a5a27478226d09fc Mon Sep 17 00:00:00 2001 From: Rob Herring Date: Tue, 9 Apr 2024 13:59:39 -0500 Subject: [PATCH 05/76] of: Add a helper to free property struct [ Upstream commit 1c5e3d9bf33b811e1c6dd9081b322004acc4a1fd ] Freeing a property struct is 3 kfree()'s which is duplicated in multiple spots. Add a helper, __of_prop_free(), and replace all the open coded cases in the DT code. Reviewed-by: Saravana Kannan Reviewed-by: Jonathan Cameron Link: https://lore.kernel.org/r/20240409-dt-cleanup-free-v2-1-5b419a4af38d@kernel.org Signed-off-by: Rob Herring Stable-dep-of: 80af3745ca46 ("of: dynamic: Fix use after free in of_changeset_add_prop_helper()") Signed-off-by: Sasha Levin --- drivers/of/dynamic.c | 26 ++++++++++++-------------- drivers/of/of_private.h | 1 + drivers/of/overlay.c | 11 +++-------- drivers/of/unittest.c | 12 +++--------- 4 files changed, 19 insertions(+), 31 deletions(-) diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index 7f78fba502ec..72531f44adf0 100644 --- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -306,15 +306,20 @@ int of_detach_node(struct device_node *np) } EXPORT_SYMBOL_GPL(of_detach_node); +void __of_prop_free(struct property *prop) +{ + kfree(prop->name); + kfree(prop->value); + kfree(prop); +} + static void property_list_free(struct property *prop_list) { struct property *prop, *next; for (prop = prop_list; prop != NULL; prop = next) { next = prop->next; - kfree(prop->name); - kfree(prop->value); - kfree(prop); + __of_prop_free(prop); } } @@ -427,9 +432,7 @@ struct property *__of_prop_dup(const struct property *prop, gfp_t allocflags) return new; err_free: - kfree(new->name); - kfree(new->value); - kfree(new); + __of_prop_free(new); return NULL; } @@ -471,9 +474,7 @@ struct device_node *__of_node_dup(const struct device_node *np, if (!new_pp) goto err_prop; if (__of_add_property(node, new_pp)) { - kfree(new_pp->name); - kfree(new_pp->value); - kfree(new_pp); + __of_prop_free(new_pp); goto err_prop; } } @@ -933,11 +934,8 @@ static int of_changeset_add_prop_helper(struct of_changeset *ocs, return -ENOMEM; ret = of_changeset_add_property(ocs, np, new_pp); - if (ret) { - kfree(new_pp->name); - kfree(new_pp->value); - kfree(new_pp); - } + if (ret) + __of_prop_free(new_pp); new_pp->next = np->deadprops; np->deadprops = new_pp; diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h index 21f8f5e80917..73b55f4f84a3 100644 --- a/drivers/of/of_private.h +++ b/drivers/of/of_private.h @@ -123,6 +123,7 @@ extern void *__unflatten_device_tree(const void *blob, * own the devtree lock or work on detached trees only. */ struct property *__of_prop_dup(const struct property *prop, gfp_t allocflags); +void __of_prop_free(struct property *prop); struct device_node *__of_node_dup(const struct device_node *np, const char *full_name); diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c index a9a292d6d59b..dc1329958641 100644 --- a/drivers/of/overlay.c +++ b/drivers/of/overlay.c @@ -262,9 +262,7 @@ static struct property *dup_and_fixup_symbol_prop( return new_prop; err_free_new_prop: - kfree(new_prop->name); - kfree(new_prop->value); - kfree(new_prop); + __of_prop_free(new_prop); err_free_target_path: kfree(target_path); @@ -361,11 +359,8 @@ static int add_changeset_property(struct overlay_changeset *ovcs, pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n", target->np, new_prop->name); - if (ret) { - kfree(new_prop->name); - kfree(new_prop->value); - kfree(new_prop); - } + if (ret) + __of_prop_free(new_prop); return ret; } diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c index 3b22c36bfb0b..5bfec440b4fd 100644 --- a/drivers/of/unittest.c +++ b/drivers/of/unittest.c @@ -800,15 +800,11 @@ static void __init of_unittest_property_copy(void) new = __of_prop_dup(&p1, GFP_KERNEL); unittest(new && propcmp(&p1, new), "empty property didn't copy correctly\n"); - kfree(new->value); - kfree(new->name); - kfree(new); + __of_prop_free(new); new = __of_prop_dup(&p2, GFP_KERNEL); unittest(new && propcmp(&p2, new), "non-empty property didn't copy correctly\n"); - kfree(new->value); - kfree(new->name); - kfree(new); + __of_prop_free(new); #endif } @@ -3665,9 +3661,7 @@ static __init void of_unittest_overlay_high_level(void) goto err_unlock; } if (__of_add_property(of_symbols, new_prop)) { - kfree(new_prop->name); - kfree(new_prop->value); - kfree(new_prop); + __of_prop_free(new_prop); /* "name" auto-generated by unflatten */ if (!strcmp(prop->name, "name")) continue; From 9e0743eb6dcfd2e7e998d1b3182344a3221c32ee Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 22 Aug 2025 11:08:46 +0300 Subject: [PATCH 06/76] of: dynamic: Fix use after free in of_changeset_add_prop_helper() [ Upstream commit 80af3745ca465c6c47e833c1902004a7fa944f37 ] If the of_changeset_add_property() function call fails, then this code frees "new_pp" and then dereference it on the next line. Return the error code directly instead. Fixes: c81f6ce16785 ("of: dynamic: Fix memleak when of_pci_add_properties() failed") Signed-off-by: Dan Carpenter Link: https://lore.kernel.org/r/aKgljjhnpa4lVpdx@stanley.mountain Signed-off-by: Rob Herring (Arm) Signed-off-by: Sasha Levin --- drivers/of/dynamic.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index 72531f44adf0..18393800546c 100644 --- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -934,13 +934,15 @@ static int of_changeset_add_prop_helper(struct of_changeset *ocs, return -ENOMEM; ret = of_changeset_add_property(ocs, np, new_pp); - if (ret) + if (ret) { __of_prop_free(new_pp); + return ret; + } new_pp->next = np->deadprops; np->deadprops = new_pp; - return ret; + return 0; } /** From 28c8fb7ae2ad27d81c8de3c4fe608c509f6a18aa Mon Sep 17 00:00:00 2001 From: Tengda Wu Date: Fri, 22 Aug 2025 03:33:43 +0000 Subject: [PATCH 07/76] ftrace: Fix potential warning in trace_printk_seq during ftrace_dump [ Upstream commit 4013aef2ced9b756a410f50d12df9ebe6a883e4a ] When calling ftrace_dump_one() concurrently with reading trace_pipe, a WARN_ON_ONCE() in trace_printk_seq() can be triggered due to a race condition. The issue occurs because: CPU0 (ftrace_dump) CPU1 (reader) echo z > /proc/sysrq-trigger !trace_empty(&iter) trace_iterator_reset(&iter) <- len = size = 0 cat /sys/kernel/tracing/trace_pipe trace_find_next_entry_inc(&iter) __find_next_entry ring_buffer_empty_cpu <- all empty return NULL trace_printk_seq(&iter.seq) WARN_ON_ONCE(s->seq.len >= s->seq.size) In the context between trace_empty() and trace_find_next_entry_inc() during ftrace_dump, the ring buffer data was consumed by other readers. This caused trace_find_next_entry_inc to return NULL, failing to populate `iter.seq`. At this point, due to the prior trace_iterator_reset, both `iter.seq.len` and `iter.seq.size` were set to 0. Since they are equal, the WARN_ON_ONCE condition is triggered. Move the trace_printk_seq() into the if block that checks to make sure the return value of trace_find_next_entry_inc() is non-NULL in ftrace_dump_one(), ensuring the 'iter.seq' is properly populated before subsequent operations. Cc: Masami Hiramatsu Cc: Mark Rutland Cc: Mathieu Desnoyers Cc: Ingo Molnar Link: https://lore.kernel.org/20250822033343.3000289-1-wutengda@huaweicloud.com Fixes: d769041f8653 ("ring_buffer: implement new locking") Signed-off-by: Tengda Wu Signed-off-by: Steven Rostedt (Google) Signed-off-by: Sasha Levin --- kernel/trace/trace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 907e45361939..a32c8637503d 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -10162,10 +10162,10 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) ret = print_trace_line(&iter); if (ret != TRACE_TYPE_NO_CONSUME) trace_consume(&iter); + + trace_printk_seq(&iter.seq); } touch_nmi_watchdog(); - - trace_printk_seq(&iter.seq); } if (!cnt) From 43662b846c7a22ffc368502ab39e8caaaf4d111e Mon Sep 17 00:00:00 2001 From: Damien Le Moal Date: Mon, 28 Jul 2025 13:17:00 +0900 Subject: [PATCH 08/76] scsi: core: sysfs: Correct sysfs attributes access rights [ Upstream commit a2f54ff15c3bdc0132e20aae041607e2320dbd73 ] The SCSI sysfs attributes "supported_mode" and "active_mode" do not define a store method and thus cannot be modified. Correct the DEVICE_ATTR() call for these two attributes to not include S_IWUSR to allow write access as they are read-only. Signed-off-by: Damien Le Moal Link: https://lore.kernel.org/r/20250728041700.76660-1-dlemoal@kernel.org Reviewed-by: John Garry Reviewed-by: Johannes Thumshin Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/scsi_sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 24f6eefb6803..df37ac81620e 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -265,7 +265,7 @@ show_shost_supported_mode(struct device *dev, struct device_attribute *attr, return show_shost_mode(supported_mode, buf); } -static DEVICE_ATTR(supported_mode, S_IRUGO | S_IWUSR, show_shost_supported_mode, NULL); +static DEVICE_ATTR(supported_mode, S_IRUGO, show_shost_supported_mode, NULL); static ssize_t show_shost_active_mode(struct device *dev, @@ -279,7 +279,7 @@ show_shost_active_mode(struct device *dev, return show_shost_mode(shost->active_mode, buf); } -static DEVICE_ATTR(active_mode, S_IRUGO | S_IWUSR, show_shost_active_mode, NULL); +static DEVICE_ATTR(active_mode, S_IRUGO, show_shost_active_mode, NULL); static int check_reset_type(const char *str) { From bc1427a48371808378ef0c8cc1d21a27352fbeb1 Mon Sep 17 00:00:00 2001 From: Paulo Alcantara Date: Fri, 8 Aug 2025 12:20:17 -0300 Subject: [PATCH 09/76] smb: client: fix race with concurrent opens in unlink(2) [ Upstream commit 0af1561b2d60bab2a2b00720a5c7b292ecc549ec ] According to some logs reported by customers, CIFS client might end up reporting unlinked files as existing in stat(2) due to concurrent opens racing with unlink(2). Besides sending the removal request to the server, the unlink process could involve closing any deferred close as well as marking all existing open handles as deleted to prevent them from deferring closes, which increases the race window for potential concurrent opens. Fix this by unhashing the dentry in cifs_unlink() to prevent any subsequent opens. Any open attempts, while we're still unlinking, will block on parent's i_rwsem. Reported-by: Jay Shin Signed-off-by: Paulo Alcantara (Red Hat) Reviewed-by: David Howells Cc: Al Viro Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Sasha Levin --- fs/smb/client/inode.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c index d93ebd58ecae..df01029918fd 100644 --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -1856,15 +1856,24 @@ int cifs_unlink(struct inode *dir, struct dentry *dentry) struct cifs_sb_info *cifs_sb = CIFS_SB(sb); struct tcon_link *tlink; struct cifs_tcon *tcon; + __u32 dosattr = 0, origattr = 0; struct TCP_Server_Info *server; struct iattr *attrs = NULL; - __u32 dosattr = 0, origattr = 0; + bool rehash = false; cifs_dbg(FYI, "cifs_unlink, dir=0x%p, dentry=0x%p\n", dir, dentry); if (unlikely(cifs_forced_shutdown(cifs_sb))) return -EIO; + /* Unhash dentry in advance to prevent any concurrent opens */ + spin_lock(&dentry->d_lock); + if (!d_unhashed(dentry)) { + __d_drop(dentry); + rehash = true; + } + spin_unlock(&dentry->d_lock); + tlink = cifs_sb_tlink(cifs_sb); if (IS_ERR(tlink)) return PTR_ERR(tlink); @@ -1915,7 +1924,8 @@ psx_del_no_retry: cifs_drop_nlink(inode); } } else if (rc == -ENOENT) { - d_drop(dentry); + if (simple_positive(dentry)) + d_delete(dentry); } else if (rc == -EBUSY) { if (server->ops->rename_pending_delete) { rc = server->ops->rename_pending_delete(full_path, @@ -1968,6 +1978,8 @@ unlink_out: kfree(attrs); free_xid(xid); cifs_put_tlink(tlink); + if (rehash) + d_rehash(dentry); return rc; } From 24b9ed739c8c5b464d983e12cf308982f3ae93c2 Mon Sep 17 00:00:00 2001 From: Paulo Alcantara Date: Fri, 8 Aug 2025 11:43:29 -0300 Subject: [PATCH 10/76] smb: client: fix race with concurrent opens in rename(2) [ Upstream commit d84291fc7453df7881a970716f8256273aca5747 ] Besides sending the rename request to the server, the rename process also involves closing any deferred close, waiting for outstanding I/O to complete as well as marking all existing open handles as deleted to prevent them from deferring closes, which increases the race window for potential concurrent opens on the target file. Fix this by unhashing the dentry in advance to prevent any concurrent opens on the target. Signed-off-by: Paulo Alcantara (Red Hat) Reviewed-by: David Howells Cc: Al Viro Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Sasha Levin --- fs/smb/client/inode.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c index df01029918fd..6c16c4f34d88 100644 --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -2379,6 +2379,7 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir, struct cifs_sb_info *cifs_sb; struct tcon_link *tlink; struct cifs_tcon *tcon; + bool rehash = false; unsigned int xid; int rc, tmprc; int retry_count = 0; @@ -2394,6 +2395,17 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir, if (unlikely(cifs_forced_shutdown(cifs_sb))) return -EIO; + /* + * Prevent any concurrent opens on the target by unhashing the dentry. + * VFS already unhashes the target when renaming directories. + */ + if (d_is_positive(target_dentry) && !d_is_dir(target_dentry)) { + if (!d_unhashed(target_dentry)) { + d_drop(target_dentry); + rehash = true; + } + } + tlink = cifs_sb_tlink(cifs_sb); if (IS_ERR(tlink)) return PTR_ERR(tlink); @@ -2433,6 +2445,8 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir, } } + if (!rc) + rehash = false; /* * No-replace is the natural behavior for CIFS, so skip unlink hacks. */ @@ -2491,12 +2505,16 @@ unlink_target: goto cifs_rename_exit; rc = cifs_do_rename(xid, source_dentry, from_name, target_dentry, to_name); + if (!rc) + rehash = false; } /* force revalidate to go get info when needed */ CIFS_I(source_dir)->time = CIFS_I(target_dir)->time = 0; cifs_rename_exit: + if (rehash) + d_rehash(target_dentry); kfree(info_buf_source); free_dentry_path(page2); free_dentry_path(page1); From 8f8e6a7817837d041bc9eecd189aa4be93146eb4 Mon Sep 17 00:00:00 2001 From: Alexey Klimov Date: Wed, 6 Aug 2025 15:00:30 +0100 Subject: [PATCH 11/76] ASoC: codecs: tx-macro: correct tx_macro_component_drv name [ Upstream commit 43e0da37d5cfb23eec6aeee9422f84d86621ce2b ] We already have a component driver named "RX-MACRO", which is lpass-rx-macro.c. The tx macro component driver's name should be "TX-MACRO" accordingly. Fix it. Cc: Srinivas Kandagatla Signed-off-by: Alexey Klimov Reviewed-by: Neil Armstrong Link: https://patch.msgid.link/20250806140030.691477-1-alexey.klimov@linaro.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/codecs/lpass-tx-macro.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/lpass-tx-macro.c b/sound/soc/codecs/lpass-tx-macro.c index ebddfa74ce0a..150ed10c8377 100644 --- a/sound/soc/codecs/lpass-tx-macro.c +++ b/sound/soc/codecs/lpass-tx-macro.c @@ -1940,7 +1940,7 @@ static int tx_macro_register_mclk_output(struct tx_macro *tx) } static const struct snd_soc_component_driver tx_macro_component_drv = { - .name = "RX-MACRO", + .name = "TX-MACRO", .probe = tx_macro_component_probe, .controls = tx_macro_snd_controls, .num_controls = ARRAY_SIZE(tx_macro_snd_controls), From 612527136e0cb2944f8be262c8f3bbb84fee3f93 Mon Sep 17 00:00:00 2001 From: Junli Liu Date: Tue, 5 Aug 2025 09:19:58 +0800 Subject: [PATCH 12/76] erofs: fix atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC [ Upstream commit c99fab6e80b76422741d34aafc2f930a482afbdd ] Since EROFS handles decompression in non-atomic contexts due to uncontrollable decompression latencies and vmap() usage, it tries to detect atomic contexts and only kicks off a kworker on demand in order to reduce unnecessary scheduling overhead. However, the current approach is insufficient and can lead to sleeping function calls in invalid contexts, causing kernel warnings and potential system instability. See the stacktrace [1] and previous discussion [2]. The current implementation only checks rcu_read_lock_any_held(), which behaves inconsistently across different kernel configurations: - When CONFIG_DEBUG_LOCK_ALLOC is enabled: correctly detects RCU critical sections by checking rcu_lock_map - When CONFIG_DEBUG_LOCK_ALLOC is disabled: compiles to "!preemptible()", which only checks preempt_count and misses RCU critical sections This patch introduces z_erofs_in_atomic() to provide comprehensive atomic context detection: 1. Check RCU preemption depth when CONFIG_PREEMPTION is enabled, as RCU critical sections may not affect preempt_count but still require atomic handling 2. Always use async processing when CONFIG_PREEMPT_COUNT is disabled, as preemption state cannot be reliably determined 3. Fall back to standard preemptible() check for remaining cases The function replaces the previous complex condition check and ensures that z_erofs always uses (kthread_)work in atomic contexts to minimize scheduling overhead and prevent sleeping in invalid contexts. [1] Problem stacktrace [ 61.266692] BUG: sleeping function called from invalid context at kernel/locking/rtmutex_api.c:510 [ 61.266702] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 107, name: irq/54-ufshcd [ 61.266704] preempt_count: 0, expected: 0 [ 61.266705] RCU nest depth: 2, expected: 0 [ 61.266710] CPU: 0 UID: 0 PID: 107 Comm: irq/54-ufshcd Tainted: G W O 6.12.17 #1 [ 61.266714] Tainted: [W]=WARN, [O]=OOT_MODULE [ 61.266715] Hardware name: schumacher (DT) [ 61.266717] Call trace: [ 61.266718] dump_backtrace+0x9c/0x100 [ 61.266727] show_stack+0x20/0x38 [ 61.266728] dump_stack_lvl+0x78/0x90 [ 61.266734] dump_stack+0x18/0x28 [ 61.266736] __might_resched+0x11c/0x180 [ 61.266743] __might_sleep+0x64/0xc8 [ 61.266745] mutex_lock+0x2c/0xc0 [ 61.266748] z_erofs_decompress_queue+0xe8/0x978 [ 61.266753] z_erofs_decompress_kickoff+0xa8/0x190 [ 61.266756] z_erofs_endio+0x168/0x288 [ 61.266758] bio_endio+0x160/0x218 [ 61.266762] blk_update_request+0x244/0x458 [ 61.266766] scsi_end_request+0x38/0x278 [ 61.266770] scsi_io_completion+0x4c/0x600 [ 61.266772] scsi_finish_command+0xc8/0xe8 [ 61.266775] scsi_complete+0x88/0x148 [ 61.266777] blk_mq_complete_request+0x3c/0x58 [ 61.266780] scsi_done_internal+0xcc/0x158 [ 61.266782] scsi_done+0x1c/0x30 [ 61.266783] ufshcd_compl_one_cqe+0x12c/0x438 [ 61.266786] __ufshcd_transfer_req_compl+0x2c/0x78 [ 61.266788] ufshcd_poll+0xf4/0x210 [ 61.266789] ufshcd_transfer_req_compl+0x50/0x88 [ 61.266791] ufshcd_intr+0x21c/0x7c8 [ 61.266792] irq_forced_thread_fn+0x44/0xd8 [ 61.266796] irq_thread+0x1a4/0x358 [ 61.266799] kthread+0x12c/0x138 [ 61.266802] ret_from_fork+0x10/0x20 [2] https://lore.kernel.org/r/58b661d0-0ebb-4b45-a10d-c5927fb791cd@paulmck-laptop Signed-off-by: Junli Liu Reviewed-by: Gao Xiang Link: https://lore.kernel.org/r/20250805011957.911186-1-liujunli@lixiang.com [ Gao Xiang: Use the original trace in v1. ] Signed-off-by: Gao Xiang Signed-off-by: Sasha Levin --- fs/erofs/zdata.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index d852b43ac43e..c1f802ecc47b 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1401,6 +1401,16 @@ static void z_erofs_decompressqueue_kthread_work(struct kthread_work *work) } #endif +/* Use (kthread_)work in atomic contexts to minimize scheduling overhead */ +static inline bool z_erofs_in_atomic(void) +{ + if (IS_ENABLED(CONFIG_PREEMPTION) && rcu_preempt_depth()) + return true; + if (!IS_ENABLED(CONFIG_PREEMPT_COUNT)) + return true; + return !preemptible(); +} + static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io, int bios) { @@ -1415,8 +1425,7 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io, if (atomic_add_return(bios, &io->pending_bios)) return; - /* Use (kthread_)work and sync decompression for atomic contexts only */ - if (!in_task() || irqs_disabled() || rcu_read_lock_any_held()) { + if (z_erofs_in_atomic()) { #ifdef CONFIG_EROFS_FS_PCPU_KTHREAD struct kthread_worker *worker; From b15342e09644effde11e8c9aefc67390bdeaaac1 Mon Sep 17 00:00:00 2001 From: Werner Sembach Date: Thu, 8 May 2025 13:16:18 +0200 Subject: [PATCH 13/76] ACPI: EC: Add device to acpi_ec_no_wakeup[] qurik list commit 9cd51eefae3c871440b93c03716c5398f41bdf78 upstream. Add the TUXEDO InfinityBook Pro AMD Gen9 to the acpi_ec_no_wakeup[] quirk list to prevent spurious wakeups. Signed-off-by: Werner Sembach Link: https://patch.msgid.link/20250508111625.12149-1-wse@tuxedocomputers.com Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman --- drivers/acpi/ec.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c index 77d6af611589..8e304efde342 100644 --- a/drivers/acpi/ec.c +++ b/drivers/acpi/ec.c @@ -2329,6 +2329,12 @@ static const struct dmi_system_id acpi_ec_no_wakeup[] = { DMI_MATCH(DMI_PRODUCT_NAME, "83Q3"), } }, + { + // TUXEDO InfinityBook Pro AMD Gen9 + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "GXxHRXx"), + }, + }, { }, }; From 9a1963404cc2eef69d2f8a42861bdf63d087dd5d Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Mon, 1 Jul 2024 07:26:52 +0200 Subject: [PATCH 14/76] nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests commit 25edbcac6e32eab345e470d56ca9974a577b878b upstream. Fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests to prepare for future changes to this code, and move the helpers to write.c as well. Signed-off-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Signed-off-by: Anna Schumaker Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman --- fs/nfs/pagelist.c | 77 ---------------------------------------- fs/nfs/write.c | 75 ++++++++++++++++++++++++++++++++++---- include/linux/nfs_page.h | 1 - 3 files changed, 69 insertions(+), 84 deletions(-) diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 040b6b79c75e..30e2488eb84c 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -206,83 +206,6 @@ nfs_page_group_lock_head(struct nfs_page *req) return head; } -/* - * nfs_unroll_locks - unlock all newly locked reqs and wait on @req - * @head: head request of page group, must be holding head lock - * @req: request that couldn't lock and needs to wait on the req bit lock - * - * This is a helper function for nfs_lock_and_join_requests - * returns 0 on success, < 0 on error. - */ -static void -nfs_unroll_locks(struct nfs_page *head, struct nfs_page *req) -{ - struct nfs_page *tmp; - - /* relinquish all the locks successfully grabbed this run */ - for (tmp = head->wb_this_page ; tmp != req; tmp = tmp->wb_this_page) { - if (!kref_read(&tmp->wb_kref)) - continue; - nfs_unlock_and_release_request(tmp); - } -} - -/* - * nfs_page_group_lock_subreq - try to lock a subrequest - * @head: head request of page group - * @subreq: request to lock - * - * This is a helper function for nfs_lock_and_join_requests which - * must be called with the head request and page group both locked. - * On error, it returns with the page group unlocked. - */ -static int -nfs_page_group_lock_subreq(struct nfs_page *head, struct nfs_page *subreq) -{ - int ret; - - if (!kref_get_unless_zero(&subreq->wb_kref)) - return 0; - while (!nfs_lock_request(subreq)) { - nfs_page_group_unlock(head); - ret = nfs_wait_on_request(subreq); - if (!ret) - ret = nfs_page_group_lock(head); - if (ret < 0) { - nfs_unroll_locks(head, subreq); - nfs_release_request(subreq); - return ret; - } - } - return 0; -} - -/* - * nfs_page_group_lock_subrequests - try to lock the subrequests - * @head: head request of page group - * - * This is a helper function for nfs_lock_and_join_requests which - * must be called with the head request locked. - */ -int nfs_page_group_lock_subrequests(struct nfs_page *head) -{ - struct nfs_page *subreq; - int ret; - - ret = nfs_page_group_lock(head); - if (ret < 0) - return ret; - /* lock each request in the page group */ - for (subreq = head->wb_this_page; subreq != head; - subreq = subreq->wb_this_page) { - ret = nfs_page_group_lock_subreq(head, subreq); - if (ret < 0) - return ret; - } - nfs_page_group_unlock(head); - return 0; -} - /* * nfs_page_set_headlock - set the request PG_HEADLOCK * @req: request that is to be locked diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 7d03811f44a4..289ff8e9f78a 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -548,6 +548,57 @@ void nfs_join_page_group(struct nfs_page *head, struct nfs_commit_info *cinfo, nfs_destroy_unlinked_subrequests(destroy_list, head, inode); } +/* + * nfs_unroll_locks - unlock all newly locked reqs and wait on @req + * @head: head request of page group, must be holding head lock + * @req: request that couldn't lock and needs to wait on the req bit lock + * + * This is a helper function for nfs_lock_and_join_requests + * returns 0 on success, < 0 on error. + */ +static void +nfs_unroll_locks(struct nfs_page *head, struct nfs_page *req) +{ + struct nfs_page *tmp; + + /* relinquish all the locks successfully grabbed this run */ + for (tmp = head->wb_this_page ; tmp != req; tmp = tmp->wb_this_page) { + if (!kref_read(&tmp->wb_kref)) + continue; + nfs_unlock_and_release_request(tmp); + } +} + +/* + * nfs_page_group_lock_subreq - try to lock a subrequest + * @head: head request of page group + * @subreq: request to lock + * + * This is a helper function for nfs_lock_and_join_requests which + * must be called with the head request and page group both locked. + * On error, it returns with the page group unlocked. + */ +static int +nfs_page_group_lock_subreq(struct nfs_page *head, struct nfs_page *subreq) +{ + int ret; + + if (!kref_get_unless_zero(&subreq->wb_kref)) + return 0; + while (!nfs_lock_request(subreq)) { + nfs_page_group_unlock(head); + ret = nfs_wait_on_request(subreq); + if (!ret) + ret = nfs_page_group_lock(head); + if (ret < 0) { + nfs_unroll_locks(head, subreq); + nfs_release_request(subreq); + return ret; + } + } + return 0; +} + /* * nfs_lock_and_join_requests - join all subreqs to the head req * @folio: the folio used to lookup the "page group" of nfs_page structures @@ -566,7 +617,7 @@ void nfs_join_page_group(struct nfs_page *head, struct nfs_commit_info *cinfo, static struct nfs_page *nfs_lock_and_join_requests(struct folio *folio) { struct inode *inode = folio_file_mapping(folio)->host; - struct nfs_page *head; + struct nfs_page *head, *subreq; struct nfs_commit_info cinfo; int ret; @@ -580,16 +631,28 @@ static struct nfs_page *nfs_lock_and_join_requests(struct folio *folio) if (IS_ERR_OR_NULL(head)) return head; + ret = nfs_page_group_lock(head); + if (ret < 0) + goto out_unlock; + /* lock each request in the page group */ - ret = nfs_page_group_lock_subrequests(head); - if (ret < 0) { - nfs_unlock_and_release_request(head); - return ERR_PTR(ret); + for (subreq = head->wb_this_page; + subreq != head; + subreq = subreq->wb_this_page) { + ret = nfs_page_group_lock_subreq(head, subreq); + if (ret < 0) + goto out_unlock; } - nfs_join_page_group(head, &cinfo, inode); + nfs_page_group_unlock(head); + nfs_init_cinfo_from_inode(&cinfo, inode); + nfs_join_page_group(head, &cinfo, inode); return head; + +out_unlock: + nfs_unlock_and_release_request(head); + return ERR_PTR(ret); } static void nfs_write_error(struct nfs_page *req, int error) diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h index 1c315f854ea8..3a0f7ebe5388 100644 --- a/include/linux/nfs_page.h +++ b/include/linux/nfs_page.h @@ -156,7 +156,6 @@ extern int nfs_wait_on_request(struct nfs_page *); extern void nfs_unlock_request(struct nfs_page *req); extern void nfs_unlock_and_release_request(struct nfs_page *); extern struct nfs_page *nfs_page_group_lock_head(struct nfs_page *req); -extern int nfs_page_group_lock_subrequests(struct nfs_page *head); extern void nfs_join_page_group(struct nfs_page *head, struct nfs_commit_info *cinfo, struct inode *inode); From 181feb41f0b268e6288bf9a7b984624d7fe2031d Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sat, 16 Aug 2025 07:25:20 -0700 Subject: [PATCH 15/76] NFS: Fix a race when updating an existing write commit 76d2e3890fb169168c73f2e4f8375c7cc24a765e upstream. After nfs_lock_and_join_requests() tests for whether the request is still attached to the mapping, nothing prevents a call to nfs_inode_remove_request() from succeeding until we actually lock the page group. The reason is that whoever called nfs_inode_remove_request() doesn't necessarily have a lock on the page group head. So in order to avoid races, let's take the page group lock earlier in nfs_lock_and_join_requests(), and hold it across the removal of the request in nfs_inode_remove_request(). Reported-by: Jeff Layton Tested-by: Joe Quanaim Tested-by: Andrew Steffen Reviewed-by: Jeff Layton Fixes: bd37d6fce184 ("NFSv4: Convert nfs_lock_and_join_requests() to use nfs_page_find_head_request()") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman --- fs/nfs/pagelist.c | 9 ++--- fs/nfs/write.c | 71 ++++++++++++++-------------------------- include/linux/nfs_page.h | 1 + 3 files changed, 31 insertions(+), 50 deletions(-) diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 30e2488eb84c..0ea3916ed1dc 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -272,13 +272,14 @@ nfs_page_group_unlock(struct nfs_page *req) nfs_page_clear_headlock(req); } -/* - * nfs_page_group_sync_on_bit_locked +/** + * nfs_page_group_sync_on_bit_locked - Test if all requests have @bit set + * @req: request in page group + * @bit: PG_* bit that is used to sync page group * * must be called with page group lock held */ -static bool -nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit) +bool nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit) { struct nfs_page *head = req->wb_head; struct nfs_page *tmp; diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 289ff8e9f78a..cb1e9996fcc8 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -156,20 +156,10 @@ nfs_page_set_inode_ref(struct nfs_page *req, struct inode *inode) } } -static int -nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode) +static void nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode) { - int ret; - - if (!test_bit(PG_REMOVE, &req->wb_flags)) - return 0; - ret = nfs_page_group_lock(req); - if (ret) - return ret; if (test_and_clear_bit(PG_REMOVE, &req->wb_flags)) nfs_page_set_inode_ref(req, inode); - nfs_page_group_unlock(req); - return 0; } static struct nfs_page *nfs_folio_private_request(struct folio *folio) @@ -238,36 +228,6 @@ static struct nfs_page *nfs_folio_find_head_request(struct folio *folio) return req; } -static struct nfs_page *nfs_folio_find_and_lock_request(struct folio *folio) -{ - struct inode *inode = folio_file_mapping(folio)->host; - struct nfs_page *req, *head; - int ret; - - for (;;) { - req = nfs_folio_find_head_request(folio); - if (!req) - return req; - head = nfs_page_group_lock_head(req); - if (head != req) - nfs_release_request(req); - if (IS_ERR(head)) - return head; - ret = nfs_cancel_remove_inode(head, inode); - if (ret < 0) { - nfs_unlock_and_release_request(head); - return ERR_PTR(ret); - } - /* Ensure that nobody removed the request before we locked it */ - if (head == nfs_folio_private_request(folio)) - break; - if (folio_test_swapcache(folio)) - break; - nfs_unlock_and_release_request(head); - } - return head; -} - /* Adjust the file length if we're writing beyond the end */ static void nfs_grow_file(struct folio *folio, unsigned int offset, unsigned int count) @@ -621,20 +581,37 @@ static struct nfs_page *nfs_lock_and_join_requests(struct folio *folio) struct nfs_commit_info cinfo; int ret; - nfs_init_cinfo_from_inode(&cinfo, inode); /* * A reference is taken only on the head request which acts as a * reference to the whole page group - the group will not be destroyed * until the head reference is released. */ - head = nfs_folio_find_and_lock_request(folio); - if (IS_ERR_OR_NULL(head)) - return head; +retry: + head = nfs_folio_find_head_request(folio); + if (!head) + return NULL; + + while (!nfs_lock_request(head)) { + ret = nfs_wait_on_request(head); + if (ret < 0) { + nfs_release_request(head); + return ERR_PTR(ret); + } + } ret = nfs_page_group_lock(head); if (ret < 0) goto out_unlock; + /* Ensure that nobody removed the request before we locked it */ + if (head != folio->private && !folio_test_swapcache(folio)) { + nfs_page_group_unlock(head); + nfs_unlock_and_release_request(head); + goto retry; + } + + nfs_cancel_remove_inode(head, inode); + /* lock each request in the page group */ for (subreq = head->wb_this_page; subreq != head; @@ -855,7 +832,8 @@ static void nfs_inode_remove_request(struct nfs_page *req) { struct nfs_inode *nfsi = NFS_I(nfs_page_to_inode(req)); - if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) { + nfs_page_group_lock(req); + if (nfs_page_group_sync_on_bit_locked(req, PG_REMOVE)) { struct folio *folio = nfs_page_to_folio(req->wb_head); struct address_space *mapping = folio_file_mapping(folio); @@ -867,6 +845,7 @@ static void nfs_inode_remove_request(struct nfs_page *req) } spin_unlock(&mapping->private_lock); } + nfs_page_group_unlock(req); if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) { atomic_long_dec(&nfsi->nrequests); diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h index 3a0f7ebe5388..6a46069c5a36 100644 --- a/include/linux/nfs_page.h +++ b/include/linux/nfs_page.h @@ -162,6 +162,7 @@ extern void nfs_join_page_group(struct nfs_page *head, extern int nfs_page_group_lock(struct nfs_page *); extern void nfs_page_group_unlock(struct nfs_page *); extern bool nfs_page_group_sync_on_bit(struct nfs_page *, unsigned int); +extern bool nfs_page_group_sync_on_bit_locked(struct nfs_page *, unsigned int); extern int nfs_page_set_headlock(struct nfs_page *req); extern void nfs_page_clear_headlock(struct nfs_page *req); extern bool nfs_async_iocounter_wait(struct rpc_task *, struct nfs_lock_context *); From 9b2700151660332a60d87c5a58eca1b4c57ae25b Mon Sep 17 00:00:00 2001 From: Nikolay Kuratov Date: Tue, 5 Aug 2025 16:09:17 +0300 Subject: [PATCH 16/76] vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put() commit dd54bcf86c91a4455b1f95cbc8e9ac91205f3193 upstream. When operating on struct vhost_net_ubuf_ref, the following execution sequence is theoretically possible: CPU0 is finalizing DMA operation CPU1 is doing VHOST_NET_SET_BACKEND // ubufs->refcount == 2 vhost_net_ubuf_put() vhost_net_ubuf_put_wait_and_free(oldubufs) vhost_net_ubuf_put_and_wait() vhost_net_ubuf_put() int r = atomic_sub_return(1, &ubufs->refcount); // r = 1 int r = atomic_sub_return(1, &ubufs->refcount); // r = 0 wait_event(ubufs->wait, !atomic_read(&ubufs->refcount)); // no wait occurs here because condition is already true kfree(ubufs); if (unlikely(!r)) wake_up(&ubufs->wait); // use-after-free This leads to use-after-free on ubufs access. This happens because CPU1 skips waiting for wake_up() when refcount is already zero. To prevent that use a read-side RCU critical section in vhost_net_ubuf_put(), as suggested by Hillf Danton. For this lock to take effect, free ubufs with kfree_rcu(). Cc: stable@vger.kernel.org Fixes: 0ad8b480d6ee9 ("vhost: fix ref cnt checking deadlock") Reported-by: Andrey Ryabinin Suggested-by: Hillf Danton Signed-off-by: Nikolay Kuratov Message-Id: <20250805130917.727332-1-kniv@yandex-team.ru> Signed-off-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman --- drivers/vhost/net.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index f2ed7167c848..5ad237d77a9a 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -96,6 +96,7 @@ struct vhost_net_ubuf_ref { atomic_t refcount; wait_queue_head_t wait; struct vhost_virtqueue *vq; + struct rcu_head rcu; }; #define VHOST_NET_BATCH 64 @@ -249,9 +250,13 @@ vhost_net_ubuf_alloc(struct vhost_virtqueue *vq, bool zcopy) static int vhost_net_ubuf_put(struct vhost_net_ubuf_ref *ubufs) { - int r = atomic_sub_return(1, &ubufs->refcount); + int r; + + rcu_read_lock(); + r = atomic_sub_return(1, &ubufs->refcount); if (unlikely(!r)) wake_up(&ubufs->wait); + rcu_read_unlock(); return r; } @@ -264,7 +269,7 @@ static void vhost_net_ubuf_put_and_wait(struct vhost_net_ubuf_ref *ubufs) static void vhost_net_ubuf_put_wait_and_free(struct vhost_net_ubuf_ref *ubufs) { vhost_net_ubuf_put_and_wait(ubufs); - kfree(ubufs); + kfree_rcu(ubufs, rcu); } static void vhost_net_clear_ubuf_info(struct vhost_net *n) From a208d67cb44ba441bd38e04e270e9f1e230234ee Mon Sep 17 00:00:00 2001 From: Oscar Maes Date: Wed, 27 Aug 2025 08:23:21 +0200 Subject: [PATCH 17/76] net: ipv4: fix regression in local-broadcast routes [ Upstream commit 5189446ba995556eaa3755a6e875bc06675b88bd ] Commit 9e30ecf23b1b ("net: ipv4: fix incorrect MTU in broadcast routes") introduced a regression where local-broadcast packets would have their gateway set in __mkroute_output, which was caused by fi = NULL being removed. Fix this by resetting the fib_info for local-broadcast packets. This preserves the intended changes for directed-broadcast packets. Cc: stable@vger.kernel.org Fixes: 9e30ecf23b1b ("net: ipv4: fix incorrect MTU in broadcast routes") Reported-by: Brett A C Sheffield Closes: https://lore.kernel.org/regressions/20250822165231.4353-4-bacs@librecast.net Signed-off-by: Oscar Maes Reviewed-by: David Ahern Link: https://patch.msgid.link/20250827062322.4807-1-oscmaes92@gmail.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin --- net/ipv4/route.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 8672ebbace98..20f5c8307443 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2547,12 +2547,16 @@ static struct rtable *__mkroute_output(const struct fib_result *res, !netif_is_l3_master(dev_out)) return ERR_PTR(-EINVAL); - if (ipv4_is_lbcast(fl4->daddr)) + if (ipv4_is_lbcast(fl4->daddr)) { type = RTN_BROADCAST; - else if (ipv4_is_multicast(fl4->daddr)) + + /* reset fi to prevent gateway resolution */ + fi = NULL; + } else if (ipv4_is_multicast(fl4->daddr)) { type = RTN_MULTICAST; - else if (ipv4_is_zeronet(fl4->daddr)) + } else if (ipv4_is_zeronet(fl4->daddr)) { return ERR_PTR(-EINVAL); + } if (dev_out->flags & IFF_LOOPBACK) flags |= RTCF_LOCAL; From 6de90c2a3b6cc35a6ef2ba334349ef076ceaf9b1 Mon Sep 17 00:00:00 2001 From: Rob Clark Date: Wed, 23 Jul 2025 13:28:22 -0700 Subject: [PATCH 18/76] drm/msm: Defer fd_install in SUBMIT ioctl [ Upstream commit f22853435bbd1e9836d0dce7fd99c040b94c2bf1 ] Avoid fd_install() until there are no more potential error paths, to avoid put_unused_fd() after the fd is made visible to userspace. Fixes: 68dc6c2d5eec ("drm/msm: Fix submit error-path leaks") Reported-by: Dan Carpenter Signed-off-by: Rob Clark Patchwork: https://patchwork.freedesktop.org/patch/665363/ Signed-off-by: Sasha Levin --- drivers/gpu/drm/msm/msm_gem_submit.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index bbe4f1665b60..dc142f9e4f60 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -981,12 +981,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) { sync_file = sync_file_create(submit->user_fence); - if (!sync_file) { + if (!sync_file) ret = -ENOMEM; - } else { - fd_install(out_fence_fd, sync_file->file); - args->fence_fd = out_fence_fd; - } } submit_attach_object_fences(submit); @@ -1013,10 +1009,14 @@ out: out_unlock: mutex_unlock(&queue->lock); out_post_unlock: - if (ret && (out_fence_fd >= 0)) { - put_unused_fd(out_fence_fd); + if (ret) { + if (out_fence_fd >= 0) + put_unused_fd(out_fence_fd); if (sync_file) fput(sync_file->file); + } else if (sync_file) { + fd_install(out_fence_fd, sync_file->file); + args->fence_fd = out_fence_fd; } if (!IS_ERR_OR_NULL(submit)) { From e7d0bd359f4cdf2b0eb99cd9659ec809e2646141 Mon Sep 17 00:00:00 2001 From: Madhavan Srinivasan Date: Sun, 18 May 2025 10:11:04 +0530 Subject: [PATCH 19/76] powerpc/kvm: Fix ifdef to remove build warning [ Upstream commit 88688a2c8ac6c8036d983ad8b34ce191c46a10aa ] When compiling for pseries or powernv defconfig with "make C=1", these warning were reported bu sparse tool in powerpc/kernel/kvm.c arch/powerpc/kernel/kvm.c:635:9: warning: switch with no cases arch/powerpc/kernel/kvm.c:646:9: warning: switch with no cases Currently #ifdef were added after the switch case which are specific for BOOKE and PPC_BOOK3S_32. These are not enabled in pseries/powernv defconfig. Fix it by moving the #ifdef before switch(){} Fixes: cbe487fac7fc0 ("KVM: PPC: Add mtsrin PV code") Tested-by: Venkat Rao Bagalkote Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/20250518044107.39928-1-maddy@linux.ibm.com Signed-off-by: Sasha Levin --- arch/powerpc/kernel/kvm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c index 5b3c093611ba..7209d00a9c25 100644 --- a/arch/powerpc/kernel/kvm.c +++ b/arch/powerpc/kernel/kvm.c @@ -632,19 +632,19 @@ static void __init kvm_check_ins(u32 *inst, u32 features) #endif } - switch (inst_no_rt & ~KVM_MASK_RB) { #ifdef CONFIG_PPC_BOOK3S_32 + switch (inst_no_rt & ~KVM_MASK_RB) { case KVM_INST_MTSRIN: if (features & KVM_MAGIC_FEAT_SR) { u32 inst_rb = _inst & KVM_MASK_RB; kvm_patch_ins_mtsrin(inst, inst_rt, inst_rb); } break; -#endif } +#endif - switch (_inst) { #ifdef CONFIG_BOOKE + switch (_inst) { case KVM_INST_WRTEEI_0: kvm_patch_ins_wrteei_0(inst); break; @@ -652,8 +652,8 @@ static void __init kvm_check_ins(u32 *inst, u32 features) case KVM_INST_WRTEEI_1: kvm_patch_ins_wrtee(inst, 0, 1); break; -#endif } +#endif } extern u32 kvm_template_start[]; From 61d733a568d8a6b5585c298af048d63751b7e96d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Exp=C3=B3sito?= Date: Thu, 14 Aug 2025 12:39:39 +0200 Subject: [PATCH 20/76] HID: input: rename hidinput_set_battery_charge_status() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit a82231b2a8712d0218fc286a9b0da328d419a3f4 ] In preparation for a patch fixing a bug affecting hidinput_set_battery_charge_status(), rename the function to hidinput_update_battery_charge_status() and move it up so it can be used by hidinput_update_battery(). Refactor, no functional changes. Tested-by: 卢国宏 Signed-off-by: José Expósito Signed-off-by: Jiri Kosina Stable-dep-of: e94536e1d181 ("HID: input: report battery status changes immediately") Signed-off-by: Sasha Levin --- drivers/hid/hid-input-test.c | 10 ++++----- drivers/hid/hid-input.c | 42 ++++++++++++++++++------------------ 2 files changed, 26 insertions(+), 26 deletions(-) diff --git a/drivers/hid/hid-input-test.c b/drivers/hid/hid-input-test.c index 77c2d45ac62a..6f5c71660d82 100644 --- a/drivers/hid/hid-input-test.c +++ b/drivers/hid/hid-input-test.c @@ -7,7 +7,7 @@ #include -static void hid_test_input_set_battery_charge_status(struct kunit *test) +static void hid_test_input_update_battery_charge_status(struct kunit *test) { struct hid_device *dev; bool handled; @@ -15,15 +15,15 @@ static void hid_test_input_set_battery_charge_status(struct kunit *test) dev = kunit_kzalloc(test, sizeof(*dev), GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, dev); - handled = hidinput_set_battery_charge_status(dev, HID_DG_HEIGHT, 0); + handled = hidinput_update_battery_charge_status(dev, HID_DG_HEIGHT, 0); KUNIT_EXPECT_FALSE(test, handled); KUNIT_EXPECT_EQ(test, dev->battery_charge_status, POWER_SUPPLY_STATUS_UNKNOWN); - handled = hidinput_set_battery_charge_status(dev, HID_BAT_CHARGING, 0); + handled = hidinput_update_battery_charge_status(dev, HID_BAT_CHARGING, 0); KUNIT_EXPECT_TRUE(test, handled); KUNIT_EXPECT_EQ(test, dev->battery_charge_status, POWER_SUPPLY_STATUS_DISCHARGING); - handled = hidinput_set_battery_charge_status(dev, HID_BAT_CHARGING, 1); + handled = hidinput_update_battery_charge_status(dev, HID_BAT_CHARGING, 1); KUNIT_EXPECT_TRUE(test, handled); KUNIT_EXPECT_EQ(test, dev->battery_charge_status, POWER_SUPPLY_STATUS_CHARGING); } @@ -63,7 +63,7 @@ static void hid_test_input_get_battery_property(struct kunit *test) } static struct kunit_case hid_input_tests[] = { - KUNIT_CASE(hid_test_input_set_battery_charge_status), + KUNIT_CASE(hid_test_input_update_battery_charge_status), KUNIT_CASE(hid_test_input_get_battery_property), { } }; diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c index 9d80635a91eb..b372b74f3e24 100644 --- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -595,6 +595,20 @@ static void hidinput_cleanup_battery(struct hid_device *dev) dev->battery = NULL; } +static bool hidinput_update_battery_charge_status(struct hid_device *dev, + unsigned int usage, int value) +{ + switch (usage) { + case HID_BAT_CHARGING: + dev->battery_charge_status = value ? + POWER_SUPPLY_STATUS_CHARGING : + POWER_SUPPLY_STATUS_DISCHARGING; + return true; + } + + return false; +} + static void hidinput_update_battery(struct hid_device *dev, int value) { int capacity; @@ -617,20 +631,6 @@ static void hidinput_update_battery(struct hid_device *dev, int value) power_supply_changed(dev->battery); } } - -static bool hidinput_set_battery_charge_status(struct hid_device *dev, - unsigned int usage, int value) -{ - switch (usage) { - case HID_BAT_CHARGING: - dev->battery_charge_status = value ? - POWER_SUPPLY_STATUS_CHARGING : - POWER_SUPPLY_STATUS_DISCHARGING; - return true; - } - - return false; -} #else /* !CONFIG_HID_BATTERY_STRENGTH */ static int hidinput_setup_battery(struct hid_device *dev, unsigned report_type, struct hid_field *field, bool is_percentage) @@ -642,15 +642,15 @@ static void hidinput_cleanup_battery(struct hid_device *dev) { } -static void hidinput_update_battery(struct hid_device *dev, int value) -{ -} - -static bool hidinput_set_battery_charge_status(struct hid_device *dev, - unsigned int usage, int value) +static bool hidinput_update_battery_charge_status(struct hid_device *dev, + unsigned int usage, int value) { return false; } + +static void hidinput_update_battery(struct hid_device *dev, int value) +{ +} #endif /* CONFIG_HID_BATTERY_STRENGTH */ static bool hidinput_field_in_collection(struct hid_device *device, struct hid_field *field, @@ -1515,7 +1515,7 @@ void hidinput_hid_event(struct hid_device *hid, struct hid_field *field, struct return; if (usage->type == EV_PWR) { - bool handled = hidinput_set_battery_charge_status(hid, usage->hid, value); + bool handled = hidinput_update_battery_charge_status(hid, usage->hid, value); if (!handled) hidinput_update_battery(hid, value); From 430786612abe66901d1fc721570d90fa2e879d83 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Exp=C3=B3sito?= Date: Thu, 14 Aug 2025 12:39:40 +0200 Subject: [PATCH 21/76] HID: input: report battery status changes immediately MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit e94536e1d1818b0989aa19b443b7089f50133c35 ] Previously, the battery status (charging/discharging) was not reported immediately to user-space.  For most input devices, this wasn't problematic because changing their battery status requires connecting them to a different bus. For example, a gamepad would report a discharging status while connected via Bluetooth and a charging status while connected via USB. However, certain devices are not connected or disconnected when their battery status changes. For example, a phone battery changes its status without connecting or disconnecting it. In these cases, the battery status was not reported immediately to user space. Report battery status changes immediately to user space to support these kinds of devices. Fixes: a608dc1c0639 ("HID: input: map battery system charging") Reported-by: 卢国宏 Closes: https://lore.kernel.org/linux-input/aI49Im0sGb6fpgc8@fedora/T/ Tested-by: 卢国宏 Signed-off-by: José Expósito Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin --- drivers/hid/hid-input.c | 23 ++++++++++------------- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c index b372b74f3e24..f5c217ac4bfa 100644 --- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -609,13 +609,19 @@ static bool hidinput_update_battery_charge_status(struct hid_device *dev, return false; } -static void hidinput_update_battery(struct hid_device *dev, int value) +static void hidinput_update_battery(struct hid_device *dev, unsigned int usage, + int value) { int capacity; if (!dev->battery) return; + if (hidinput_update_battery_charge_status(dev, usage, value)) { + power_supply_changed(dev->battery); + return; + } + if (value == 0 || value < dev->battery_min || value > dev->battery_max) return; @@ -642,13 +648,8 @@ static void hidinput_cleanup_battery(struct hid_device *dev) { } -static bool hidinput_update_battery_charge_status(struct hid_device *dev, - unsigned int usage, int value) -{ - return false; -} - -static void hidinput_update_battery(struct hid_device *dev, int value) +static void hidinput_update_battery(struct hid_device *dev, unsigned int usage, + int value) { } #endif /* CONFIG_HID_BATTERY_STRENGTH */ @@ -1515,11 +1516,7 @@ void hidinput_hid_event(struct hid_device *hid, struct hid_field *field, struct return; if (usage->type == EV_PWR) { - bool handled = hidinput_update_battery_charge_status(hid, usage->hid, value); - - if (!handled) - hidinput_update_battery(hid, value); - + hidinput_update_battery(hid, usage->hid, value); return; } From 4c549d87f016cd1c04ca4c25d86dffc839206bc8 Mon Sep 17 00:00:00 2001 From: Ludovico de Nittis Date: Tue, 12 Aug 2025 17:55:26 +0200 Subject: [PATCH 22/76] Bluetooth: hci_event: Treat UNKNOWN_CONN_ID on disconnect as success [ Upstream commit 2f050a5392b7a0928bf836d9891df4851463512c ] When the host sends an HCI_OP_DISCONNECT command, the controller may respond with the status HCI_ERROR_UNKNOWN_CONN_ID (0x02). E.g. this can happen on resume from suspend, if the link was terminated by the remote device before the event mask was correctly set. This is a btmon snippet that shows the issue: ``` > ACL Data RX: Handle 3 flags 0x02 dlen 12 L2CAP: Disconnection Request (0x06) ident 5 len 4 Destination CID: 65 Source CID: 72 < ACL Data TX: Handle 3 flags 0x00 dlen 12 L2CAP: Disconnection Response (0x07) ident 5 len 4 Destination CID: 65 Source CID: 72 > ACL Data RX: Handle 3 flags 0x02 dlen 12 L2CAP: Disconnection Request (0x06) ident 6 len 4 Destination CID: 64 Source CID: 71 < ACL Data TX: Handle 3 flags 0x00 dlen 12 L2CAP: Disconnection Response (0x07) ident 6 len 4 Destination CID: 64 Source CID: 71 < HCI Command: Set Event Mask (0x03|0x0001) plen 8 Mask: 0x3dbff807fffbffff Inquiry Complete Inquiry Result Connection Complete Connection Request Disconnection Complete Authentication Complete [...] < HCI Command: Disconnect (0x01|0x0006) plen 3 Handle: 3 Address: 78:20:A5:4A:DF:28 (Nintendo Co.,Ltd) Reason: Remote User Terminated Connection (0x13) > HCI Event: Command Status (0x0f) plen 4 Disconnect (0x01|0x0006) ncmd 1 Status: Unknown Connection Identifier (0x02) ``` Currently, the hci_cs_disconnect function treats any non-zero status as a command failure. This can be misleading because the connection is indeed being terminated and the controller is confirming that is has no knowledge of that connection handle. Meaning that the initial request of disconnecting a device should be treated as done. With this change we allow the function to proceed, following the success path, which correctly calls `mgmt_device_disconnected` and ensures a consistent state. Link: https://github.com/bluez/bluez/issues/1226 Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier") Signed-off-by: Ludovico de Nittis Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin --- net/bluetooth/hci_event.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index c06010c0d882..fb5e4e4858f7 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -2692,7 +2692,7 @@ static void hci_cs_disconnect(struct hci_dev *hdev, u8 status) if (!conn) goto unlock; - if (status) { + if (status && status != HCI_ERROR_UNKNOWN_CONN_ID) { mgmt_disconnect_failed(hdev, &conn->dst, conn->type, conn->dst_type, status); From f356ed87c78ca9863d9eb5102a84156b80d27fb0 Mon Sep 17 00:00:00 2001 From: Ludovico de Nittis Date: Tue, 12 Aug 2025 17:55:27 +0200 Subject: [PATCH 23/76] Bluetooth: hci_event: Mark connection as closed during suspend disconnect [ Upstream commit b7fafbc499b5ee164018eb0eefe9027f5a6aaad2 ] When suspending, the disconnect command for an active Bluetooth connection could be issued, but the corresponding `HCI_EV_DISCONN_COMPLETE` event might not be received before the system completes the suspend process. This can lead to an inconsistent state. On resume, the controller may auto-accept reconnections from the same device (due to suspend event filters), but these new connections are rejected by the kernel which still has connection objects from before suspend. Resulting in errors like: ``` kernel: Bluetooth: hci0: ACL packet for unknown connection handle 1 kernel: Bluetooth: hci0: Ignoring HCI_Connection_Complete for existing connection ``` This is a btmon snippet that shows the issue: ``` < HCI Command: Disconnect (0x01|0x0006) plen 3 Handle: 1 Address: 78:20:A5:4A:DF:28 (Nintendo Co.,Ltd) Reason: Remote User Terminated Connection (0x13) > HCI Event: Command Status (0x0f) plen 4 Disconnect (0x01|0x0006) ncmd 2 Status: Success (0x00) [...] // Host suspends with the event filter set for the device // On resume, the device tries to reconnect with a new handle > HCI Event: Connect Complete (0x03) plen 11 Status: Success (0x00) Handle: 2 Address: 78:20:A5:4A:DF:28 (Nintendo Co.,Ltd) // Kernel ignores this event because there is an existing connection with // handle 1 ``` By explicitly setting the connection state to BT_CLOSED we can ensure a consistent state, even if we don't receive the disconnect complete event in time. Link: https://github.com/bluez/bluez/issues/1226 Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier") Signed-off-by: Ludovico de Nittis Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin --- net/bluetooth/hci_event.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index fb5e4e4858f7..a07ad1c99a4b 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -2707,6 +2707,12 @@ static void hci_cs_disconnect(struct hci_dev *hdev, u8 status) goto done; } + /* During suspend, mark connection as closed immediately + * since we might not receive HCI_EV_DISCONN_COMPLETE + */ + if (hdev->suspended) + conn->state = BT_CLOSED; + mgmt_conn = test_and_clear_bit(HCI_CONN_MGMT_CONNECTED, &conn->flags); if (conn->type == ACL_LINK) { From 961abec8184a5ed76a2e975c6f8c5329ab85182d Mon Sep 17 00:00:00 2001 From: Luiz Augusto von Dentz Date: Wed, 20 Aug 2025 17:04:00 -0400 Subject: [PATCH 24/76] Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced [ Upstream commit 15bf2c6391bafb14a3020d06ec0761bce0803463 ] This attempts to detect if HCI_EV_NUM_COMP_PKTS contain an unbalanced (more than currently considered outstanding) number of packets otherwise it could cause the hcon->sent to underflow and loop around breaking the tracking of the outstanding packets pending acknowledgment. Fixes: f42809185896 ("Bluetooth: Simplify num_comp_pkts_evt function") Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin --- net/bluetooth/hci_event.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index a07ad1c99a4b..5eed23b8d6c3 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -4392,7 +4392,17 @@ static void hci_num_comp_pkts_evt(struct hci_dev *hdev, void *data, if (!conn) continue; - conn->sent -= count; + /* Check if there is really enough packets outstanding before + * attempting to decrease the sent counter otherwise it could + * underflow.. + */ + if (conn->sent >= count) { + conn->sent -= count; + } else { + bt_dev_warn(hdev, "hcon %p sent %u < count %u", + conn, conn->sent, count); + conn->sent = 0; + } switch (conn->type) { case ACL_LINK: From 4bd2866db0025d8943aa2fef454ca719daec4d6a Mon Sep 17 00:00:00 2001 From: Pavel Shpakovskiy Date: Fri, 22 Aug 2025 12:20:55 +0300 Subject: [PATCH 25/76] Bluetooth: hci_sync: fix set_local_name race condition [ Upstream commit 6bbd0d3f0c23fc53c17409dd7476f38ae0ff0cd9 ] Function set_name_sync() uses hdev->dev_name field to send HCI_OP_WRITE_LOCAL_NAME command, but copying from data to hdev->dev_name is called after mgmt cmd was queued, so it is possible that function set_name_sync() will read old name value. This change adds name as a parameter for function hci_update_name_sync() to avoid race condition. Fixes: 6f6ff38a1e14 ("Bluetooth: hci_sync: Convert MGMT_OP_SET_LOCAL_NAME") Signed-off-by: Pavel Shpakovskiy Reviewed-by: Paul Menzel Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin --- include/net/bluetooth/hci_sync.h | 2 +- net/bluetooth/hci_sync.c | 6 +++--- net/bluetooth/mgmt.c | 5 ++++- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/include/net/bluetooth/hci_sync.h b/include/net/bluetooth/hci_sync.h index 3cb2d10cac93..e2e588b08fe9 100644 --- a/include/net/bluetooth/hci_sync.h +++ b/include/net/bluetooth/hci_sync.h @@ -72,7 +72,7 @@ int hci_update_class_sync(struct hci_dev *hdev); int hci_update_eir_sync(struct hci_dev *hdev); int hci_update_class_sync(struct hci_dev *hdev); -int hci_update_name_sync(struct hci_dev *hdev); +int hci_update_name_sync(struct hci_dev *hdev, const u8 *name); int hci_write_ssp_mode_sync(struct hci_dev *hdev, u8 mode); int hci_get_random_address(struct hci_dev *hdev, bool require_privacy, diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index 01aca0770711..020f1809fc99 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -3491,13 +3491,13 @@ int hci_update_scan_sync(struct hci_dev *hdev) return hci_write_scan_enable_sync(hdev, scan); } -int hci_update_name_sync(struct hci_dev *hdev) +int hci_update_name_sync(struct hci_dev *hdev, const u8 *name) { struct hci_cp_write_local_name cp; memset(&cp, 0, sizeof(cp)); - memcpy(cp.name, hdev->dev_name, sizeof(cp.name)); + memcpy(cp.name, name, sizeof(cp.name)); return __hci_cmd_sync_status(hdev, HCI_OP_WRITE_LOCAL_NAME, sizeof(cp), &cp, @@ -3550,7 +3550,7 @@ int hci_powered_update_sync(struct hci_dev *hdev) hci_write_fast_connectable_sync(hdev, false); hci_update_scan_sync(hdev); hci_update_class_sync(hdev); - hci_update_name_sync(hdev); + hci_update_name_sync(hdev, hdev->dev_name); hci_update_eir_sync(hdev); } diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index 82fa8c28438f..9b01eaaa0eb2 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -3819,8 +3819,11 @@ static void set_name_complete(struct hci_dev *hdev, void *data, int err) static int set_name_sync(struct hci_dev *hdev, void *data) { + struct mgmt_pending_cmd *cmd = data; + struct mgmt_cp_set_local_name *cp = cmd->param; + if (lmp_bredr_capable(hdev)) { - hci_update_name_sync(hdev); + hci_update_name_sync(hdev, cp->name); hci_update_eir_sync(hdev); } From 3c80c230d6e3e6f63d43f4c3f0bb344e3e8b119b Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Thu, 21 Aug 2025 02:18:24 +0000 Subject: [PATCH 26/76] atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control(). [ Upstream commit ec79003c5f9d2c7f9576fc69b8dbda80305cbe3a ] syzbot reported the splat below. [0] When atmtcp_v_open() or atmtcp_v_close() is called via connect() or close(), atmtcp_send_control() is called to send an in-kernel special message. The message has ATMTCP_HDR_MAGIC in atmtcp_control.hdr.length. Also, a pointer of struct atm_vcc is set to atmtcp_control.vcc. The notable thing is struct atmtcp_control is uAPI but has a space for an in-kernel pointer. struct atmtcp_control { struct atmtcp_hdr hdr; /* must be first */ ... atm_kptr_t vcc; /* both directions */ ... } __ATM_API_ALIGN; typedef struct { unsigned char _[8]; } __ATM_API_ALIGN atm_kptr_t; The special message is processed in atmtcp_recv_control() called from atmtcp_c_send(). atmtcp_c_send() is vcc->dev->ops->send() and called from 2 paths: 1. .ndo_start_xmit() (vcc->send() == atm_send_aal0()) 2. vcc_sendmsg() The problem is sendmsg() does not validate the message length and userspace can abuse atmtcp_recv_control() to overwrite any kptr by atmtcp_control. Let's add a new ->pre_send() hook to validate messages from sendmsg(). [0]: Oops: general protection fault, probably for non-canonical address 0xdffffc00200000ab: 0000 [#1] SMP KASAN PTI KASAN: probably user-memory-access in range [0x0000000100000558-0x000000010000055f] CPU: 0 UID: 0 PID: 5865 Comm: syz-executor331 Not tainted 6.17.0-rc1-syzkaller-00215-gbab3ce404553 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:atmtcp_recv_control drivers/atm/atmtcp.c:93 [inline] RIP: 0010:atmtcp_c_send+0x1da/0x950 drivers/atm/atmtcp.c:297 Code: 4d 8d 75 1a 4c 89 f0 48 c1 e8 03 42 0f b6 04 20 84 c0 0f 85 15 06 00 00 41 0f b7 1e 4d 8d b7 60 05 00 00 4c 89 f0 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 13 06 00 00 66 41 89 1e 4d 8d 75 1c 4c RSP: 0018:ffffc90003f5f810 EFLAGS: 00010203 RAX: 00000000200000ab RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88802a510000 RSI: 00000000ffffffff RDI: ffff888030a6068c RBP: ffff88802699fb40 R08: ffff888030a606eb R09: 1ffff1100614c0dd R10: dffffc0000000000 R11: ffffffff8718fc40 R12: dffffc0000000000 R13: ffff888030a60680 R14: 000000010000055f R15: 00000000ffffffff FS: 00007f8d7e9236c0(0000) GS:ffff888125c1c000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000045ad50 CR3: 0000000075bde000 CR4: 00000000003526f0 Call Trace: vcc_sendmsg+0xa10/0xc60 net/atm/common.c:645 sock_sendmsg_nosec net/socket.c:714 [inline] __sock_sendmsg+0x219/0x270 net/socket.c:729 ____sys_sendmsg+0x505/0x830 net/socket.c:2614 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2668 __sys_sendmsg net/socket.c:2700 [inline] __do_sys_sendmsg net/socket.c:2705 [inline] __se_sys_sendmsg net/socket.c:2703 [inline] __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2703 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f8d7e96a4a9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f8d7e923198 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f8d7e9f4308 RCX: 00007f8d7e96a4a9 RDX: 0000000000000000 RSI: 0000200000000240 RDI: 0000000000000005 RBP: 00007f8d7e9f4300 R08: 65732f636f72702f R09: 65732f636f72702f R10: 65732f636f72702f R11: 0000000000000246 R12: 00007f8d7e9c10ac R13: 00007f8d7e9231a0 R14: 0000200000000200 R15: 0000200000000250 Modules linked in: Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+1741b56d54536f4ec349@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68a6767c.050a0220.3d78fd.0011.GAE@google.com/ Tested-by: syzbot+1741b56d54536f4ec349@syzkaller.appspotmail.com Signed-off-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20250821021901.2814721-1-kuniyu@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/atm/atmtcp.c | 17 ++++++++++++++--- include/linux/atmdev.h | 1 + net/atm/common.c | 15 ++++++++++++--- 3 files changed, 27 insertions(+), 6 deletions(-) diff --git a/drivers/atm/atmtcp.c b/drivers/atm/atmtcp.c index ff558908897f..9c83fb29b2f1 100644 --- a/drivers/atm/atmtcp.c +++ b/drivers/atm/atmtcp.c @@ -279,6 +279,19 @@ static struct atm_vcc *find_vcc(struct atm_dev *dev, short vpi, int vci) return NULL; } +static int atmtcp_c_pre_send(struct atm_vcc *vcc, struct sk_buff *skb) +{ + struct atmtcp_hdr *hdr; + + if (skb->len < sizeof(struct atmtcp_hdr)) + return -EINVAL; + + hdr = (struct atmtcp_hdr *)skb->data; + if (hdr->length == ATMTCP_HDR_MAGIC) + return -EINVAL; + + return 0; +} static int atmtcp_c_send(struct atm_vcc *vcc,struct sk_buff *skb) { @@ -288,9 +301,6 @@ static int atmtcp_c_send(struct atm_vcc *vcc,struct sk_buff *skb) struct sk_buff *new_skb; int result = 0; - if (skb->len < sizeof(struct atmtcp_hdr)) - goto done; - dev = vcc->dev_data; hdr = (struct atmtcp_hdr *) skb->data; if (hdr->length == ATMTCP_HDR_MAGIC) { @@ -347,6 +357,7 @@ static const struct atmdev_ops atmtcp_v_dev_ops = { static const struct atmdev_ops atmtcp_c_dev_ops = { .close = atmtcp_c_close, + .pre_send = atmtcp_c_pre_send, .send = atmtcp_c_send }; diff --git a/include/linux/atmdev.h b/include/linux/atmdev.h index 45f2f278b50a..70807c679f1a 100644 --- a/include/linux/atmdev.h +++ b/include/linux/atmdev.h @@ -185,6 +185,7 @@ struct atmdev_ops { /* only send is required */ int (*compat_ioctl)(struct atm_dev *dev,unsigned int cmd, void __user *arg); #endif + int (*pre_send)(struct atm_vcc *vcc, struct sk_buff *skb); int (*send)(struct atm_vcc *vcc,struct sk_buff *skb); int (*send_bh)(struct atm_vcc *vcc, struct sk_buff *skb); int (*send_oam)(struct atm_vcc *vcc,void *cell,int flags); diff --git a/net/atm/common.c b/net/atm/common.c index 9cc82acbc735..48bb3f66a3f2 100644 --- a/net/atm/common.c +++ b/net/atm/common.c @@ -635,18 +635,27 @@ int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t size) skb->dev = NULL; /* for paths shared with net_device interfaces */ if (!copy_from_iter_full(skb_put(skb, size), size, &m->msg_iter)) { - atm_return_tx(vcc, skb); - kfree_skb(skb); error = -EFAULT; - goto out; + goto free_skb; } if (eff != size) memset(skb->data + size, 0, eff-size); + + if (vcc->dev->ops->pre_send) { + error = vcc->dev->ops->pre_send(vcc, skb); + if (error) + goto free_skb; + } + error = vcc->dev->ops->send(vcc, skb); error = error ? error : size; out: release_sock(sk); return error; +free_skb: + atm_return_tx(vcc, skb); + kfree_skb(skb); + goto out; } __poll_t vcc_poll(struct file *file, struct socket *sock, poll_table *wait) From 32a4984456418fc26a512615b758f57b153d50f2 Mon Sep 17 00:00:00 2001 From: Timur Tabi Date: Tue, 12 Aug 2025 19:10:03 -0500 Subject: [PATCH 27/76] drm/nouveau: remove unused increment in gm200_flcn_pio_imem_wr [ Upstream commit f529b8915543fb9ceb732cec5571f7fe12bc9530 ] The 'tag' parameter is passed by value and is not actually used after being incremented, so remove the increment. It's the function that calls gm200_flcn_pio_imem_wr that is supposed to (and does) increment 'tag'. Fixes: 0e44c2170876 ("drm/nouveau/flcn: new code to load+boot simple HS FWs (VPR scrubber)") Reviewed-by: Philipp Stanner Signed-off-by: Timur Tabi Link: https://lore.kernel.org/r/20250813001004.2986092-2-ttabi@nvidia.com Signed-off-by: Danilo Krummrich Signed-off-by: Sasha Levin --- drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c b/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c index b7da3ab44c27..6a004c6e6742 100644 --- a/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c +++ b/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c @@ -103,7 +103,7 @@ gm200_flcn_pio_imem_wr_init(struct nvkm_falcon *falcon, u8 port, bool sec, u32 i static void gm200_flcn_pio_imem_wr(struct nvkm_falcon *falcon, u8 port, const u8 *img, int len, u16 tag) { - nvkm_falcon_wr32(falcon, 0x188 + (port * 0x10), tag++); + nvkm_falcon_wr32(falcon, 0x188 + (port * 0x10), tag); while (len >= 4) { nvkm_falcon_wr32(falcon, 0x184 + (port * 0x10), *(u32 *)img); img += 4; From 4de399767ddc9d6a0174af056eba3364e12593c9 Mon Sep 17 00:00:00 2001 From: Timur Tabi Date: Tue, 12 Aug 2025 19:10:04 -0500 Subject: [PATCH 28/76] drm/nouveau: remove unused memory target test [ Upstream commit 64c722b5e7f6b909b0e448e580f64628a0d76208 ] The memory target check is a hold-over from a refactor. It's harmless but distracting, so just remove it. Fixes: 2541626cfb79 ("drm/nouveau/acr: use common falcon HS FW code for ACR FWs") Signed-off-by: Timur Tabi Link: https://lore.kernel.org/r/20250813001004.2986092-3-ttabi@nvidia.com Signed-off-by: Danilo Krummrich Signed-off-by: Sasha Levin --- drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c b/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c index 6a004c6e6742..7c43397c19e6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c +++ b/drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c @@ -249,9 +249,11 @@ int gm200_flcn_fw_load(struct nvkm_falcon_fw *fw) { struct nvkm_falcon *falcon = fw->falcon; - int target, ret; + int ret; if (fw->inst) { + int target; + nvkm_falcon_mask(falcon, 0x048, 0x00000001, 0x00000001); switch (nvkm_memory_target(fw->inst)) { @@ -285,15 +287,6 @@ gm200_flcn_fw_load(struct nvkm_falcon_fw *fw) } if (fw->boot) { - switch (nvkm_memory_target(&fw->fw.mem.memory)) { - case NVKM_MEM_TARGET_VRAM: target = 4; break; - case NVKM_MEM_TARGET_HOST: target = 5; break; - case NVKM_MEM_TARGET_NCOH: target = 6; break; - default: - WARN_ON(1); - return -EINVAL; - } - ret = nvkm_falcon_pio_wr(falcon, fw->boot, 0, 0, IMEM, falcon->code.limit - fw->boot_size, fw->boot_size, fw->boot_addr >> 8, false); From 40a9f217cde125996fbd14e9384d46aa3610230d Mon Sep 17 00:00:00 2001 From: Larysa Zaremba Date: Tue, 5 Dec 2023 22:08:33 +0100 Subject: [PATCH 29/76] ice: Introduce ice_xdp_buff [ Upstream commit d951c14ad237b087f0d1377c44932fcc0b322c40 ] In order to use XDP hints via kfuncs we need to put RX descriptor and miscellaneous data next to xdp_buff. Same as in hints implementations in other drivers, we achieve this through putting xdp_buff into a child structure. Currently, xdp_buff is stored in the ring structure, so replace it with union that includes child structure. This way enough memory is available while existing XDP code remains isolated from hints. Minimum size of the new child structure (ice_xdp_buff) is exactly 64 bytes (single cache line). To place it at the start of a cache line, move 'next' field from CL1 to CL4, as it isn't used often. This still leaves 192 bits available in CL3 for packet context extensions. Signed-off-by: Larysa Zaremba Reviewed-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20231205210847.28460-5-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov Stable-dep-of: b1a0c977c6f1 ("ice: fix incorrect counter for buffer allocation failures") Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_txrx.c | 7 +++++-- drivers/net/ethernet/intel/ice/ice_txrx.h | 18 +++++++++++++++--- drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 ++++++++++ 3 files changed, 30 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index f5023ac9ab83..46f10f32924e 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -528,13 +528,14 @@ err: * @xdp_prog: XDP program to run * @xdp_ring: ring to be used for XDP_TX action * @rx_buf: Rx buffer to store the XDP action + * @eop_desc: Last descriptor in packet to read metadata from * * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR} */ static void ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring, - struct ice_rx_buf *rx_buf) + struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc) { unsigned int ret = ICE_XDP_PASS; u32 act; @@ -542,6 +543,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, if (!xdp_prog) goto exit; + ice_xdp_meta_set_desc(xdp, eop_desc); + act = bpf_prog_run_xdp(xdp_prog, xdp); switch (act) { case XDP_PASS: @@ -1240,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) if (ice_is_non_eop(rx_ring, rx_desc)) continue; - ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf); + ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc); if (rx_buf->act == ICE_XDP_PASS) goto construct_skb; total_rx_bytes += xdp_get_buff_len(xdp); diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index 407d4c320097..ad2d286bd359 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -257,6 +257,14 @@ enum ice_rx_dtype { ICE_RX_DTYPE_SPLIT_ALWAYS = 2, }; +struct ice_xdp_buff { + struct xdp_buff xdp_buff; + const union ice_32b_rx_flex_desc *eop_desc; +}; + +/* Required for compatibility with xdp_buffs from xsk_pool */ +static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0); + /* indices into GLINT_ITR registers */ #define ICE_RX_ITR ICE_IDX_ITR0 #define ICE_TX_ITR ICE_IDX_ITR1 @@ -298,7 +306,6 @@ enum ice_dynamic_itr { /* descriptor ring, associated with a VSI */ struct ice_rx_ring { /* CL1 - 1st cacheline starts here */ - struct ice_rx_ring *next; /* pointer to next ring in q_vector */ void *desc; /* Descriptor ring memory */ struct device *dev; /* Used for DMA mapping */ struct net_device *netdev; /* netdev ring maps to */ @@ -310,12 +317,16 @@ struct ice_rx_ring { u16 count; /* Number of descriptors */ u16 reg_idx; /* HW register index of the ring */ u16 next_to_alloc; - /* CL2 - 2nd cacheline starts here */ + union { struct ice_rx_buf *rx_buf; struct xdp_buff **xdp_buf; }; - struct xdp_buff xdp; + /* CL2 - 2nd cacheline starts here */ + union { + struct ice_xdp_buff xdp_ext; + struct xdp_buff xdp; + }; /* CL3 - 3rd cacheline starts here */ struct bpf_prog *xdp_prog; u16 rx_offset; @@ -332,6 +343,7 @@ struct ice_rx_ring { /* CL4 - 4th cacheline starts here */ struct ice_channel *ch; struct ice_tx_ring *xdp_ring; + struct ice_rx_ring *next; /* pointer to next ring in q_vector */ struct xsk_buff_pool *xsk_pool; u32 nr_frags; dma_addr_t dma; /* physical address of ring */ diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h index b0e56675f98b..00971f849135 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h @@ -164,4 +164,14 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 ptype); void ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag); + +static inline void +ice_xdp_meta_set_desc(struct xdp_buff *xdp, + union ice_32b_rx_flex_desc *eop_desc) +{ + struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff, + xdp_buff); + + xdp_ext->eop_desc = eop_desc; +} #endif /* !_ICE_TXRX_LIB_H_ */ From 29bcd31ace16910f2765a22125e4baf4fc7c2c95 Mon Sep 17 00:00:00 2001 From: Maciej Fijalkowski Date: Thu, 23 Jan 2025 16:01:17 +0100 Subject: [PATCH 30/76] ice: gather page_count()'s of each frag right before XDP prog call [ Upstream commit 11c4aa074d547d825b19cd8d9f288254d89d805c ] If we store the pgcnt on few fragments while being in the middle of gathering the whole frame and we stumbled upon DD bit not being set, we terminate the NAPI Rx processing loop and come back later on. Then on next NAPI execution we work on previously stored pgcnt. Imagine that second half of page was used actively by networking stack and by the time we came back, stack is not busy with this page anymore and decremented the refcnt. The page reuse algorithm in this case should be good to reuse the page but given the old refcnt it will not do so and attempt to release the page via page_frag_cache_drain() with pagecnt_bias used as an arg. This in turn will result in negative refcnt on struct page, which was initially observed by Xu Du. Therefore, move the page count storage from ice_get_rx_buf() to a place where we are sure that whole frame has been collected, but before calling XDP program as it internally can also change the page count of fragments belonging to xdp_buff. Fixes: ac0753391195 ("ice: Store page count inside ice_rx_buf") Reported-and-tested-by: Xu Du Reviewed-by: Przemek Kitszel Reviewed-by: Simon Horman Co-developed-by: Jacob Keller Signed-off-by: Jacob Keller Signed-off-by: Maciej Fijalkowski Tested-by: Chandan Kumar Rout (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: b1a0c977c6f1 ("ice: fix incorrect counter for buffer allocation failures") Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_txrx.c | 27 ++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 46f10f32924e..beb5c8643fe7 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -924,7 +924,6 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size, struct ice_rx_buf *rx_buf; rx_buf = &rx_ring->rx_buf[ntc]; - rx_buf->pgcnt = page_count(rx_buf->page); prefetchw(rx_buf->page); if (!size) @@ -940,6 +939,31 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size, return rx_buf; } +/** + * ice_get_pgcnts - grab page_count() for gathered fragments + * @rx_ring: Rx descriptor ring to store the page counts on + * + * This function is intended to be called right before running XDP + * program so that the page recycling mechanism will be able to take + * a correct decision regarding underlying pages; this is done in such + * way as XDP program can change the refcount of page + */ +static void ice_get_pgcnts(struct ice_rx_ring *rx_ring) +{ + u32 nr_frags = rx_ring->nr_frags + 1; + u32 idx = rx_ring->first_desc; + struct ice_rx_buf *rx_buf; + u32 cnt = rx_ring->count; + + for (int i = 0; i < nr_frags; i++) { + rx_buf = &rx_ring->rx_buf[idx]; + rx_buf->pgcnt = page_count(rx_buf->page); + + if (++idx == cnt) + idx = 0; + } +} + /** * ice_build_skb - Build skb around an existing buffer * @rx_ring: Rx descriptor ring to transact packets on @@ -1243,6 +1267,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) if (ice_is_non_eop(rx_ring, rx_desc)) continue; + ice_get_pgcnts(rx_ring); ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc); if (rx_buf->act == ICE_XDP_PASS) goto construct_skb; From 05fc7307e352015b97386e2878d8aff22d48382d Mon Sep 17 00:00:00 2001 From: Maciej Fijalkowski Date: Thu, 23 Jan 2025 16:01:18 +0100 Subject: [PATCH 31/76] ice: stop storing XDP verdict within ice_rx_buf [ Upstream commit 468a1952df78f65c5991b7ac885c8b5b7dd87bab ] Idea behind having ice_rx_buf::act was to simplify and speed up the Rx data path by walking through buffers that were representing cleaned HW Rx descriptors. Since it caused us a major headache recently and we rolled back to old approach that 'puts' Rx buffers right after running XDP prog/creating skb, this is useless now and should be removed. Get rid of ice_rx_buf::act and related logic. We still need to take care of a corner case where XDP program releases a particular fragment. Make ice_run_xdp() to return its result and use it within ice_put_rx_mbuf(). Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reviewed-by: Przemek Kitszel Reviewed-by: Simon Horman Signed-off-by: Maciej Fijalkowski Tested-by: Chandan Kumar Rout (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: b1a0c977c6f1 ("ice: fix incorrect counter for buffer allocation failures") Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_txrx.c | 62 +++++++++++-------- drivers/net/ethernet/intel/ice/ice_txrx.h | 1 - drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 43 ------------- 3 files changed, 36 insertions(+), 70 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index beb5c8643fe7..e7084c6aab6d 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -527,15 +527,14 @@ err: * @xdp: xdp_buff used as input to the XDP program * @xdp_prog: XDP program to run * @xdp_ring: ring to be used for XDP_TX action - * @rx_buf: Rx buffer to store the XDP action * @eop_desc: Last descriptor in packet to read metadata from * * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR} */ -static void +static u32 ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring, - struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc) + union ice_32b_rx_flex_desc *eop_desc) { unsigned int ret = ICE_XDP_PASS; u32 act; @@ -574,7 +573,7 @@ out_failure: ret = ICE_XDP_CONSUMED; } exit: - ice_set_rx_bufs_act(xdp, rx_ring, ret); + return ret; } /** @@ -860,10 +859,8 @@ ice_add_xdp_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, xdp_buff_set_frags_flag(xdp); } - if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS)) { - ice_set_rx_bufs_act(xdp, rx_ring, ICE_XDP_CONSUMED); + if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS)) return -ENOMEM; - } __skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buf->page, rx_buf->page_offset, size); @@ -1076,12 +1073,12 @@ ice_construct_skb(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp) rx_buf->page_offset + headlen, size, xdp->frame_sz); } else { - /* buffer is unused, change the act that should be taken later - * on; data was copied onto skb's linear part so there's no + /* buffer is unused, restore biased page count in Rx buffer; + * data was copied onto skb's linear part so there's no * need for adjusting page offset and we can reuse this buffer * as-is */ - rx_buf->act = ICE_SKB_CONSUMED; + rx_buf->pagecnt_bias++; } if (unlikely(xdp_buff_has_frags(xdp))) { @@ -1134,29 +1131,34 @@ ice_put_rx_buf(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf) * @xdp: XDP buffer carrying linear + frags part * @xdp_xmit: XDP_TX/XDP_REDIRECT verdict storage * @ntc: a current next_to_clean value to be stored at rx_ring + * @verdict: return code from XDP program execution * * Walk through gathered fragments and satisfy internal page * recycle mechanism; we take here an action related to verdict * returned by XDP program; */ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, - u32 *xdp_xmit, u32 ntc) + u32 *xdp_xmit, u32 ntc, u32 verdict) { u32 nr_frags = rx_ring->nr_frags + 1; u32 idx = rx_ring->first_desc; u32 cnt = rx_ring->count; + u32 post_xdp_frags = 1; struct ice_rx_buf *buf; int i; - for (i = 0; i < nr_frags; i++) { + if (unlikely(xdp_buff_has_frags(xdp))) + post_xdp_frags += xdp_get_shared_info_from_buff(xdp)->nr_frags; + + for (i = 0; i < post_xdp_frags; i++) { buf = &rx_ring->rx_buf[idx]; - if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) { + if (verdict & (ICE_XDP_TX | ICE_XDP_REDIR)) { ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz); - *xdp_xmit |= buf->act; - } else if (buf->act & ICE_XDP_CONSUMED) { + *xdp_xmit |= verdict; + } else if (verdict & ICE_XDP_CONSUMED) { buf->pagecnt_bias++; - } else if (buf->act == ICE_XDP_PASS) { + } else if (verdict == ICE_XDP_PASS) { ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz); } @@ -1165,6 +1167,17 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, if (++idx == cnt) idx = 0; } + /* handle buffers that represented frags released by XDP prog; + * for these we keep pagecnt_bias as-is; refcount from struct page + * has been decremented within XDP prog and we do not have to increase + * the biased refcnt + */ + for (; i < nr_frags; i++) { + buf = &rx_ring->rx_buf[idx]; + ice_put_rx_buf(rx_ring, buf); + if (++idx == cnt) + idx = 0; + } xdp->data = NULL; rx_ring->first_desc = ntc; @@ -1191,9 +1204,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) struct ice_tx_ring *xdp_ring = NULL; struct bpf_prog *xdp_prog = NULL; u32 ntc = rx_ring->next_to_clean; + u32 cached_ntu, xdp_verdict; u32 cnt = rx_ring->count; u32 xdp_xmit = 0; - u32 cached_ntu; bool failure; xdp_prog = READ_ONCE(rx_ring->xdp_prog); @@ -1257,7 +1270,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) xdp_prepare_buff(xdp, hard_start, offset, size, !!offset); xdp_buff_clear_frags_flag(xdp); } else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) { - ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc); + ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc, ICE_XDP_CONSUMED); break; } if (++ntc == cnt) @@ -1268,13 +1281,13 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) continue; ice_get_pgcnts(rx_ring); - ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc); - if (rx_buf->act == ICE_XDP_PASS) + xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_desc); + if (xdp_verdict == ICE_XDP_PASS) goto construct_skb; total_rx_bytes += xdp_get_buff_len(xdp); total_rx_pkts++; - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc); + ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict); continue; construct_skb: @@ -1285,12 +1298,9 @@ construct_skb: /* exit if we failed to retrieve a buffer */ if (!skb) { rx_ring->ring_stats->rx_stats.alloc_page_failed++; - rx_buf->act = ICE_XDP_CONSUMED; - if (unlikely(xdp_buff_has_frags(xdp))) - ice_set_rx_bufs_act(xdp, rx_ring, - ICE_XDP_CONSUMED); + xdp_verdict = ICE_XDP_CONSUMED; } - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc); + ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict); if (!skb) break; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index ad2d286bd359..53a155dde3e3 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -201,7 +201,6 @@ struct ice_rx_buf { struct page *page; unsigned int page_offset; unsigned int pgcnt; - unsigned int act; unsigned int pagecnt_bias; }; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h index 00971f849135..41efafc5eb38 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h @@ -5,49 +5,6 @@ #define _ICE_TXRX_LIB_H_ #include "ice.h" -/** - * ice_set_rx_bufs_act - propagate Rx buffer action to frags - * @xdp: XDP buffer representing frame (linear and frags part) - * @rx_ring: Rx ring struct - * act: action to store onto Rx buffers related to XDP buffer parts - * - * Set action that should be taken before putting Rx buffer from first frag - * to the last. - */ -static inline void -ice_set_rx_bufs_act(struct xdp_buff *xdp, const struct ice_rx_ring *rx_ring, - const unsigned int act) -{ - u32 sinfo_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags; - u32 nr_frags = rx_ring->nr_frags + 1; - u32 idx = rx_ring->first_desc; - u32 cnt = rx_ring->count; - struct ice_rx_buf *buf; - - for (int i = 0; i < nr_frags; i++) { - buf = &rx_ring->rx_buf[idx]; - buf->act = act; - - if (++idx == cnt) - idx = 0; - } - - /* adjust pagecnt_bias on frags freed by XDP prog */ - if (sinfo_frags < rx_ring->nr_frags && act == ICE_XDP_CONSUMED) { - u32 delta = rx_ring->nr_frags - sinfo_frags; - - while (delta) { - if (idx == 0) - idx = cnt - 1; - else - idx--; - buf = &rx_ring->rx_buf[idx]; - buf->pagecnt_bias--; - delta--; - } - } -} - /** * ice_test_staterr - tests bits in Rx descriptor status and error fields * @status_err_n: Rx descriptor status_error0 or status_error1 bits From dc70ea942fcd221e507497fff9a5be900c05d5e6 Mon Sep 17 00:00:00 2001 From: Michal Kubiak Date: Fri, 8 Aug 2025 17:53:10 +0200 Subject: [PATCH 32/76] ice: fix incorrect counter for buffer allocation failures [ Upstream commit b1a0c977c6f1130f7dd125ee3db8c2435d7e3d41 ] Currently, the driver increments `alloc_page_failed` when buffer allocation fails in `ice_clean_rx_irq()`. However, this counter is intended for page allocation failures, not buffer allocation issues. This patch corrects the counter by incrementing `alloc_buf_failed` instead, ensuring accurate statistics reporting for buffer allocation failures. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reported-by: Jacob Keller Suggested-by: Paul Menzel Signed-off-by: Michal Kubiak Reviewed-by: Paul Menzel Reviewed-by: Jason Xing Reviewed-by: Aleksandr Loktionov Tested-by: Priya Singh Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index e7084c6aab6d..eae4376c6859 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -1297,7 +1297,7 @@ construct_skb: skb = ice_construct_skb(rx_ring, xdp); /* exit if we failed to retrieve a buffer */ if (!skb) { - rx_ring->ring_stats->rx_stats.alloc_page_failed++; + rx_ring->ring_stats->rx_stats.alloc_buf_failed++; xdp_verdict = ICE_XDP_CONSUMED; } ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict); From 589edd3dc8fb67e3bbb9a5af930574ec518c0063 Mon Sep 17 00:00:00 2001 From: Dmitry Baryshkov Date: Sat, 9 Aug 2025 11:36:54 +0300 Subject: [PATCH 33/76] dt-bindings: display/msm: qcom,mdp5: drop lut clock [ Upstream commit 7ab3b7579a6d2660a3425b9ea93b9a140b07f49c ] None of MDP5 platforms have a LUT clock on the display-controller, it was added by the mistake. Drop it, fixing DT warnings on MSM8976 / MSM8956 platforms. Technically it's an ABI break, but no other platforms are affected. Fixes: 385c8ac763b3 ("dt-bindings: display/msm: convert MDP5 schema to YAML format") Signed-off-by: Dmitry Baryshkov Acked-by: Rob Herring (Arm) Patchwork: https://patchwork.freedesktop.org/patch/667822/ Signed-off-by: Rob Clark Signed-off-by: Sasha Levin --- Documentation/devicetree/bindings/display/msm/qcom,mdp5.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/Documentation/devicetree/bindings/display/msm/qcom,mdp5.yaml b/Documentation/devicetree/bindings/display/msm/qcom,mdp5.yaml index 91c774f106ce..ab1196d1ec2d 100644 --- a/Documentation/devicetree/bindings/display/msm/qcom,mdp5.yaml +++ b/Documentation/devicetree/bindings/display/msm/qcom,mdp5.yaml @@ -59,7 +59,6 @@ properties: - const: bus - const: core - const: vsync - - const: lut - const: tbu - const: tbu_rt # MSM8996 has additional iommu clock From b3c70f6fc25804a50f05a860d72cc5a0583eb6db Mon Sep 17 00:00:00 2001 From: Yeounsu Moon Date: Sun, 24 Aug 2025 03:29:24 +0900 Subject: [PATCH 34/76] net: dlink: fix multicast stats being counted incorrectly [ Upstream commit 007a5ffadc4fd51739527f1503b7cf048f31c413 ] `McstFramesRcvdOk` counts the number of received multicast packets, and it reports the value correctly. However, reading `McstFramesRcvdOk` clears the register to zero. As a result, the driver was reporting only the packets since the last read, instead of the accumulated total. Fix this by updating the multicast statistics accumulatively instaed of instantaneously. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-on: D-Link DGE-550T Rev-A3 Signed-off-by: Yeounsu Moon Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20250823182927.6063-3-yyyynoom@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/dlink/dl2k.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/dlink/dl2k.c b/drivers/net/ethernet/dlink/dl2k.c index fad5a72d3b16..f1208591ed67 100644 --- a/drivers/net/ethernet/dlink/dl2k.c +++ b/drivers/net/ethernet/dlink/dl2k.c @@ -1092,7 +1092,7 @@ get_stats (struct net_device *dev) dev->stats.rx_bytes += dr32(OctetRcvOk); dev->stats.tx_bytes += dr32(OctetXmtOk); - dev->stats.multicast = dr32(McstFramesRcvdOk); + dev->stats.multicast += dr32(McstFramesRcvdOk); dev->stats.collisions += dr32(SingleColFrames) + dr32(MultiColFrames); From 7acefa4c66aab836eb8f6a0bcbc94619a695232d Mon Sep 17 00:00:00 2001 From: Horatiu Vultur Date: Mon, 25 Aug 2025 08:55:43 +0200 Subject: [PATCH 35/76] phy: mscc: Fix when PTP clock is register and unregister [ Upstream commit 882e57cbc7204662f6c5672d5b04336c1d790b03 ] It looks like that every time when the interface was set down and up the driver was creating a new ptp clock. On top of this the function ptp_clock_unregister was never called. Therefore fix this by calling ptp_clock_register and initialize the mii_ts struct inside the probe function and call ptp_clock_unregister when driver is removed. Fixes: 7d272e63e0979d ("net: phy: mscc: timestamping and PHC support") Signed-off-by: Horatiu Vultur Reviewed-by: Vadim Fedorenko Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/20250825065543.2916334-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/phy/mscc/mscc.h | 4 ++++ drivers/net/phy/mscc/mscc_main.c | 4 +--- drivers/net/phy/mscc/mscc_ptp.c | 34 ++++++++++++++++++++------------ 3 files changed, 26 insertions(+), 16 deletions(-) diff --git a/drivers/net/phy/mscc/mscc.h b/drivers/net/phy/mscc/mscc.h index cdb343779a8f..4ba6e32cf6d8 100644 --- a/drivers/net/phy/mscc/mscc.h +++ b/drivers/net/phy/mscc/mscc.h @@ -476,6 +476,7 @@ static inline void vsc8584_config_macsec_intr(struct phy_device *phydev) void vsc85xx_link_change_notify(struct phy_device *phydev); void vsc8584_config_ts_intr(struct phy_device *phydev); int vsc8584_ptp_init(struct phy_device *phydev); +void vsc8584_ptp_deinit(struct phy_device *phydev); int vsc8584_ptp_probe_once(struct phy_device *phydev); int vsc8584_ptp_probe(struct phy_device *phydev); irqreturn_t vsc8584_handle_ts_interrupt(struct phy_device *phydev); @@ -490,6 +491,9 @@ static inline int vsc8584_ptp_init(struct phy_device *phydev) { return 0; } +static inline void vsc8584_ptp_deinit(struct phy_device *phydev) +{ +} static inline int vsc8584_ptp_probe_once(struct phy_device *phydev) { return 0; diff --git a/drivers/net/phy/mscc/mscc_main.c b/drivers/net/phy/mscc/mscc_main.c index 3de72d9cc22b..3a932b30f435 100644 --- a/drivers/net/phy/mscc/mscc_main.c +++ b/drivers/net/phy/mscc/mscc_main.c @@ -2337,9 +2337,7 @@ static int vsc85xx_probe(struct phy_device *phydev) static void vsc85xx_remove(struct phy_device *phydev) { - struct vsc8531_private *priv = phydev->priv; - - skb_queue_purge(&priv->rx_skbs_list); + vsc8584_ptp_deinit(phydev); } /* Microsemi VSC85xx PHYs */ diff --git a/drivers/net/phy/mscc/mscc_ptp.c b/drivers/net/phy/mscc/mscc_ptp.c index add1a9ee721a..1f6237705b44 100644 --- a/drivers/net/phy/mscc/mscc_ptp.c +++ b/drivers/net/phy/mscc/mscc_ptp.c @@ -1297,7 +1297,6 @@ static void vsc8584_set_input_clk_configured(struct phy_device *phydev) static int __vsc8584_init_ptp(struct phy_device *phydev) { - struct vsc8531_private *vsc8531 = phydev->priv; static const u32 ltc_seq_e[] = { 0, 400000, 0, 0, 0 }; static const u8 ltc_seq_a[] = { 8, 6, 5, 4, 2 }; u32 val; @@ -1514,17 +1513,7 @@ static int __vsc8584_init_ptp(struct phy_device *phydev) vsc85xx_ts_eth_cmp1_sig(phydev); - vsc8531->mii_ts.rxtstamp = vsc85xx_rxtstamp; - vsc8531->mii_ts.txtstamp = vsc85xx_txtstamp; - vsc8531->mii_ts.hwtstamp = vsc85xx_hwtstamp; - vsc8531->mii_ts.ts_info = vsc85xx_ts_info; - phydev->mii_ts = &vsc8531->mii_ts; - - memcpy(&vsc8531->ptp->caps, &vsc85xx_clk_caps, sizeof(vsc85xx_clk_caps)); - - vsc8531->ptp->ptp_clock = ptp_clock_register(&vsc8531->ptp->caps, - &phydev->mdio.dev); - return PTR_ERR_OR_ZERO(vsc8531->ptp->ptp_clock); + return 0; } void vsc8584_config_ts_intr(struct phy_device *phydev) @@ -1551,6 +1540,16 @@ int vsc8584_ptp_init(struct phy_device *phydev) return 0; } +void vsc8584_ptp_deinit(struct phy_device *phydev) +{ + struct vsc8531_private *vsc8531 = phydev->priv; + + if (vsc8531->ptp->ptp_clock) { + ptp_clock_unregister(vsc8531->ptp->ptp_clock); + skb_queue_purge(&vsc8531->rx_skbs_list); + } +} + irqreturn_t vsc8584_handle_ts_interrupt(struct phy_device *phydev) { struct vsc8531_private *priv = phydev->priv; @@ -1608,7 +1607,16 @@ int vsc8584_ptp_probe(struct phy_device *phydev) vsc8531->ptp->phydev = phydev; - return 0; + vsc8531->mii_ts.rxtstamp = vsc85xx_rxtstamp; + vsc8531->mii_ts.txtstamp = vsc85xx_txtstamp; + vsc8531->mii_ts.hwtstamp = vsc85xx_hwtstamp; + vsc8531->mii_ts.ts_info = vsc85xx_ts_info; + phydev->mii_ts = &vsc8531->mii_ts; + + memcpy(&vsc8531->ptp->caps, &vsc85xx_clk_caps, sizeof(vsc85xx_clk_caps)); + vsc8531->ptp->ptp_clock = ptp_clock_register(&vsc8531->ptp->caps, + &phydev->mdio.dev); + return PTR_ERR_OR_ZERO(vsc8531->ptp->ptp_clock); } int vsc8584_ptp_probe_once(struct phy_device *phydev) From 09fd27c8621e33248f6453e05e451167ba904010 Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Mon, 25 Aug 2025 17:34:28 +0300 Subject: [PATCH 36/76] net/mlx5: Reload auxiliary drivers on fw_activate [ Upstream commit 34cc6a54914f478c93e176450fae6313404f9f74 ] The devlink reload fw_activate command performs firmware activation followed by driver reload, while devlink reload driver_reinit triggers only driver reload. However, the driver reload logic differs between the two modes, as on driver_reinit mode mlx5 also reloads auxiliary drivers, while in fw_activate mode the auxiliary drivers are suspended where applicable. Additionally, following the cited commit, if the device has multiple PFs, the behavior during fw_activate may vary between PFs: one PF may suspend auxiliary drivers, while another reloads them. Align devlink dev reload fw_activate behavior with devlink dev reload driver_reinit, to reload all auxiliary drivers. Fixes: 72ed5d5624af ("net/mlx5: Suspend auxiliary devices only in case of PCI device suspend") Signed-off-by: Moshe Shemesh Reviewed-by: Tariq Toukan Reviewed-by: Akiva Goldberger Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250825143435.598584-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index f66788a2ed77..f2d1f7cad7e7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -107,7 +107,7 @@ static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netli if (err) return err; - mlx5_unload_one_devl_locked(dev, true); + mlx5_unload_one_devl_locked(dev, false); err = mlx5_health_wait_pci_up(dev); if (err) NL_SET_ERR_MSG_MOD(extack, "FW activate aborted, PCI reads fail after reset"); From 6292688e07d066893e85dac7c28db3c78a1c2358 Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Wed, 11 Sep 2024 13:17:51 -0700 Subject: [PATCH 37/76] net/mlx5: Add device cap for supporting hot reset in sync reset flow [ Upstream commit 9947204cdad97d22d171039019a4aad4d6899cdd ] New devices with new FW can support sync reset for firmware activate using hot reset. Add capability for supporting it and add MFRL field to query from FW which type of PCI reset method to use while handling sync reset events. Signed-off-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Reviewed-by: Jacob Keller Link: https://patch.msgid.link/20240911201757.1505453-10-saeed@kernel.org Signed-off-by: Jakub Kicinski Stable-dep-of: 902a8bc23a24 ("net/mlx5: Fix lockdep assertion on sync reset unload event") Signed-off-by: Sasha Levin --- include/linux/mlx5/mlx5_ifc.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 9106771bb92f..4913d364e977 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1731,7 +1731,8 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 reserved_at_328[0x2]; u8 relaxed_ordering_read[0x1]; u8 log_max_pd[0x5]; - u8 reserved_at_330[0x6]; + u8 reserved_at_330[0x5]; + u8 pcie_reset_using_hotreset_method[0x1]; u8 pci_sync_for_fw_update_with_driver_unload[0x1]; u8 vnic_env_cnt_steering_fail[0x1]; u8 vport_counter_local_loopback[0x1]; @@ -10824,6 +10825,11 @@ struct mlx5_ifc_mcda_reg_bits { u8 data[][0x20]; }; +enum { + MLX5_MFRL_REG_PCI_RESET_METHOD_LINK_TOGGLE = 0, + MLX5_MFRL_REG_PCI_RESET_METHOD_HOT_RESET = 1, +}; + enum { MLX5_MFRL_REG_RESET_STATE_IDLE = 0, MLX5_MFRL_REG_RESET_STATE_IN_NEGOTIATION = 1, @@ -10851,7 +10857,8 @@ struct mlx5_ifc_mfrl_reg_bits { u8 pci_sync_for_fw_update_start[0x1]; u8 pci_sync_for_fw_update_resp[0x2]; u8 rst_type_sel[0x3]; - u8 reserved_at_28[0x4]; + u8 pci_reset_req_method[0x3]; + u8 reserved_at_2b[0x1]; u8 reset_state[0x4]; u8 reset_type[0x8]; u8 reset_level[0x8]; From 8da591ae26145e374a9d2148496eed810a5d4f7b Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Wed, 11 Sep 2024 13:17:52 -0700 Subject: [PATCH 38/76] net/mlx5: Add support for sync reset using hot reset [ Upstream commit 57502f62678ced0149d415324931bde37b42885a ] On device that supports sync reset for firmware activate using hot reset, the driver queries the required reset method while handling the sync reset request. If the required reset method is hot reset, the driver will use pci_reset_bus() to reset the PCI link instead of the link toggle. Signed-off-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Reviewed-by: Jacob Keller Link: https://patch.msgid.link/20240911201757.1505453-11-saeed@kernel.org Signed-off-by: Jakub Kicinski Stable-dep-of: 902a8bc23a24 ("net/mlx5: Fix lockdep assertion on sync reset unload event") Signed-off-by: Sasha Levin --- .../ethernet/mellanox/mlx5/core/fw_reset.c | 82 +++++++++++++++---- .../net/ethernet/mellanox/mlx5/core/main.c | 3 + 2 files changed, 69 insertions(+), 16 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index 6b17346aa4ce..a1c2dd7f5c2c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -26,6 +26,7 @@ struct mlx5_fw_reset { struct work_struct reset_now_work; struct work_struct reset_abort_work; unsigned long reset_flags; + u8 reset_method; struct timer_list timer; struct completion done; int ret; @@ -94,7 +95,7 @@ static int mlx5_reg_mfrl_set(struct mlx5_core_dev *dev, u8 reset_level, } static int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, - u8 *reset_type, u8 *reset_state) + u8 *reset_type, u8 *reset_state, u8 *reset_method) { u32 out[MLX5_ST_SZ_DW(mfrl_reg)] = {}; u32 in[MLX5_ST_SZ_DW(mfrl_reg)] = {}; @@ -110,13 +111,26 @@ static int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, *reset_type = MLX5_GET(mfrl_reg, out, reset_type); if (reset_state) *reset_state = MLX5_GET(mfrl_reg, out, reset_state); + if (reset_method) + *reset_method = MLX5_GET(mfrl_reg, out, pci_reset_req_method); return 0; } int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_type) { - return mlx5_reg_mfrl_query(dev, reset_level, reset_type, NULL); + return mlx5_reg_mfrl_query(dev, reset_level, reset_type, NULL, NULL); +} + +static int mlx5_fw_reset_get_reset_method(struct mlx5_core_dev *dev, + u8 *reset_method) +{ + if (!MLX5_CAP_GEN(dev, pcie_reset_using_hotreset_method)) { + *reset_method = MLX5_MFRL_REG_PCI_RESET_METHOD_LINK_TOGGLE; + return 0; + } + + return mlx5_reg_mfrl_query(dev, NULL, NULL, NULL, reset_method); } static int mlx5_fw_reset_get_reset_state_err(struct mlx5_core_dev *dev, @@ -124,7 +138,7 @@ static int mlx5_fw_reset_get_reset_state_err(struct mlx5_core_dev *dev, { u8 reset_state; - if (mlx5_reg_mfrl_query(dev, NULL, NULL, &reset_state)) + if (mlx5_reg_mfrl_query(dev, NULL, NULL, &reset_state, NULL)) goto out; if (!reset_state) @@ -402,7 +416,11 @@ static void mlx5_sync_reset_request_event(struct work_struct *work) struct mlx5_core_dev *dev = fw_reset->dev; int err; - if (test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags) || + err = mlx5_fw_reset_get_reset_method(dev, &fw_reset->reset_method); + if (err) + mlx5_core_warn(dev, "Failed reading MFRL, err %d\n", err); + + if (err || test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags) || !mlx5_is_reset_now_capable(dev)) { err = mlx5_fw_reset_set_reset_sync_nack(dev); mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s", @@ -419,21 +437,15 @@ static void mlx5_sync_reset_request_event(struct work_struct *work) mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n"); } -static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev) +static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev, u16 dev_id) { struct pci_bus *bridge_bus = dev->pdev->bus; struct pci_dev *bridge = bridge_bus->self; unsigned long timeout; struct pci_dev *sdev; - u16 reg16, dev_id; int cap, err; + u16 reg16; - err = pci_read_config_word(dev->pdev, PCI_DEVICE_ID, &dev_id); - if (err) - return pcibios_err_to_errno(err); - err = mlx5_check_dev_ids(dev, dev_id); - if (err) - return err; cap = pci_find_capability(bridge, PCI_CAP_ID_EXP); if (!cap) return -EOPNOTSUPP; @@ -503,6 +515,44 @@ restore: return err; } +static int mlx5_pci_reset_bus(struct mlx5_core_dev *dev) +{ + if (!MLX5_CAP_GEN(dev, pcie_reset_using_hotreset_method)) + return -EOPNOTSUPP; + + return pci_reset_bus(dev->pdev); +} + +static int mlx5_sync_pci_reset(struct mlx5_core_dev *dev, u8 reset_method) +{ + u16 dev_id; + int err; + + err = pci_read_config_word(dev->pdev, PCI_DEVICE_ID, &dev_id); + if (err) + return pcibios_err_to_errno(err); + err = mlx5_check_dev_ids(dev, dev_id); + if (err) + return err; + + switch (reset_method) { + case MLX5_MFRL_REG_PCI_RESET_METHOD_LINK_TOGGLE: + err = mlx5_pci_link_toggle(dev, dev_id); + if (err) + mlx5_core_warn(dev, "mlx5_pci_link_toggle failed\n"); + break; + case MLX5_MFRL_REG_PCI_RESET_METHOD_HOT_RESET: + err = mlx5_pci_reset_bus(dev); + if (err) + mlx5_core_warn(dev, "mlx5_pci_reset_bus failed\n"); + break; + default: + return -EOPNOTSUPP; + } + + return err; +} + static void mlx5_sync_reset_now_event(struct work_struct *work) { struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset, @@ -521,9 +571,9 @@ static void mlx5_sync_reset_now_event(struct work_struct *work) goto done; } - err = mlx5_pci_link_toggle(dev); + err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); if (err) { - mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, no reset done, err %d\n", err); + mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, no reset done, err %d\n", err); set_bit(MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, &fw_reset->reset_flags); } @@ -585,9 +635,9 @@ static void mlx5_sync_reset_unload_event(struct work_struct *work) mlx5_core_warn(dev, "Sync Reset, got reset action. rst_state = %u\n", rst_state); if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ) { - err = mlx5_pci_link_toggle(dev); + err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); if (err) { - mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, err %d\n", err); + mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, err %d\n", err); fw_reset->ret = err; } } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 62a85f09b52f..8a11e410f7c1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -627,6 +627,9 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx) if (MLX5_CAP_GEN_MAX(dev, pci_sync_for_fw_update_with_driver_unload)) MLX5_SET(cmd_hca_cap, set_hca_cap, pci_sync_for_fw_update_with_driver_unload, 1); + if (MLX5_CAP_GEN_MAX(dev, pcie_reset_using_hotreset_method)) + MLX5_SET(cmd_hca_cap, set_hca_cap, + pcie_reset_using_hotreset_method, 1); if (MLX5_CAP_GEN_MAX(dev, num_vhca_ports)) MLX5_SET(cmd_hca_cap, From ddac9d0fe2493dd550cbfc75eeaf31e9b6dac959 Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Mon, 25 Aug 2025 17:34:29 +0300 Subject: [PATCH 39/76] net/mlx5: Fix lockdep assertion on sync reset unload event [ Upstream commit 902a8bc23a24882200f57cadc270e15a2cfaf2bb ] Fix lockdep assertion triggered during sync reset unload event. When the sync reset flow is initiated using the devlink reload fw_activate option, the PF already holds the devlink lock while handling unload event. In this case, delegate sync reset unload event handling back to the devlink callback process to avoid double-locking and resolve the lockdep warning. Kernel log: WARNING: CPU: 9 PID: 1578 at devl_assert_locked+0x31/0x40 [...] Call Trace: mlx5_unload_one_devl_locked+0x2c/0xc0 [mlx5_core] mlx5_sync_reset_unload_event+0xaf/0x2f0 [mlx5_core] process_one_work+0x222/0x640 worker_thread+0x199/0x350 kthread+0x10b/0x230 ? __pfx_worker_thread+0x10/0x10 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x8e/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Fixes: 7a9770f1bfea ("net/mlx5: Handle sync reset unload event") Signed-off-by: Moshe Shemesh Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250825143435.598584-7-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- .../net/ethernet/mellanox/mlx5/core/devlink.c | 2 +- .../ethernet/mellanox/mlx5/core/fw_reset.c | 108 ++++++++++-------- .../ethernet/mellanox/mlx5/core/fw_reset.h | 1 + 3 files changed, 63 insertions(+), 48 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index f2d1f7cad7e7..8489b5087d9c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -107,7 +107,7 @@ static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netli if (err) return err; - mlx5_unload_one_devl_locked(dev, false); + mlx5_sync_reset_unload_flow(dev, true); err = mlx5_health_wait_pci_up(dev); if (err) NL_SET_ERR_MSG_MOD(extack, "FW activate aborted, PCI reads fail after reset"); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index a1c2dd7f5c2c..b9986a265608 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -12,7 +12,8 @@ enum { MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, MLX5_FW_RESET_FLAGS_PENDING_COMP, MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, - MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED + MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, + MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, }; struct mlx5_fw_reset { @@ -217,7 +218,7 @@ int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev) return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL0, 0, 0, false); } -static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unloaded) +static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev) { struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset; struct devlink *devlink = priv_to_devlink(dev); @@ -226,8 +227,7 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unload if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) { complete(&fw_reset->done); } else { - if (!unloaded) - mlx5_unload_one(dev, false); + mlx5_sync_reset_unload_flow(dev, false); if (mlx5_health_wait_pci_up(dev)) mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n"); else @@ -270,7 +270,7 @@ static void mlx5_sync_reset_reload_work(struct work_struct *work) mlx5_sync_reset_clear_reset_requested(dev, false); mlx5_enter_error_state(dev, true); - mlx5_fw_reset_complete_reload(dev, false); + mlx5_fw_reset_complete_reload(dev); } #define MLX5_RESET_POLL_INTERVAL (HZ / 10) @@ -553,6 +553,59 @@ static int mlx5_sync_pci_reset(struct mlx5_core_dev *dev, u8 reset_method) return err; } +void mlx5_sync_reset_unload_flow(struct mlx5_core_dev *dev, bool locked) +{ + struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset; + unsigned long timeout; + bool reset_action; + u8 rst_state; + int err; + + if (locked) + mlx5_unload_one_devl_locked(dev, false); + else + mlx5_unload_one(dev, false); + + if (!test_bit(MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, &fw_reset->reset_flags)) + return; + + mlx5_set_fw_rst_ack(dev); + mlx5_core_warn(dev, "Sync Reset Unload done, device reset expected\n"); + + reset_action = false; + timeout = jiffies + msecs_to_jiffies(mlx5_tout_ms(dev, RESET_UNLOAD)); + do { + rst_state = mlx5_get_fw_rst_state(dev); + if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ || + rst_state == MLX5_FW_RST_STATE_IDLE) { + reset_action = true; + break; + } + msleep(20); + } while (!time_after(jiffies, timeout)); + + if (!reset_action) { + mlx5_core_err(dev, "Got timeout waiting for sync reset action, state = %u\n", + rst_state); + fw_reset->ret = -ETIMEDOUT; + goto done; + } + + mlx5_core_warn(dev, "Sync Reset, got reset action. rst_state = %u\n", + rst_state); + if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ) { + err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); + if (err) { + mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, err %d\n", + err); + fw_reset->ret = err; + } + } + +done: + clear_bit(MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, &fw_reset->reset_flags); +} + static void mlx5_sync_reset_now_event(struct work_struct *work) { struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset, @@ -580,16 +633,13 @@ static void mlx5_sync_reset_now_event(struct work_struct *work) mlx5_enter_error_state(dev, true); done: fw_reset->ret = err; - mlx5_fw_reset_complete_reload(dev, false); + mlx5_fw_reset_complete_reload(dev); } static void mlx5_sync_reset_unload_event(struct work_struct *work) { struct mlx5_fw_reset *fw_reset; struct mlx5_core_dev *dev; - unsigned long timeout; - bool reset_action; - u8 rst_state; int err; fw_reset = container_of(work, struct mlx5_fw_reset, reset_unload_work); @@ -598,6 +648,7 @@ static void mlx5_sync_reset_unload_event(struct work_struct *work) if (mlx5_sync_reset_clear_reset_requested(dev, false)) return; + set_bit(MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, &fw_reset->reset_flags); mlx5_core_warn(dev, "Sync Reset Unload. Function is forced down.\n"); err = mlx5_cmd_fast_teardown_hca(dev); @@ -606,44 +657,7 @@ static void mlx5_sync_reset_unload_event(struct work_struct *work) else mlx5_enter_error_state(dev, true); - if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) - mlx5_unload_one_devl_locked(dev, false); - else - mlx5_unload_one(dev, false); - - mlx5_set_fw_rst_ack(dev); - mlx5_core_warn(dev, "Sync Reset Unload done, device reset expected\n"); - - reset_action = false; - timeout = jiffies + msecs_to_jiffies(mlx5_tout_ms(dev, RESET_UNLOAD)); - do { - rst_state = mlx5_get_fw_rst_state(dev); - if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ || - rst_state == MLX5_FW_RST_STATE_IDLE) { - reset_action = true; - break; - } - msleep(20); - } while (!time_after(jiffies, timeout)); - - if (!reset_action) { - mlx5_core_err(dev, "Got timeout waiting for sync reset action, state = %u\n", - rst_state); - fw_reset->ret = -ETIMEDOUT; - goto done; - } - - mlx5_core_warn(dev, "Sync Reset, got reset action. rst_state = %u\n", rst_state); - if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ) { - err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); - if (err) { - mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, err %d\n", err); - fw_reset->ret = err; - } - } - -done: - mlx5_fw_reset_complete_reload(dev, true); + mlx5_fw_reset_complete_reload(dev); } static void mlx5_sync_reset_abort_event(struct work_struct *work) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h index ea527d06a85f..d5b28525c960 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h @@ -12,6 +12,7 @@ int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel, int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev); int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev); +void mlx5_sync_reset_unload_flow(struct mlx5_core_dev *dev, bool locked); int mlx5_fw_reset_verify_fw_complete(struct mlx5_core_dev *dev, struct netlink_ext_ack *extack); void mlx5_fw_reset_events_start(struct mlx5_core_dev *dev); From 25659835c7afb8c7f21656e285dea1e44a0a6a53 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Fri, 2 Jun 2023 13:43:42 +0200 Subject: [PATCH 40/76] net/mlx5: Call mlx5_sf_id_erase() once in mlx5_sf_dealloc() [ Upstream commit 2597ee190b4eb48d3b7d35b7bb2cc18046ae087e ] Before every call of mlx5_sf_dealloc(), there is a call to mlx5_sf_id_erase(). So move it to the beginning of mlx5_sf_dealloc(). Also remove redundant mlx5_sf_id_erase() call from mlx5_sf_free() as it is called only from mlx5_sf_dealloc(). Signed-off-by: Jiri Pirko Reviewed-by: Shay Drory Signed-off-by: Saeed Mahameed Stable-dep-of: 26e42ec7712d ("net/mlx5: Nack sync reset when SFs are present") Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c index e34a8f88c518..1dd01701df20 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -111,7 +111,6 @@ id_err: static void mlx5_sf_free(struct mlx5_sf_table *table, struct mlx5_sf *sf) { - mlx5_sf_id_erase(table, sf); mlx5_sf_hw_table_sf_free(table->dev, sf->controller, sf->id); trace_mlx5_sf_free(table->dev, sf->port_index, sf->controller, sf->hw_fn_id); kfree(sf); @@ -361,6 +360,8 @@ int mlx5_devlink_sf_port_new(struct devlink *devlink, static void mlx5_sf_dealloc(struct mlx5_sf_table *table, struct mlx5_sf *sf) { + mlx5_sf_id_erase(table, sf); + if (sf->hw_state == MLX5_VHCA_STATE_ALLOCATED) { mlx5_sf_free(table, sf); } else if (mlx5_sf_is_active(sf)) { @@ -401,7 +402,6 @@ int mlx5_devlink_sf_port_del(struct devlink *devlink, } mlx5_eswitch_unload_sf_vport(esw, sf->hw_fn_id); - mlx5_sf_id_erase(table, sf); mutex_lock(&table->sf_state_lock); mlx5_sf_dealloc(table, sf); @@ -473,7 +473,6 @@ static void mlx5_sf_deactivate_all(struct mlx5_sf_table *table) */ xa_for_each(&table->port_indices, index, sf) { mlx5_eswitch_unload_sf_vport(esw, sf->hw_fn_id); - mlx5_sf_id_erase(table, sf); mlx5_sf_dealloc(table, sf); } } From a7e9da4d3afb304dec9ab73880f4d55babebe215 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Wed, 17 May 2023 13:59:34 +0200 Subject: [PATCH 41/76] net/mlx5: Use devlink port pointer to get the pointer of container SF struct [ Upstream commit 9caeb1475c3e852bcfa6332227c6bb2feaa8eb23 ] Benefit from the fact that struct devlink_port is eventually embedded in struct mlx5_sf and use container_of() macro to get it instead of the xarray lookup in every devlink port op. Signed-off-by: Jiri Pirko Reviewed-by: Shay Drory Signed-off-by: Saeed Mahameed Stable-dep-of: 26e42ec7712d ("net/mlx5: Nack sync reset when SFs are present") Signed-off-by: Sasha Levin --- .../ethernet/mellanox/mlx5/core/sf/devlink.c | 44 +++++-------------- 1 file changed, 12 insertions(+), 32 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c index 1dd01701df20..4711dcfe8ba8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -20,6 +20,13 @@ struct mlx5_sf { u16 hw_state; }; +static void *mlx5_sf_by_dl_port(struct devlink_port *dl_port) +{ + struct mlx5_devlink_port *mlx5_dl_port = mlx5_devlink_port_get(dl_port); + + return container_of(mlx5_dl_port, struct mlx5_sf, dl_port); +} + struct mlx5_sf_table { struct mlx5_core_dev *dev; /* To refer from notifier context. */ struct xarray port_indices; /* port index based lookup. */ @@ -30,12 +37,6 @@ struct mlx5_sf_table { struct notifier_block vhca_nb; }; -static struct mlx5_sf * -mlx5_sf_lookup_by_index(struct mlx5_sf_table *table, unsigned int port_index) -{ - return xa_load(&table->port_indices, port_index); -} - static struct mlx5_sf * mlx5_sf_lookup_by_function_id(struct mlx5_sf_table *table, unsigned int fn_id) { @@ -171,26 +172,19 @@ int mlx5_devlink_sf_port_fn_state_get(struct devlink_port *dl_port, struct netlink_ext_ack *extack) { struct mlx5_core_dev *dev = devlink_priv(dl_port->devlink); + struct mlx5_sf *sf = mlx5_sf_by_dl_port(dl_port); struct mlx5_sf_table *table; - struct mlx5_sf *sf; - int err = 0; table = mlx5_sf_table_try_get(dev); if (!table) return -EOPNOTSUPP; - sf = mlx5_sf_lookup_by_index(table, dl_port->index); - if (!sf) { - err = -EOPNOTSUPP; - goto sf_err; - } mutex_lock(&table->sf_state_lock); *state = mlx5_sf_to_devlink_state(sf->hw_state); *opstate = mlx5_sf_to_devlink_opstate(sf->hw_state); mutex_unlock(&table->sf_state_lock); -sf_err: mlx5_sf_table_put(table); - return err; + return 0; } static int mlx5_sf_activate(struct mlx5_core_dev *dev, struct mlx5_sf *sf, @@ -256,8 +250,8 @@ int mlx5_devlink_sf_port_fn_state_set(struct devlink_port *dl_port, struct netlink_ext_ack *extack) { struct mlx5_core_dev *dev = devlink_priv(dl_port->devlink); + struct mlx5_sf *sf = mlx5_sf_by_dl_port(dl_port); struct mlx5_sf_table *table; - struct mlx5_sf *sf; int err; table = mlx5_sf_table_try_get(dev); @@ -266,14 +260,7 @@ int mlx5_devlink_sf_port_fn_state_set(struct devlink_port *dl_port, "Port state set is only supported in eswitch switchdev mode or SF ports are disabled."); return -EOPNOTSUPP; } - sf = mlx5_sf_lookup_by_index(table, dl_port->index); - if (!sf) { - err = -ENODEV; - goto out; - } - err = mlx5_sf_state_set(dev, table, sf, state, extack); -out: mlx5_sf_table_put(table); return err; } @@ -384,10 +371,9 @@ int mlx5_devlink_sf_port_del(struct devlink *devlink, struct netlink_ext_ack *extack) { struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf *sf = mlx5_sf_by_dl_port(dl_port); struct mlx5_eswitch *esw = dev->priv.eswitch; struct mlx5_sf_table *table; - struct mlx5_sf *sf; - int err = 0; table = mlx5_sf_table_try_get(dev); if (!table) { @@ -395,20 +381,14 @@ int mlx5_devlink_sf_port_del(struct devlink *devlink, "Port del is only supported in eswitch switchdev mode or SF ports are disabled."); return -EOPNOTSUPP; } - sf = mlx5_sf_lookup_by_index(table, dl_port->index); - if (!sf) { - err = -ENODEV; - goto sf_err; - } mlx5_eswitch_unload_sf_vport(esw, sf->hw_fn_id); mutex_lock(&table->sf_state_lock); mlx5_sf_dealloc(table, sf); mutex_unlock(&table->sf_state_lock); -sf_err: mlx5_sf_table_put(table); - return err; + return 0; } static bool mlx5_sf_state_update_check(const struct mlx5_sf *sf, u8 new_state) From a623e80aaa852789de597accb6d8c48af18c1564 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Thu, 18 May 2023 11:33:16 +0200 Subject: [PATCH 42/76] net/mlx5: Convert SF port_indices xarray to function_ids xarray [ Upstream commit 2284a4836251b3dee348172f69ac84157aa7b03e ] No need to lookup for sf by a port index. Convert the xarray to have function id as an index and optimize the remaining function id based lookup. Signed-off-by: Jiri Pirko Reviewed-by: Shay Drory Signed-off-by: Saeed Mahameed Stable-dep-of: 26e42ec7712d ("net/mlx5: Nack sync reset when SFs are present") Signed-off-by: Sasha Levin --- .../ethernet/mellanox/mlx5/core/sf/devlink.c | 29 +++++++------------ 1 file changed, 11 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c index 4711dcfe8ba8..3f0ac2d1dde6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -29,7 +29,7 @@ static void *mlx5_sf_by_dl_port(struct devlink_port *dl_port) struct mlx5_sf_table { struct mlx5_core_dev *dev; /* To refer from notifier context. */ - struct xarray port_indices; /* port index based lookup. */ + struct xarray function_ids; /* function id based lookup. */ refcount_t refcount; struct completion disable_complete; struct mutex sf_state_lock; /* Serializes sf state among user cmds & vhca event handler. */ @@ -40,24 +40,17 @@ struct mlx5_sf_table { static struct mlx5_sf * mlx5_sf_lookup_by_function_id(struct mlx5_sf_table *table, unsigned int fn_id) { - unsigned long index; - struct mlx5_sf *sf; - - xa_for_each(&table->port_indices, index, sf) { - if (sf->hw_fn_id == fn_id) - return sf; - } - return NULL; + return xa_load(&table->function_ids, fn_id); } -static int mlx5_sf_id_insert(struct mlx5_sf_table *table, struct mlx5_sf *sf) +static int mlx5_sf_function_id_insert(struct mlx5_sf_table *table, struct mlx5_sf *sf) { - return xa_insert(&table->port_indices, sf->port_index, sf, GFP_KERNEL); + return xa_insert(&table->function_ids, sf->hw_fn_id, sf, GFP_KERNEL); } -static void mlx5_sf_id_erase(struct mlx5_sf_table *table, struct mlx5_sf *sf) +static void mlx5_sf_function_id_erase(struct mlx5_sf_table *table, struct mlx5_sf *sf) { - xa_erase(&table->port_indices, sf->port_index); + xa_erase(&table->function_ids, sf->hw_fn_id); } static struct mlx5_sf * @@ -94,7 +87,7 @@ mlx5_sf_alloc(struct mlx5_sf_table *table, struct mlx5_eswitch *esw, sf->hw_state = MLX5_VHCA_STATE_ALLOCATED; sf->controller = controller; - err = mlx5_sf_id_insert(table, sf); + err = mlx5_sf_function_id_insert(table, sf); if (err) goto insert_err; @@ -347,7 +340,7 @@ int mlx5_devlink_sf_port_new(struct devlink *devlink, static void mlx5_sf_dealloc(struct mlx5_sf_table *table, struct mlx5_sf *sf) { - mlx5_sf_id_erase(table, sf); + mlx5_sf_function_id_erase(table, sf); if (sf->hw_state == MLX5_VHCA_STATE_ALLOCATED) { mlx5_sf_free(table, sf); @@ -451,7 +444,7 @@ static void mlx5_sf_deactivate_all(struct mlx5_sf_table *table) /* At this point, no new user commands can start and no vhca event can * arrive. It is safe to destroy all user created SFs. */ - xa_for_each(&table->port_indices, index, sf) { + xa_for_each(&table->function_ids, index, sf) { mlx5_eswitch_unload_sf_vport(esw, sf->hw_fn_id); mlx5_sf_dealloc(table, sf); } @@ -510,7 +503,7 @@ int mlx5_sf_table_init(struct mlx5_core_dev *dev) mutex_init(&table->sf_state_lock); table->dev = dev; - xa_init(&table->port_indices); + xa_init(&table->function_ids); dev->priv.sf_table = table; refcount_set(&table->refcount, 0); table->esw_nb.notifier_call = mlx5_sf_esw_event; @@ -545,6 +538,6 @@ void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) mlx5_esw_event_notifier_unregister(dev->priv.eswitch, &table->esw_nb); WARN_ON(refcount_read(&table->refcount)); mutex_destroy(&table->sf_state_lock); - WARN_ON(!xa_empty(&table->port_indices)); + WARN_ON(!xa_empty(&table->function_ids)); kfree(table); } From 3e07c623fbc5a3800bfb956a31a16adaee3c13c7 Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Mon, 25 Aug 2025 17:34:30 +0300 Subject: [PATCH 43/76] net/mlx5: Nack sync reset when SFs are present [ Upstream commit 26e42ec7712d392d561964514b1f253b1a96f42d ] If PF (Physical Function) has SFs (Sub-Functions), since the SFs are not taking part in the synchronization flow, sync reset can lead to fatal error on the SFs, as the function will be closed unexpectedly from the SF point of view. Add a check to prevent sync reset when there are SFs on a PF device which is not ECPF, as ECPF is teardowned gracefully before reset. Fixes: 92501fa6e421 ("net/mlx5: Ack on sync_reset_request only if PF can do reset_now") Signed-off-by: Moshe Shemesh Reviewed-by: Parav Pandit Reviewed-by: Tariq Toukan Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250825143435.598584-8-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 6 ++++++ drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c | 10 ++++++++++ drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h | 6 ++++++ 3 files changed, 22 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index b9986a265608..1547704c8976 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -6,6 +6,7 @@ #include "fw_reset.h" #include "diag/fw_tracer.h" #include "lib/tout.h" +#include "sf/sf.h" enum { MLX5_FW_RESET_FLAGS_RESET_REQUESTED, @@ -397,6 +398,11 @@ static bool mlx5_is_reset_now_capable(struct mlx5_core_dev *dev) return false; } + if (!mlx5_core_is_ecpf(dev) && !mlx5_sf_table_empty(dev)) { + mlx5_core_warn(dev, "SFs should be removed before reset\n"); + return false; + } + #if IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE) err = mlx5_check_hotplug_interrupt(dev); if (err) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c index 3f0ac2d1dde6..c9089f2ec5f2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -541,3 +541,13 @@ void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) WARN_ON(!xa_empty(&table->function_ids)); kfree(table); } + +bool mlx5_sf_table_empty(const struct mlx5_core_dev *dev) +{ + struct mlx5_sf_table *table = dev->priv.sf_table; + + if (!table) + return true; + + return xa_empty(&table->function_ids); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h index 860f9ddb7107..89559a37997a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h @@ -17,6 +17,7 @@ void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev); int mlx5_sf_table_init(struct mlx5_core_dev *dev); void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev); +bool mlx5_sf_table_empty(const struct mlx5_core_dev *dev); int mlx5_devlink_sf_port_new(struct devlink *devlink, const struct devlink_port_new_attrs *add_attr, @@ -61,6 +62,11 @@ static inline void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) { } +static inline bool mlx5_sf_table_empty(const struct mlx5_core_dev *dev) +{ + return true; +} + #endif #endif From cdd96ed125242afb2add7d760bd634799914a9a1 Mon Sep 17 00:00:00 2001 From: Alexei Lazar Date: Mon, 25 Aug 2025 17:34:32 +0300 Subject: [PATCH 44/76] net/mlx5e: Update and set Xon/Xoff upon MTU set [ Upstream commit ceddedc969f0532b7c62ca971ee50d519d2bc0cb ] Xon/Xoff sizes are derived from calculation that include the MTU size. Set Xon/Xoff when MTU is set. If Xon/Xoff fails, set the previous MTU. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Alexei Lazar Reviewed-by: Tariq Toukan Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250825143435.598584-10-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- .../mellanox/mlx5/core/en/port_buffer.h | 12 ++++++++++++ .../net/ethernet/mellanox/mlx5/core/en_main.c | 17 ++++++++++++++++- 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h b/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h index f4a19ffbb641..66d276a1be83 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h @@ -66,11 +66,23 @@ struct mlx5e_port_buffer { struct mlx5e_bufferx_reg buffer[MLX5E_MAX_NETWORK_BUFFER]; }; +#ifdef CONFIG_MLX5_CORE_EN_DCB int mlx5e_port_manual_buffer_config(struct mlx5e_priv *priv, u32 change, unsigned int mtu, struct ieee_pfc *pfc, u32 *buffer_size, u8 *prio2buffer); +#else +static inline int +mlx5e_port_manual_buffer_config(struct mlx5e_priv *priv, + u32 change, unsigned int mtu, + void *pfc, + u32 *buffer_size, + u8 *prio2buffer) +{ + return 0; +} +#endif int mlx5e_port_query_buffer(struct mlx5e_priv *priv, struct mlx5e_port_buffer *port_buffer); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 5c6f01abdcb9..09ba60b2e744 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -44,6 +44,7 @@ #include "eswitch.h" #include "en.h" #include "en/txrx.h" +#include "en/port_buffer.h" #include "en_tc.h" #include "en_rep.h" #include "en_accel/ipsec.h" @@ -2722,9 +2723,11 @@ int mlx5e_set_dev_port_mtu(struct mlx5e_priv *priv) struct mlx5e_params *params = &priv->channels.params; struct net_device *netdev = priv->netdev; struct mlx5_core_dev *mdev = priv->mdev; - u16 mtu; + u16 mtu, prev_mtu; int err; + mlx5e_query_mtu(mdev, params, &prev_mtu); + err = mlx5e_set_mtu(mdev, params, params->sw_mtu); if (err) return err; @@ -2734,6 +2737,18 @@ int mlx5e_set_dev_port_mtu(struct mlx5e_priv *priv) netdev_warn(netdev, "%s: VPort MTU %d is different than netdev mtu %d\n", __func__, mtu, params->sw_mtu); + if (mtu != prev_mtu && MLX5_BUFFER_SUPPORTED(mdev)) { + err = mlx5e_port_manual_buffer_config(priv, 0, mtu, + NULL, NULL, NULL); + if (err) { + netdev_warn(netdev, "%s: Failed to set Xon/Xoff values with MTU %d (err %d), setting back to previous MTU %d\n", + __func__, mtu, err, prev_mtu); + + mlx5e_set_mtu(mdev, params, prev_mtu); + return err; + } + } + params->sw_mtu = mtu; return 0; } From dec9d873bdf79cd7edbd7162c2b3fa958bf802b4 Mon Sep 17 00:00:00 2001 From: Alexei Lazar Date: Mon, 25 Aug 2025 17:34:33 +0300 Subject: [PATCH 45/76] net/mlx5e: Update and set Xon/Xoff upon port speed set [ Upstream commit d24341740fe48add8a227a753e68b6eedf4b385a ] Xon/Xoff sizes are derived from calculations that include the port speed. These settings need to be updated and applied whenever the port speed is changed. The port speed is typically set after the physical link goes down and is negotiated as part of the link-up process between the two connected interfaces. Xon/Xoff parameters being updated at the point where the new negotiated speed is established. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Alexei Lazar Reviewed-by: Tariq Toukan Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250825143435.598584-11-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 09ba60b2e744..d378aa55f22f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -109,6 +109,8 @@ void mlx5e_update_carrier(struct mlx5e_priv *priv) if (up) { netdev_info(priv->netdev, "Link up\n"); netif_carrier_on(priv->netdev); + mlx5e_port_manual_buffer_config(priv, 0, priv->netdev->mtu, + NULL, NULL, NULL); } else { netdev_info(priv->netdev, "Link down\n"); netif_carrier_off(priv->netdev); From 9b0acd3bb2910430c9d92dd5522edb0593321feb Mon Sep 17 00:00:00 2001 From: Alexei Lazar Date: Mon, 25 Aug 2025 17:34:34 +0300 Subject: [PATCH 46/76] net/mlx5e: Set local Xoff after FW update [ Upstream commit aca0c31af61e0d5cf1675a0cbd29460b95ae693c ] The local Xoff value is being set before the firmware (FW) update. In case of a failure where the FW is not updated with the new value, there is no fallback to the previous value. Update the local Xoff value after the FW has been successfully set. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Alexei Lazar Reviewed-by: Tariq Toukan Reviewed-by: Dragos Tatulea Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250825143435.598584-12-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.c b/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.c index 3efa8bf1d14e..4720523813b9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.c @@ -575,7 +575,6 @@ int mlx5e_port_manual_buffer_config(struct mlx5e_priv *priv, if (err) return err; } - priv->dcbx.xoff = xoff; /* Apply the settings */ if (update_buffer) { @@ -584,6 +583,8 @@ int mlx5e_port_manual_buffer_config(struct mlx5e_priv *priv, return err; } + priv->dcbx.xoff = xoff; + if (update_prio2buffer) err = mlx5e_port_set_priority2buffer(priv->mdev, prio2buffer); From ce006b60fc495e2592a65b94d397114c96ff5f94 Mon Sep 17 00:00:00 2001 From: Rohan G Thomas Date: Mon, 25 Aug 2025 12:36:52 +0800 Subject: [PATCH 47/76] net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts [ Upstream commit 4f23382841e67174211271a454811dd17c0ef3c5 ] Enabling RX FIFO Overflow interrupts is counterproductive and causes an interrupt storm when RX FIFO overflows. Disabling this interrupt has no side effect and eliminates interrupt storms when the RX FIFO overflows. Commit 8a7cb245cf28 ("net: stmmac: Do not enable RX FIFO overflow interrupts") disables RX FIFO overflow interrupts for DWMAC4 IP and removes the corresponding handling of this interrupt. This patch is doing the same thing for XGMAC IP. Fixes: 2142754f8b9c ("net: stmmac: Add MAC related callbacks for XGMAC2") Signed-off-by: Rohan G Thomas Reviewed-by: Matthew Gerlach Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20250825-xgmac-minor-fixes-v3-1-c225fe4444c0@altera.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c index 05ea74e93793..37c4258fef79 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c @@ -203,10 +203,6 @@ static void dwxgmac2_dma_rx_mode(struct stmmac_priv *priv, void __iomem *ioaddr, } writel(value, ioaddr + XGMAC_MTL_RXQ_OPMODE(channel)); - - /* Enable MTL RX overflow */ - value = readl(ioaddr + XGMAC_MTL_QINTEN(channel)); - writel(value | XGMAC_RXOIE, ioaddr + XGMAC_MTL_QINTEN(channel)); } static void dwxgmac2_dma_tx_mode(struct stmmac_priv *priv, void __iomem *ioaddr, From d02d635fc03d5bfe26106c929109290c5fc862e2 Mon Sep 17 00:00:00 2001 From: Serge Semin Date: Fri, 19 Apr 2024 12:03:05 +0300 Subject: [PATCH 48/76] net: stmmac: Rename phylink_get_caps() callback to update_caps() [ Upstream commit dc144baeb4fbfa0d91ce9c3875307566f58704ec ] Since recent commits the stmmac_ops::phylink_get_caps() callback has no longer been responsible for the phylink MAC capabilities getting, but merely updates the MAC capabilities in the mac_device_info::link::caps field. Rename the callback to comply with the what the method does now. Signed-off-by: Serge Semin Reviewed-by: Romain Gantois Signed-off-by: Paolo Abeni Stable-dep-of: 42ef11b2bff5 ("net: stmmac: xgmac: Correct supported speed modes") Signed-off-by: Sasha Levin --- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 8 ++++---- drivers/net/ethernet/stmicro/stmmac/hwif.h | 8 ++++---- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++--- 3 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index a9837985a483..bdb4f527289d 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -69,7 +69,7 @@ static void dwmac4_core_init(struct mac_device_info *hw, init_waitqueue_head(&priv->tstamp_busy_wait); } -static void dwmac4_phylink_get_caps(struct stmmac_priv *priv) +static void dwmac4_update_caps(struct stmmac_priv *priv) { if (priv->plat->tx_queues_to_use > 1) priv->hw->link.caps &= ~(MAC_10HD | MAC_100HD | MAC_1000HD); @@ -1161,7 +1161,7 @@ static int dwmac4_config_l4_filter(struct mac_device_info *hw, u32 filter_no, const struct stmmac_ops dwmac4_ops = { .core_init = dwmac4_core_init, - .phylink_get_caps = dwmac4_phylink_get_caps, + .update_caps = dwmac4_update_caps, .set_mac = stmmac_set_mac, .rx_ipc = dwmac4_rx_ipc_enable, .rx_queue_enable = dwmac4_rx_queue_enable, @@ -1204,7 +1204,7 @@ const struct stmmac_ops dwmac4_ops = { const struct stmmac_ops dwmac410_ops = { .core_init = dwmac4_core_init, - .phylink_get_caps = dwmac4_phylink_get_caps, + .update_caps = dwmac4_update_caps, .set_mac = stmmac_dwmac4_set_mac, .rx_ipc = dwmac4_rx_ipc_enable, .rx_queue_enable = dwmac4_rx_queue_enable, @@ -1253,7 +1253,7 @@ const struct stmmac_ops dwmac410_ops = { const struct stmmac_ops dwmac510_ops = { .core_init = dwmac4_core_init, - .phylink_get_caps = dwmac4_phylink_get_caps, + .update_caps = dwmac4_update_caps, .set_mac = stmmac_dwmac4_set_mac, .rx_ipc = dwmac4_rx_ipc_enable, .rx_queue_enable = dwmac4_rx_queue_enable, diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h index 47fb8e1646c2..ee9a7d98648b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/hwif.h +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h @@ -300,8 +300,8 @@ struct stmmac_est; struct stmmac_ops { /* MAC core initialization */ void (*core_init)(struct mac_device_info *hw, struct net_device *dev); - /* Get phylink capabilities */ - void (*phylink_get_caps)(struct stmmac_priv *priv); + /* Update MAC capabilities */ + void (*update_caps)(struct stmmac_priv *priv); /* Enable the MAC RX/TX */ void (*set_mac)(void __iomem *ioaddr, bool enable); /* Enable and verify that the IPC module is supported */ @@ -423,8 +423,8 @@ struct stmmac_ops { #define stmmac_core_init(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, core_init, __args) -#define stmmac_mac_phylink_get_caps(__priv) \ - stmmac_do_void_callback(__priv, mac, phylink_get_caps, __priv) +#define stmmac_mac_update_caps(__priv) \ + stmmac_do_void_callback(__priv, mac, update_caps, __priv) #define stmmac_mac_set(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, set_mac, __args) #define stmmac_rx_ipc(__priv, __args...) \ diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 615d25a0e46b..fa8ee0624f2f 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1230,8 +1230,8 @@ static int stmmac_phy_setup(struct stmmac_priv *priv) xpcs_get_interfaces(priv->hw->xpcs, priv->phylink_config.supported_interfaces); - /* Get the MAC specific capabilities */ - stmmac_mac_phylink_get_caps(priv); + /* Refresh the MAC-specific capabilities */ + stmmac_mac_update_caps(priv); priv->phylink_config.mac_capabilities = priv->hw->link.caps; @@ -7232,7 +7232,7 @@ int stmmac_reinit_queues(struct net_device *dev, u32 rx_cnt, u32 tx_cnt) priv->rss.table[i] = ethtool_rxfh_indir_default(i, rx_cnt); - stmmac_mac_phylink_get_caps(priv); + stmmac_mac_update_caps(priv); priv->phylink_config.mac_capabilities = priv->hw->link.caps; From dc38d0111c16ff8821e20d441ac54565e3893486 Mon Sep 17 00:00:00 2001 From: Rohan G Thomas Date: Mon, 25 Aug 2025 12:36:53 +0800 Subject: [PATCH 49/76] net: stmmac: xgmac: Correct supported speed modes [ Upstream commit 42ef11b2bff5b6a2910c28d2ea47cc00e0fbcaec ] Correct supported speed modes as per the XGMAC databook. Commit 9cb54af214a7 ("net: stmmac: Fix IP-cores specific MAC capabilities") removes support for 10M, 100M and 1000HD. 1000HD is not supported by XGMAC IP, but it does support 10M and 100M FD mode for XGMAC version >= 2_20, and it also supports 10M and 100M HD mode if the HDSEL bit is set in the MAC_HW_FEATURE0 reg. This commit enables support for 10M and 100M speed modes for XGMAC IP based on XGMAC version and MAC capabilities. Fixes: 9cb54af214a7 ("net: stmmac: Fix IP-cores specific MAC capabilities") Signed-off-by: Rohan G Thomas Reviewed-by: Matthew Gerlach Link: https://patch.msgid.link/20250825-xgmac-minor-fixes-v3-2-c225fe4444c0@altera.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 13 +++++++++++-- drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 5 +++++ 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c index 052566f5b7f3..0bcb378fa0bc 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c @@ -47,6 +47,14 @@ static void dwxgmac2_core_init(struct mac_device_info *hw, writel(XGMAC_INT_DEFAULT_EN, ioaddr + XGMAC_INT_EN); } +static void dwxgmac2_update_caps(struct stmmac_priv *priv) +{ + if (!priv->dma_cap.mbps_10_100) + priv->hw->link.caps &= ~(MAC_10 | MAC_100); + else if (!priv->dma_cap.half_duplex) + priv->hw->link.caps &= ~(MAC_10HD | MAC_100HD); +} + static void dwxgmac2_set_mac(void __iomem *ioaddr, bool enable) { u32 tx = readl(ioaddr + XGMAC_TX_CONFIG); @@ -1583,6 +1591,7 @@ static void dwxgmac3_fpe_configure(void __iomem *ioaddr, struct stmmac_fpe_cfg * const struct stmmac_ops dwxgmac210_ops = { .core_init = dwxgmac2_core_init, + .update_caps = dwxgmac2_update_caps, .set_mac = dwxgmac2_set_mac, .rx_ipc = dwxgmac2_rx_ipc, .rx_queue_enable = dwxgmac2_rx_queue_enable, @@ -1705,8 +1714,8 @@ int dwxgmac2_setup(struct stmmac_priv *priv) mac->mcast_bits_log2 = ilog2(mac->multicast_filter_bins); mac->link.caps = MAC_ASYM_PAUSE | MAC_SYM_PAUSE | - MAC_1000FD | MAC_2500FD | MAC_5000FD | - MAC_10000FD; + MAC_10 | MAC_100 | MAC_1000FD | + MAC_2500FD | MAC_5000FD | MAC_10000FD; mac->link.duplex = 0; mac->link.speed10 = XGMAC_CONFIG_SS_10_MII; mac->link.speed100 = XGMAC_CONFIG_SS_100_MII; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c index 37c4258fef79..b2c03cb65c7c 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c @@ -382,8 +382,11 @@ static int dwxgmac2_dma_interrupt(struct stmmac_priv *priv, static int dwxgmac2_get_hw_feature(void __iomem *ioaddr, struct dma_features *dma_cap) { + struct stmmac_priv *priv; u32 hw_cap; + priv = container_of(dma_cap, struct stmmac_priv, dma_cap); + /* MAC HW feature 0 */ hw_cap = readl(ioaddr + XGMAC_HW_FEATURE0); dma_cap->edma = (hw_cap & XGMAC_HWFEAT_EDMA) >> 31; @@ -406,6 +409,8 @@ static int dwxgmac2_get_hw_feature(void __iomem *ioaddr, dma_cap->vlhash = (hw_cap & XGMAC_HWFEAT_VLHASH) >> 4; dma_cap->half_duplex = (hw_cap & XGMAC_HWFEAT_HDSEL) >> 3; dma_cap->mbps_1000 = (hw_cap & XGMAC_HWFEAT_GMIISEL) >> 1; + if (dma_cap->mbps_1000 && priv->synopsys_id >= DWXGMAC_CORE_2_20) + dma_cap->mbps_10_100 = 1; /* MAC HW feature 1 */ hw_cap = readl(ioaddr + XGMAC_HW_FEATURE1); From b0f8725196aedfbd2eb2361d4775af6d7260394c Mon Sep 17 00:00:00 2001 From: Rohan G Thomas Date: Mon, 25 Aug 2025 12:36:54 +0800 Subject: [PATCH 50/76] net: stmmac: Set CIC bit only for TX queues with COE [ Upstream commit b1eded580ab28119de0b0f21efe37ee2b4419144 ] Currently, in the AF_XDP transmit paths, the CIC bit of TX Desc3 is set for all packets. Setting this bit for packets transmitting through queues that don't support checksum offloading causes the TX DMA to get stuck after transmitting some packets. This patch ensures the CIC bit of TX Desc3 is set only if the TX queue supports checksum offloading. Fixes: 132c32ee5bc0 ("net: stmmac: Add TX via XDP zero-copy socket") Signed-off-by: Rohan G Thomas Reviewed-by: Matthew Gerlach Link: https://patch.msgid.link/20250825-xgmac-minor-fixes-v3-3-c225fe4444c0@altera.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index fa8ee0624f2f..ff5389a8efc3 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2426,6 +2426,7 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) struct netdev_queue *nq = netdev_get_tx_queue(priv->dev, queue); struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue]; struct stmmac_txq_stats *txq_stats = &priv->xstats.txq_stats[queue]; + bool csum = !priv->plat->tx_queues_cfg[queue].coe_unsupported; struct xsk_buff_pool *pool = tx_q->xsk_pool; unsigned int entry = tx_q->cur_tx; struct dma_desc *tx_desc = NULL; @@ -2496,7 +2497,7 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) } stmmac_prepare_tx_desc(priv, tx_desc, 1, xdp_desc.len, - true, priv->mode, true, true, + csum, priv->mode, true, true, xdp_desc.len); stmmac_enable_dma_transmission(priv, priv->ioaddr); @@ -4789,6 +4790,7 @@ static int stmmac_xdp_xmit_xdpf(struct stmmac_priv *priv, int queue, { struct stmmac_txq_stats *txq_stats = &priv->xstats.txq_stats[queue]; struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue]; + bool csum = !priv->plat->tx_queues_cfg[queue].coe_unsupported; unsigned int entry = tx_q->cur_tx; struct dma_desc *tx_desc; dma_addr_t dma_addr; @@ -4833,7 +4835,7 @@ static int stmmac_xdp_xmit_xdpf(struct stmmac_priv *priv, int queue, stmmac_set_desc_addr(priv, tx_desc, dma_addr); stmmac_prepare_tx_desc(priv, tx_desc, 1, xdpf->len, - true, priv->mode, true, true, + csum, priv->mode, true, true, xdpf->len); tx_q->tx_count_frames++; From 4998ab3eb2b8a904be2d988899fd1316ed1fdc8e Mon Sep 17 00:00:00 2001 From: Takamitsu Iwai Date: Sat, 23 Aug 2025 17:58:55 +0900 Subject: [PATCH 51/76] net: rose: split remove and free operations in rose_remove_neigh() [ Upstream commit dcb34659028f856c423a29ef9b4e2571d203444d ] The current rose_remove_neigh() performs two distinct operations: 1. Removes rose_neigh from rose_neigh_list 2. Frees the rose_neigh structure Split these operations into separate functions to improve maintainability and prepare for upcoming refcount_t conversion. The timer cleanup remains in rose_remove_neigh() because free operations can be called from timer itself. This patch introduce rose_neigh_put() to handle the freeing of rose_neigh structures and modify rose_remove_neigh() to handle removal only. Signed-off-by: Takamitsu Iwai Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20250823085857.47674-2-takamitz@amazon.co.jp Signed-off-by: Jakub Kicinski Stable-dep-of: d860d1faa6b2 ("net: rose: convert 'use' field to refcount_t") Signed-off-by: Sasha Levin --- include/net/rose.h | 8 ++++++++ net/rose/rose_route.c | 15 ++++++--------- 2 files changed, 14 insertions(+), 9 deletions(-) diff --git a/include/net/rose.h b/include/net/rose.h index 23267b4efcfa..174b4f605d84 100644 --- a/include/net/rose.h +++ b/include/net/rose.h @@ -151,6 +151,14 @@ struct rose_sock { #define rose_sk(sk) ((struct rose_sock *)(sk)) +static inline void rose_neigh_put(struct rose_neigh *rose_neigh) +{ + if (rose_neigh->ax25) + ax25_cb_put(rose_neigh->ax25); + kfree(rose_neigh->digipeat); + kfree(rose_neigh); +} + /* af_rose.c */ extern ax25_address rose_callsign; extern int sysctl_rose_restart_request_timeout; diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c index a7054546f52d..b406b1e0fb1e 100644 --- a/net/rose/rose_route.c +++ b/net/rose/rose_route.c @@ -234,20 +234,12 @@ static void rose_remove_neigh(struct rose_neigh *rose_neigh) if ((s = rose_neigh_list) == rose_neigh) { rose_neigh_list = rose_neigh->next; - if (rose_neigh->ax25) - ax25_cb_put(rose_neigh->ax25); - kfree(rose_neigh->digipeat); - kfree(rose_neigh); return; } while (s != NULL && s->next != NULL) { if (s->next == rose_neigh) { s->next = rose_neigh->next; - if (rose_neigh->ax25) - ax25_cb_put(rose_neigh->ax25); - kfree(rose_neigh->digipeat); - kfree(rose_neigh); return; } @@ -331,8 +323,10 @@ static int rose_del_node(struct rose_route_struct *rose_route, if (rose_node->neighbour[i] == rose_neigh) { rose_neigh->count--; - if (rose_neigh->count == 0 && rose_neigh->use == 0) + if (rose_neigh->count == 0 && rose_neigh->use == 0) { rose_remove_neigh(rose_neigh); + rose_neigh_put(rose_neigh); + } rose_node->count--; @@ -513,6 +507,7 @@ void rose_rt_device_down(struct net_device *dev) } rose_remove_neigh(s); + rose_neigh_put(s); } spin_unlock_bh(&rose_neigh_list_lock); spin_unlock_bh(&rose_node_list_lock); @@ -569,6 +564,7 @@ static int rose_clear_routes(void) if (s->use == 0 && !s->loopback) { s->count = 0; rose_remove_neigh(s); + rose_neigh_put(s); } } @@ -1301,6 +1297,7 @@ void __exit rose_rt_free(void) rose_neigh = rose_neigh->next; rose_remove_neigh(s); + rose_neigh_put(s); } while (rose_node != NULL) { From f8c29fc437d03a98fb075c31c5be761cc8326284 Mon Sep 17 00:00:00 2001 From: Takamitsu Iwai Date: Sat, 23 Aug 2025 17:58:56 +0900 Subject: [PATCH 52/76] net: rose: convert 'use' field to refcount_t [ Upstream commit d860d1faa6b2ce3becfdb8b0c2b048ad31800061 ] The 'use' field in struct rose_neigh is used as a reference counter but lacks atomicity. This can lead to race conditions where a rose_neigh structure is freed while still being referenced by other code paths. For example, when rose_neigh->use becomes zero during an ioctl operation via rose_rt_ioctl(), the structure may be removed while its timer is still active, potentially causing use-after-free issues. This patch changes the type of 'use' from unsigned short to refcount_t and updates all code paths to use rose_neigh_hold() and rose_neigh_put() which operate reference counts atomically. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Takamitsu Iwai Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20250823085857.47674-3-takamitz@amazon.co.jp Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- include/net/rose.h | 18 +++++++++++++----- net/rose/af_rose.c | 13 +++++++------ net/rose/rose_in.c | 12 ++++++------ net/rose/rose_route.c | 33 ++++++++++++++++++--------------- net/rose/rose_timer.c | 2 +- 5 files changed, 45 insertions(+), 33 deletions(-) diff --git a/include/net/rose.h b/include/net/rose.h index 174b4f605d84..2b5491bbf39a 100644 --- a/include/net/rose.h +++ b/include/net/rose.h @@ -8,6 +8,7 @@ #ifndef _ROSE_H #define _ROSE_H +#include #include #include #include @@ -96,7 +97,7 @@ struct rose_neigh { ax25_cb *ax25; struct net_device *dev; unsigned short count; - unsigned short use; + refcount_t use; unsigned int number; char restarted; char dce_mode; @@ -151,12 +152,19 @@ struct rose_sock { #define rose_sk(sk) ((struct rose_sock *)(sk)) +static inline void rose_neigh_hold(struct rose_neigh *rose_neigh) +{ + refcount_inc(&rose_neigh->use); +} + static inline void rose_neigh_put(struct rose_neigh *rose_neigh) { - if (rose_neigh->ax25) - ax25_cb_put(rose_neigh->ax25); - kfree(rose_neigh->digipeat); - kfree(rose_neigh); + if (refcount_dec_and_test(&rose_neigh->use)) { + if (rose_neigh->ax25) + ax25_cb_put(rose_neigh->ax25); + kfree(rose_neigh->digipeat); + kfree(rose_neigh); + } } /* af_rose.c */ diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index 66e9ceaaa43a..614695444b6a 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -170,7 +170,7 @@ void rose_kill_by_neigh(struct rose_neigh *neigh) if (rose->neighbour == neigh) { rose_disconnect(s, ENETUNREACH, ROSE_OUT_OF_ORDER, 0); - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); rose->neighbour = NULL; } } @@ -212,7 +212,7 @@ start: if (rose->device == dev) { rose_disconnect(sk, ENETUNREACH, ROSE_OUT_OF_ORDER, 0); if (rose->neighbour) - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); netdev_put(rose->device, &rose->dev_tracker); rose->device = NULL; } @@ -655,7 +655,7 @@ static int rose_release(struct socket *sock) break; case ROSE_STATE_2: - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); release_sock(sk); rose_disconnect(sk, 0, -1, -1); lock_sock(sk); @@ -823,6 +823,7 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le rose->lci = rose_new_lci(rose->neighbour); if (!rose->lci) { err = -ENETUNREACH; + rose_neigh_put(rose->neighbour); goto out_release; } @@ -834,12 +835,14 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le dev = rose_dev_first(); if (!dev) { err = -ENETUNREACH; + rose_neigh_put(rose->neighbour); goto out_release; } user = ax25_findbyuid(current_euid()); if (!user) { err = -EINVAL; + rose_neigh_put(rose->neighbour); dev_put(dev); goto out_release; } @@ -874,8 +877,6 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le rose->state = ROSE_STATE_1; - rose->neighbour->use++; - rose_write_internal(sk, ROSE_CALL_REQUEST); rose_start_heartbeat(sk); rose_start_t1timer(sk); @@ -1077,7 +1078,7 @@ int rose_rx_call_request(struct sk_buff *skb, struct net_device *dev, struct ros GFP_ATOMIC); make_rose->facilities = facilities; - make_rose->neighbour->use++; + rose_neigh_hold(make_rose->neighbour); if (rose_sk(sk)->defer) { make_rose->state = ROSE_STATE_5; diff --git a/net/rose/rose_in.c b/net/rose/rose_in.c index 4d67f36dce1b..7caae93937ee 100644 --- a/net/rose/rose_in.c +++ b/net/rose/rose_in.c @@ -56,7 +56,7 @@ static int rose_state1_machine(struct sock *sk, struct sk_buff *skb, int framety case ROSE_CLEAR_REQUEST: rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION); rose_disconnect(sk, ECONNREFUSED, skb->data[3], skb->data[4]); - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); break; default: @@ -79,12 +79,12 @@ static int rose_state2_machine(struct sock *sk, struct sk_buff *skb, int framety case ROSE_CLEAR_REQUEST: rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION); rose_disconnect(sk, 0, skb->data[3], skb->data[4]); - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); break; case ROSE_CLEAR_CONFIRMATION: rose_disconnect(sk, 0, -1, -1); - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); break; default: @@ -120,7 +120,7 @@ static int rose_state3_machine(struct sock *sk, struct sk_buff *skb, int framety case ROSE_CLEAR_REQUEST: rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION); rose_disconnect(sk, 0, skb->data[3], skb->data[4]); - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); break; case ROSE_RR: @@ -233,7 +233,7 @@ static int rose_state4_machine(struct sock *sk, struct sk_buff *skb, int framety case ROSE_CLEAR_REQUEST: rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION); rose_disconnect(sk, 0, skb->data[3], skb->data[4]); - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); break; default: @@ -253,7 +253,7 @@ static int rose_state5_machine(struct sock *sk, struct sk_buff *skb, int framety if (frametype == ROSE_CLEAR_REQUEST) { rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION); rose_disconnect(sk, 0, skb->data[3], skb->data[4]); - rose_sk(sk)->neighbour->use--; + rose_neigh_put(rose_sk(sk)->neighbour); } return 0; diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c index b406b1e0fb1e..42460da0854d 100644 --- a/net/rose/rose_route.c +++ b/net/rose/rose_route.c @@ -93,11 +93,11 @@ static int __must_check rose_add_node(struct rose_route_struct *rose_route, rose_neigh->ax25 = NULL; rose_neigh->dev = dev; rose_neigh->count = 0; - rose_neigh->use = 0; rose_neigh->dce_mode = 0; rose_neigh->loopback = 0; rose_neigh->number = rose_neigh_no++; rose_neigh->restarted = 0; + refcount_set(&rose_neigh->use, 1); skb_queue_head_init(&rose_neigh->queue); @@ -255,10 +255,10 @@ static void rose_remove_route(struct rose_route *rose_route) struct rose_route *s; if (rose_route->neigh1 != NULL) - rose_route->neigh1->use--; + rose_neigh_put(rose_route->neigh1); if (rose_route->neigh2 != NULL) - rose_route->neigh2->use--; + rose_neigh_put(rose_route->neigh2); if ((s = rose_route_list) == rose_route) { rose_route_list = rose_route->next; @@ -323,7 +323,7 @@ static int rose_del_node(struct rose_route_struct *rose_route, if (rose_node->neighbour[i] == rose_neigh) { rose_neigh->count--; - if (rose_neigh->count == 0 && rose_neigh->use == 0) { + if (rose_neigh->count == 0) { rose_remove_neigh(rose_neigh); rose_neigh_put(rose_neigh); } @@ -375,11 +375,11 @@ void rose_add_loopback_neigh(void) sn->ax25 = NULL; sn->dev = NULL; sn->count = 0; - sn->use = 0; sn->dce_mode = 1; sn->loopback = 1; sn->number = rose_neigh_no++; sn->restarted = 1; + refcount_set(&sn->use, 1); skb_queue_head_init(&sn->queue); @@ -561,8 +561,7 @@ static int rose_clear_routes(void) s = rose_neigh; rose_neigh = rose_neigh->next; - if (s->use == 0 && !s->loopback) { - s->count = 0; + if (!s->loopback) { rose_remove_neigh(s); rose_neigh_put(s); } @@ -680,6 +679,7 @@ struct rose_neigh *rose_get_neigh(rose_address *addr, unsigned char *cause, for (i = 0; i < node->count; i++) { if (node->neighbour[i]->restarted) { res = node->neighbour[i]; + rose_neigh_hold(node->neighbour[i]); goto out; } } @@ -691,6 +691,7 @@ struct rose_neigh *rose_get_neigh(rose_address *addr, unsigned char *cause, for (i = 0; i < node->count; i++) { if (!rose_ftimer_running(node->neighbour[i])) { res = node->neighbour[i]; + rose_neigh_hold(node->neighbour[i]); goto out; } failed = 1; @@ -780,13 +781,13 @@ static void rose_del_route_by_neigh(struct rose_neigh *rose_neigh) } if (rose_route->neigh1 == rose_neigh) { - rose_route->neigh1->use--; + rose_neigh_put(rose_route->neigh1); rose_route->neigh1 = NULL; rose_transmit_clear_request(rose_route->neigh2, rose_route->lci2, ROSE_OUT_OF_ORDER, 0); } if (rose_route->neigh2 == rose_neigh) { - rose_route->neigh2->use--; + rose_neigh_put(rose_route->neigh2); rose_route->neigh2 = NULL; rose_transmit_clear_request(rose_route->neigh1, rose_route->lci1, ROSE_OUT_OF_ORDER, 0); } @@ -915,7 +916,7 @@ int rose_route_frame(struct sk_buff *skb, ax25_cb *ax25) rose_clear_queues(sk); rose->cause = ROSE_NETWORK_CONGESTION; rose->diagnostic = 0; - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); rose->neighbour = NULL; rose->lci = 0; rose->state = ROSE_STATE_0; @@ -1040,12 +1041,12 @@ int rose_route_frame(struct sk_buff *skb, ax25_cb *ax25) if ((new_lci = rose_new_lci(new_neigh)) == 0) { rose_transmit_clear_request(rose_neigh, lci, ROSE_NETWORK_CONGESTION, 71); - goto out; + goto put_neigh; } if ((rose_route = kmalloc(sizeof(*rose_route), GFP_ATOMIC)) == NULL) { rose_transmit_clear_request(rose_neigh, lci, ROSE_NETWORK_CONGESTION, 120); - goto out; + goto put_neigh; } rose_route->lci1 = lci; @@ -1058,8 +1059,8 @@ int rose_route_frame(struct sk_buff *skb, ax25_cb *ax25) rose_route->lci2 = new_lci; rose_route->neigh2 = new_neigh; - rose_route->neigh1->use++; - rose_route->neigh2->use++; + rose_neigh_hold(rose_route->neigh1); + rose_neigh_hold(rose_route->neigh2); rose_route->next = rose_route_list; rose_route_list = rose_route; @@ -1071,6 +1072,8 @@ int rose_route_frame(struct sk_buff *skb, ax25_cb *ax25) rose_transmit_link(skb, rose_route->neigh2); res = 1; +put_neigh: + rose_neigh_put(new_neigh); out: spin_unlock_bh(&rose_route_list_lock); spin_unlock_bh(&rose_neigh_list_lock); @@ -1186,7 +1189,7 @@ static int rose_neigh_show(struct seq_file *seq, void *v) (rose_neigh->loopback) ? "RSLOOP-0" : ax2asc(buf, &rose_neigh->callsign), rose_neigh->dev ? rose_neigh->dev->name : "???", rose_neigh->count, - rose_neigh->use, + refcount_read(&rose_neigh->use) - 1, (rose_neigh->dce_mode) ? "DCE" : "DTE", (rose_neigh->restarted) ? "yes" : "no", ax25_display_timer(&rose_neigh->t0timer) / HZ, diff --git a/net/rose/rose_timer.c b/net/rose/rose_timer.c index 1525773e94aa..c52d7d20c519 100644 --- a/net/rose/rose_timer.c +++ b/net/rose/rose_timer.c @@ -180,7 +180,7 @@ static void rose_timer_expiry(struct timer_list *t) break; case ROSE_STATE_2: /* T3 */ - rose->neighbour->use--; + rose_neigh_put(rose->neighbour); rose_disconnect(sk, ETIMEDOUT, -1, -1); break; From 9c547c8eee9d1cf6e744611d688b9f725cf9a115 Mon Sep 17 00:00:00 2001 From: Takamitsu Iwai Date: Sat, 23 Aug 2025 17:58:57 +0900 Subject: [PATCH 53/76] net: rose: include node references in rose_neigh refcount [ Upstream commit da9c9c877597170b929a6121a68dcd3dd9a80f45 ] Current implementation maintains two separate reference counting mechanisms: the 'count' field in struct rose_neigh tracks references from rose_node structures, while the 'use' field (now refcount_t) tracks references from rose_sock. This patch merges these two reference counting systems using 'use' field for proper reference management. Specifically, this patch adds incrementing and decrementing of rose_neigh->use when rose_neigh->count is incremented or decremented. This patch also modifies rose_rt_free(), rose_rt_device_down() and rose_clear_route() to properly release references to rose_neigh objects before freeing a rose_node through rose_remove_node(). These changes ensure rose_neigh structures are properly freed only when all references, including those from rose_node structures, are released. As a result, this resolves a slab-use-after-free issue reported by Syzbot. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+942297eecf7d2d61d1f1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=942297eecf7d2d61d1f1 Signed-off-by: Takamitsu Iwai Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20250823085857.47674-4-takamitz@amazon.co.jp Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/rose/rose_route.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c index 42460da0854d..6acbb795c506 100644 --- a/net/rose/rose_route.c +++ b/net/rose/rose_route.c @@ -178,6 +178,7 @@ static int __must_check rose_add_node(struct rose_route_struct *rose_route, } } rose_neigh->count++; + rose_neigh_hold(rose_neigh); goto out; } @@ -187,6 +188,7 @@ static int __must_check rose_add_node(struct rose_route_struct *rose_route, rose_node->neighbour[rose_node->count] = rose_neigh; rose_node->count++; rose_neigh->count++; + rose_neigh_hold(rose_neigh); } out: @@ -322,6 +324,7 @@ static int rose_del_node(struct rose_route_struct *rose_route, for (i = 0; i < rose_node->count; i++) { if (rose_node->neighbour[i] == rose_neigh) { rose_neigh->count--; + rose_neigh_put(rose_neigh); if (rose_neigh->count == 0) { rose_remove_neigh(rose_neigh); @@ -430,6 +433,7 @@ int rose_add_loopback_node(const rose_address *address) rose_node_list = rose_node; rose_loopback_neigh->count++; + rose_neigh_hold(rose_loopback_neigh); out: spin_unlock_bh(&rose_node_list_lock); @@ -461,6 +465,7 @@ void rose_del_loopback_node(const rose_address *address) rose_remove_node(rose_node); rose_loopback_neigh->count--; + rose_neigh_put(rose_loopback_neigh); out: spin_unlock_bh(&rose_node_list_lock); @@ -500,6 +505,7 @@ void rose_rt_device_down(struct net_device *dev) memmove(&t->neighbour[i], &t->neighbour[i + 1], sizeof(t->neighbour[0]) * (t->count - i)); + rose_neigh_put(s); } if (t->count <= 0) @@ -543,6 +549,7 @@ static int rose_clear_routes(void) { struct rose_neigh *s, *rose_neigh; struct rose_node *t, *rose_node; + int i; spin_lock_bh(&rose_node_list_lock); spin_lock_bh(&rose_neigh_list_lock); @@ -553,8 +560,12 @@ static int rose_clear_routes(void) while (rose_node != NULL) { t = rose_node; rose_node = rose_node->next; - if (!t->loopback) + + if (!t->loopback) { + for (i = 0; i < rose_node->count; i++) + rose_neigh_put(t->neighbour[i]); rose_remove_node(t); + } } while (rose_neigh != NULL) { @@ -1189,7 +1200,7 @@ static int rose_neigh_show(struct seq_file *seq, void *v) (rose_neigh->loopback) ? "RSLOOP-0" : ax2asc(buf, &rose_neigh->callsign), rose_neigh->dev ? rose_neigh->dev->name : "???", rose_neigh->count, - refcount_read(&rose_neigh->use) - 1, + refcount_read(&rose_neigh->use) - rose_neigh->count - 1, (rose_neigh->dce_mode) ? "DCE" : "DTE", (rose_neigh->restarted) ? "yes" : "no", ax25_display_timer(&rose_neigh->t0timer) / HZ, @@ -1294,6 +1305,7 @@ void __exit rose_rt_free(void) struct rose_neigh *s, *rose_neigh = rose_neigh_list; struct rose_node *t, *rose_node = rose_node_list; struct rose_route *u, *rose_route = rose_route_list; + int i; while (rose_neigh != NULL) { s = rose_neigh; @@ -1307,6 +1319,8 @@ void __exit rose_rt_free(void) t = rose_node; rose_node = rose_node->next; + for (i = 0; i < t->count; i++) + rose_neigh_put(t->neighbour[i]); rose_remove_node(t); } From 463aa96fca6209bb205f49f7deea3817d7ddaa3a Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 26 Aug 2025 14:13:14 +0000 Subject: [PATCH 54/76] sctp: initialize more fields in sctp_v6_from_sk() [ Upstream commit 2e8750469242cad8f01f320131fd5a6f540dbb99 ] syzbot found that sin6_scope_id was not properly initialized, leading to undefined behavior. Clear sin6_scope_id and sin6_flowinfo. BUG: KMSAN: uninit-value in __sctp_v6_cmp_addr+0x887/0x8c0 net/sctp/ipv6.c:649 __sctp_v6_cmp_addr+0x887/0x8c0 net/sctp/ipv6.c:649 sctp_inet6_cmp_addr+0x4f2/0x510 net/sctp/ipv6.c:983 sctp_bind_addr_conflict+0x22a/0x3b0 net/sctp/bind_addr.c:390 sctp_get_port_local+0x21eb/0x2440 net/sctp/socket.c:8452 sctp_get_port net/sctp/socket.c:8523 [inline] sctp_listen_start net/sctp/socket.c:8567 [inline] sctp_inet_listen+0x710/0xfd0 net/sctp/socket.c:8636 __sys_listen_socket net/socket.c:1912 [inline] __sys_listen net/socket.c:1927 [inline] __do_sys_listen net/socket.c:1932 [inline] __se_sys_listen net/socket.c:1930 [inline] __x64_sys_listen+0x343/0x4c0 net/socket.c:1930 x64_sys_call+0x271d/0x3e20 arch/x86/include/generated/asm/syscalls_64.h:51 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd9/0x210 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Local variable addr.i.i created at: sctp_get_port net/sctp/socket.c:8515 [inline] sctp_listen_start net/sctp/socket.c:8567 [inline] sctp_inet_listen+0x650/0xfd0 net/sctp/socket.c:8636 __sys_listen_socket net/socket.c:1912 [inline] __sys_listen net/socket.c:1927 [inline] __do_sys_listen net/socket.c:1932 [inline] __se_sys_listen net/socket.c:1930 [inline] __x64_sys_listen+0x343/0x4c0 net/socket.c:1930 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+e69f06a0f30116c68056@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68adc0a2.050a0220.37038e.00c4.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Cc: Marcelo Ricardo Leitner Acked-by: Xin Long Link: https://patch.msgid.link/20250826141314.1802610-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/sctp/ipv6.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index 717828e53162..0673857cb3d8 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -547,7 +547,9 @@ static void sctp_v6_from_sk(union sctp_addr *addr, struct sock *sk) { addr->v6.sin6_family = AF_INET6; addr->v6.sin6_port = 0; + addr->v6.sin6_flowinfo = 0; addr->v6.sin6_addr = sk->sk_v6_rcv_saddr; + addr->v6.sin6_scope_id = 0; } /* Initialize sk->sk_rcv_saddr from sctp_addr. */ From 925599eba46045930b850a98ae594d2e3028ac40 Mon Sep 17 00:00:00 2001 From: Li Nan Date: Wed, 27 Aug 2025 15:39:54 +0800 Subject: [PATCH 55/76] efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare [ Upstream commit a6358f8cf64850f3f27857b8ed8c1b08cfc4685c ] Observed on kernel 6.6 (present on master as well): BUG: KASAN: slab-out-of-bounds in memcmp+0x98/0xd0 Call trace: kasan_check_range+0xe8/0x190 __asan_loadN+0x1c/0x28 memcmp+0x98/0xd0 efivarfs_d_compare+0x68/0xd8 __d_lookup_rcu_op_compare+0x178/0x218 __d_lookup_rcu+0x1f8/0x228 d_alloc_parallel+0x150/0x648 lookup_open.isra.0+0x5f0/0x8d0 open_last_lookups+0x264/0x828 path_openat+0x130/0x3f8 do_filp_open+0x114/0x248 do_sys_openat2+0x340/0x3c0 __arm64_sys_openat+0x120/0x1a0 If dentry->d_name.len < EFI_VARIABLE_GUID_LEN , 'guid' can become negative, leadings to oob. The issue can be triggered by parallel lookups using invalid filename: T1 T2 lookup_open ->lookup simple_lookup d_add // invalid dentry is added to hash list lookup_open d_alloc_parallel __d_lookup_rcu __d_lookup_rcu_op_compare hlist_bl_for_each_entry_rcu // invalid dentry can be retrieved ->d_compare efivarfs_d_compare // oob Fix it by checking 'guid' before cmp. Fixes: da27a24383b2 ("efivarfs: guid part of filenames are case-insensitive") Signed-off-by: Li Nan Signed-off-by: Wu Guanghao Signed-off-by: Ard Biesheuvel Signed-off-by: Sasha Levin --- fs/efivarfs/super.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c index 586c5709dfb5..34438981ddd8 100644 --- a/fs/efivarfs/super.c +++ b/fs/efivarfs/super.c @@ -90,6 +90,10 @@ static int efivarfs_d_compare(const struct dentry *dentry, { int guid = len - EFI_VARIABLE_GUID_LEN; + /* Parallel lookups may produce a temporary invalid filename */ + if (guid <= 0) + return 1; + if (name->len != len) return 1; From f49161646e03d107ce81a99c6ca5da682fe5fb69 Mon Sep 17 00:00:00 2001 From: Thijs Raymakers Date: Mon, 4 Aug 2025 08:44:05 +0200 Subject: [PATCH 56/76] KVM: x86: use array_index_nospec with indices that come from guest commit c87bd4dd43a624109c3cc42d843138378a7f4548 upstream. min and dest_id are guest-controlled indices. Using array_index_nospec() after the bounds checks clamps these values to mitigate speculative execution side-channels. Signed-off-by: Thijs Raymakers Cc: stable@vger.kernel.org Cc: Sean Christopherson Cc: Paolo Bonzini Cc: Greg Kroah-Hartman Fixes: 715062970f37 ("KVM: X86: Implement PV sched yield hypercall") Fixes: bdf7ffc89922 ("KVM: LAPIC: Fix pv ipis out-of-bounds access") Fixes: 4180bf1b655a ("KVM: X86: Implement "send IPI" hypercall") Link: https://lore.kernel.org/r/20250804064405.4802-1-thijs@raymakers.nl Signed-off-by: Sean Christopherson Signed-off-by: Greg Kroah-Hartman --- arch/x86/kvm/lapic.c | 2 ++ arch/x86/kvm/x86.c | 7 +++++-- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index ba1c2a7f74f7..af4ae9216667 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -847,6 +847,8 @@ static int __pv_send_ipi(unsigned long *ipi_bitmap, struct kvm_apic_map *map, if (min > map->max_apic_id) return 0; + min = array_index_nospec(min, map->max_apic_id + 1); + for_each_set_bit(i, ipi_bitmap, min((u32)BITS_PER_LONG, (map->max_apic_id - min + 1))) { if (map->phys_map[min + i]) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index af0b2b3bc991..5088065ac704 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9802,8 +9802,11 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id) rcu_read_lock(); map = rcu_dereference(vcpu->kvm->arch.apic_map); - if (likely(map) && dest_id <= map->max_apic_id && map->phys_map[dest_id]) - target = map->phys_map[dest_id]->vcpu; + if (likely(map) && dest_id <= map->max_apic_id) { + dest_id = array_index_nospec(dest_id, map->max_apic_id + 1); + if (map->phys_map[dest_id]) + target = map->phys_map[dest_id]->vcpu; + } rcu_read_unlock(); From c84ba4cdf4c65eef0f94661ab4cf3895594546a1 Mon Sep 17 00:00:00 2001 From: "Borislav Petkov (AMD)" Date: Wed, 20 Aug 2025 11:58:57 +0200 Subject: [PATCH 57/76] x86/microcode/AMD: Handle the case of no BIOS microcode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit fcf8239ad6a5de54fa7ce18e464c6b5951b982cb upstream. Machines can be shipped without any microcode in the BIOS. Which means, the microcode patch revision is 0. Handle that gracefully. Fixes: 94838d230a6c ("x86/microcode/AMD: Use the family,model,stepping encoded in the patch ID") Reported-by: Vítek Vávra Signed-off-by: Borislav Petkov (AMD) Cc: Signed-off-by: Greg Kroah-Hartman --- arch/x86/kernel/cpu/microcode/amd.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c index 7444fe0e3d08..2cb30d9c5b4a 100644 --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -161,8 +161,28 @@ static int cmp_id(const void *key, const void *elem) return 1; } +static u32 cpuid_to_ucode_rev(unsigned int val) +{ + union zen_patch_rev p = {}; + union cpuid_1_eax c; + + c.full = val; + + p.stepping = c.stepping; + p.model = c.model; + p.ext_model = c.ext_model; + p.ext_fam = c.ext_fam; + + return p.ucode_rev; +} + static bool need_sha_check(u32 cur_rev) { + if (!cur_rev) { + cur_rev = cpuid_to_ucode_rev(bsp_cpuid_1_eax); + pr_info_once("No current revision, generating the lowest one: 0x%x\n", cur_rev); + } + switch (cur_rev >> 8) { case 0x80012: return cur_rev <= 0x800126f; break; case 0x80082: return cur_rev <= 0x800820f; break; @@ -744,8 +764,6 @@ static struct ucode_patch *cache_find_patch(struct ucode_cpu_info *uci, u16 equi n.equiv_cpu = equiv_cpu; n.patch_id = uci->cpu_sig.rev; - WARN_ON_ONCE(!n.patch_id); - list_for_each_entry(p, µcode_cache, plist) if (patch_cpus_equivalent(p, &n, false)) return p; From 5f3c0839b173f7f33415eb098331879e547d1d2d Mon Sep 17 00:00:00 2001 From: Qasim Ijaz Date: Sun, 10 Aug 2025 19:10:41 +0100 Subject: [PATCH 58/76] HID: asus: fix UAF via HID_CLAIMED_INPUT validation commit d3af6ca9a8c34bbd8cff32b469b84c9021c9e7e4 upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty #36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8be12c ("HID: asus: Add i2c touchpad support") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz Link: https://patch.msgid.link/20250810181041.44874-1-qasdev00@gmail.com Signed-off-by: Benjamin Tissoires Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-asus.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-asus.c b/drivers/hid/hid-asus.c index 84625e817ce9..896f73aa4d2c 100644 --- a/drivers/hid/hid-asus.c +++ b/drivers/hid/hid-asus.c @@ -1108,7 +1108,13 @@ static int asus_probe(struct hid_device *hdev, const struct hid_device_id *id) return ret; } - if (!drvdata->input) { + /* + * Check that input registration succeeded. Checking that + * HID_CLAIMED_INPUT is set prevents a UAF when all input devices + * were freed during registration due to no usages being mapped, + * leaving drvdata->input pointing to freed memory. + */ + if (!drvdata->input || !(hdev->claimed & HID_CLAIMED_INPUT)) { hid_err(hdev, "Asus input not registered\n"); ret = -ENOMEM; goto err_stop_hw; From d4e6e2680807671e1c73cd6a986b33659ce92f2b Mon Sep 17 00:00:00 2001 From: Qasim Ijaz Date: Sun, 10 Aug 2025 19:09:24 +0100 Subject: [PATCH 59/76] HID: multitouch: fix slab out-of-bounds access in mt_report_fixup() commit 0379eb8691b9c4477da0277ae0832036ca4410b4 upstream. A malicious HID device can trigger a slab out-of-bounds during mt_report_fixup() by passing in report descriptor smaller than 607 bytes. mt_report_fixup() attempts to patch byte offset 607 of the descriptor with 0x25 by first checking if byte offset 607 is 0x15 however it lacks bounds checks to verify if the descriptor is big enough before conducting this check. Fix this bug by ensuring the descriptor size is at least 608 bytes before accessing it. Below is the KASAN splat after the out of bounds access happens: [ 13.671954] ================================================================== [ 13.672667] BUG: KASAN: slab-out-of-bounds in mt_report_fixup+0x103/0x110 [ 13.673297] Read of size 1 at addr ffff888103df39df by task kworker/0:1/10 [ 13.673297] [ 13.673297] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.15.0-00005-gec5d573d83f4-dirty #3 [ 13.673297] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/04 [ 13.673297] Call Trace: [ 13.673297] [ 13.673297] dump_stack_lvl+0x5f/0x80 [ 13.673297] print_report+0xd1/0x660 [ 13.673297] kasan_report+0xe5/0x120 [ 13.673297] __asan_report_load1_noabort+0x18/0x20 [ 13.673297] mt_report_fixup+0x103/0x110 [ 13.673297] hid_open_report+0x1ef/0x810 [ 13.673297] mt_probe+0x422/0x960 [ 13.673297] hid_device_probe+0x2e2/0x6f0 [ 13.673297] really_probe+0x1c6/0x6b0 [ 13.673297] __driver_probe_device+0x24f/0x310 [ 13.673297] driver_probe_device+0x4e/0x220 [ 13.673297] __device_attach_driver+0x169/0x320 [ 13.673297] bus_for_each_drv+0x11d/0x1b0 [ 13.673297] __device_attach+0x1b8/0x3e0 [ 13.673297] device_initial_probe+0x12/0x20 [ 13.673297] bus_probe_device+0x13d/0x180 [ 13.673297] device_add+0xe3a/0x1670 [ 13.673297] hid_add_device+0x31d/0xa40 [...] Fixes: c8000deb6836 ("HID: multitouch: Add support for GT7868Q") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz Reviewed-by: Jiri Slaby Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-multitouch.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index becd4c1ccf93..a85581cd511f 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -1448,6 +1448,14 @@ static __u8 *mt_report_fixup(struct hid_device *hdev, __u8 *rdesc, if (hdev->vendor == I2C_VENDOR_ID_GOODIX && (hdev->product == I2C_DEVICE_ID_GOODIX_01E8 || hdev->product == I2C_DEVICE_ID_GOODIX_01E9)) { + if (*size < 608) { + dev_info( + &hdev->dev, + "GT7868Q fixup: report descriptor is only %u bytes, skipping\n", + *size); + return rdesc; + } + if (rdesc[607] == 0x15) { rdesc[607] = 0x25; dev_info( From 8783b2a0b740978a765611f306dbc3e73b46d138 Mon Sep 17 00:00:00 2001 From: Antheas Kapenekakis Date: Sun, 3 Aug 2025 18:02:53 +0200 Subject: [PATCH 60/76] HID: quirks: add support for Legion Go dual dinput modes commit 1f3214aae9f49faf495f3836216afbc6c5400b2e upstream. The Legion Go features detachable controllers which support a dual dinput mode. In this mode, the controllers appear under a single HID device with two applications. Currently, both controllers appear under the same event device, causing their controls to be mixed up. This patch separates the two so that they can be used independently. In addition, the latest firmware update for the Legion Go swaps the IDs to the ones used by the Legion Go 2, so add those IDs as well. [jkosina@suse.com: improved shortlog] Signed-off-by: Antheas Kapenekakis Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-ids.h | 2 ++ drivers/hid/hid-quirks.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 0d1d7162814f..e8b51e80c4c5 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -818,6 +818,8 @@ #define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_6019 0x6019 #define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_602E 0x602e #define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_6093 0x6093 +#define USB_DEVICE_ID_LENOVO_LEGION_GO_DUAL_DINPUT 0x6184 +#define USB_DEVICE_ID_LENOVO_LEGION_GO2_DUAL_DINPUT 0x61ed #define USB_VENDOR_ID_LETSKETCH 0x6161 #define USB_DEVICE_ID_WP9620N 0x4d15 diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 80372342c176..64f9728018b8 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -124,6 +124,8 @@ static const struct hid_device_id hid_quirks[] = { { HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_MOUSEPEN_I608X_V2), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_PENSKETCH_T609A), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_LABTEC, USB_DEVICE_ID_LABTEC_ODDOR_HANDBRAKE), HID_QUIRK_ALWAYS_POLL }, + { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_LEGION_GO_DUAL_DINPUT), HID_QUIRK_MULTI_INPUT }, + { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_LEGION_GO2_DUAL_DINPUT), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_OPTICAL_USB_MOUSE_600E), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_608D), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_6019), HID_QUIRK_ALWAYS_POLL }, From 88d3c2790c335288bb92d1ccc3aa2adac8d51266 Mon Sep 17 00:00:00 2001 From: Matt Coffin Date: Wed, 20 Aug 2025 01:49:51 -0600 Subject: [PATCH 61/76] HID: logitech: Add ids for G PRO 2 LIGHTSPEED commit ab1bb82f3db20e23eace06db52031b1164a110c2 upstream. Adds support for the G PRO 2 LIGHTSPEED Wireless via it's nano receiver or directly. This nano receiver appears to work identically to the 1_1 receiver for the case I've verified, which is the battery status through lg-hidpp. The same appears to be the case wired, sharing much with the Pro X Superlight 2; differences seemed to lie in userland configuration rather than in interfaces used by hid_logitech_hidpp on the kernel side. I verified the sysfs interface for battery charge/discharge status, and capacity read to be working on my 910-007290 device (white). Signed-off-by: Matt Coffin Reviewed-by: Bastien Nocera Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-logitech-dj.c | 4 ++++ drivers/hid/hid-logitech-hidpp.c | 2 ++ 3 files changed, 7 insertions(+) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index e8b51e80c4c5..3f74633070b6 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -893,6 +893,7 @@ #define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_2 0xc534 #define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1 0xc539 #define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_1 0xc53f +#define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_2 0xc543 #define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_POWERPLAY 0xc53a #define USB_DEVICE_ID_LOGITECH_BOLT_RECEIVER 0xc548 #define USB_DEVICE_ID_SPACETRAVELLER 0xc623 diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c index 37958edec55f..e2d5b3f69914 100644 --- a/drivers/hid/hid-logitech-dj.c +++ b/drivers/hid/hid-logitech-dj.c @@ -1983,6 +1983,10 @@ static const struct hid_device_id logi_dj_receivers[] = { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_1), .driver_data = recvr_type_gaming_hidpp}, + { /* Logitech lightspeed receiver (0xc543) */ + HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, + USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_2), + .driver_data = recvr_type_gaming_hidpp}, { /* Logitech 27 MHz HID++ 1.0 receiver (0xc513) */ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_MX3000_RECEIVER), diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index 4519ee377aa7..3a2c1e48aba2 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -4652,6 +4652,8 @@ static const struct hid_device_id hidpp_devices[] = { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC094) }, { /* Logitech G Pro X Superlight 2 Gaming Mouse over USB */ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC09b) }, + { /* Logitech G PRO 2 LIGHTSPEED Wireless Mouse over USB */ + HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xc09a) }, { /* G935 Gaming Headset */ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0x0a87), From b9166ea27d0afaf7852dbc1bc4aa2ff92dcc7ea2 Mon Sep 17 00:00:00 2001 From: Ping Cheng Date: Sun, 10 Aug 2025 22:40:30 -0700 Subject: [PATCH 62/76] HID: wacom: Add a new Art Pen 2 commit 9fc51941d9e7793da969b2c66e6f8213c5b1237f upstream. Signed-off-by: Ping Cheng Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman --- drivers/hid/wacom_wac.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c index dd44373ba930..42bc8f05e263 100644 --- a/drivers/hid/wacom_wac.c +++ b/drivers/hid/wacom_wac.c @@ -684,6 +684,7 @@ static bool wacom_is_art_pen(int tool_id) case 0x885: /* Intuos3 Marker Pen */ case 0x804: /* Intuos4/5 13HD/24HD Marker Pen */ case 0x10804: /* Intuos4/5 13HD/24HD Art Pen */ + case 0x204: /* Art Pen 2 */ is_art_pen = true; break; } From e422370e6ab28478872b914cee5d49a9bdfae0c6 Mon Sep 17 00:00:00 2001 From: Minjong Kim Date: Wed, 13 Aug 2025 19:30:22 +0900 Subject: [PATCH 63/76] HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version() commit 185c926283da67a72df20a63a5046b3b4631b7d9 upstream. in ntrig_report_version(), hdev parameter passed from hid_probe(). sending descriptor to /dev/uhid can make hdev->dev.parent->parent to null if hdev->dev.parent->parent is null, usb_dev has invalid address(0xffffffffffffff58) that hid_to_usb_dev(hdev) returned when usb_rcvctrlpipe() use usb_dev,it trigger page fault error for address(0xffffffffffffff58) add null check logic to ntrig_report_version() before calling hid_to_usb_dev() Signed-off-by: Minjong Kim Link: https://patch.msgid.link/20250813-hid-ntrig-page-fault-fix-v2-1-f98581f35106@samsung.com Signed-off-by: Benjamin Tissoires Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-ntrig.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/hid/hid-ntrig.c b/drivers/hid/hid-ntrig.c index b5d26f03fe6b..a1128c5315ff 100644 --- a/drivers/hid/hid-ntrig.c +++ b/drivers/hid/hid-ntrig.c @@ -144,6 +144,9 @@ static void ntrig_report_version(struct hid_device *hdev) struct usb_device *usb_dev = hid_to_usb_dev(hdev); unsigned char *data = kmalloc(8, GFP_KERNEL); + if (!hid_is_usb(hdev)) + return; + if (!data) goto err_free; From 8911ec881cd79c9a6a827c005ff987e887550153 Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Mon, 25 Aug 2025 13:40:22 -0400 Subject: [PATCH 64/76] Revert "drm/amdgpu: fix incorrect vm flags to map bo" commit ac4ed2da4c1305a1a002415058aa7deaf49ffe3e upstream. This reverts commit b08425fa77ad2f305fe57a33dceb456be03b653f. Revert this to align with 6.17 because the fixes tag was wrong on this commit. Signed-off-by: Alex Deucher (cherry picked from commit be33e8a239aac204d7e9e673c4220ef244eb1ba3) Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c index 384834fbd590..720011019741 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c @@ -89,8 +89,8 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, } r = amdgpu_vm_bo_map(adev, *bo_va, csa_addr, 0, size, - AMDGPU_VM_PAGE_READABLE | AMDGPU_VM_PAGE_WRITEABLE | - AMDGPU_VM_PAGE_EXECUTABLE); + AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE | + AMDGPU_PTE_EXECUTABLE); if (r) { DRM_ERROR("failed to do bo_map on static CSA, err=%d\n", r); From 4f43a6d376dad78646aaffb1c2a3c819c9d6af43 Mon Sep 17 00:00:00 2001 From: Shanker Donthineni Date: Mon, 11 Aug 2025 13:17:59 -0500 Subject: [PATCH 65/76] dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted commit 89a2d212bdb4bc29bed8e7077abe054b801137ea upstream. When CONFIG_DMA_DIRECT_REMAP is enabled, atomic pool pages are remapped via dma_common_contiguous_remap() using the supplied pgprot. Currently, the mapping uses pgprot_dmacoherent(PAGE_KERNEL), which leaves the memory encrypted on systems with memory encryption enabled (e.g., ARM CCA Realms). This can cause the DMA layer to fail or crash when accessing the memory, as the underlying physical pages are not configured as expected. Fix this by requesting a decrypted mapping in the vmap() call: pgprot_decrypted(pgprot_dmacoherent(PAGE_KERNEL)) This ensures that atomic pool memory is consistently mapped unencrypted. Cc: stable@vger.kernel.org Signed-off-by: Shanker Donthineni Reviewed-by: Catalin Marinas Signed-off-by: Marek Szyprowski Link: https://lore.kernel.org/r/20250811181759.998805-1-sdonthineni@nvidia.com Signed-off-by: Greg Kroah-Hartman --- kernel/dma/pool.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c index b481c48a31a6..6b0be9598a97 100644 --- a/kernel/dma/pool.c +++ b/kernel/dma/pool.c @@ -102,8 +102,8 @@ static int atomic_pool_expand(struct gen_pool *pool, size_t pool_size, #ifdef CONFIG_DMA_DIRECT_REMAP addr = dma_common_contiguous_remap(page, pool_size, - pgprot_dmacoherent(PAGE_KERNEL), - __builtin_return_address(0)); + pgprot_decrypted(pgprot_dmacoherent(PAGE_KERNEL)), + __builtin_return_address(0)); if (!addr) goto free_page; #else From 4191ea1f0bb3e27d65c5dcde7bd00e709ec67141 Mon Sep 17 00:00:00 2001 From: Shuhao Fu Date: Thu, 28 Aug 2025 02:24:19 +0800 Subject: [PATCH 66/76] fs/smb: Fix inconsistent refcnt update commit ab529e6ca1f67bcf31f3ea80c72bffde2e9e053e upstream. A possible inconsistent update of refcount was identified in `smb2_compound_op`. Such inconsistent update could lead to possible resource leaks. Why it is a possible bug: 1. In the comment section of the function, it clearly states that the reference to `cfile` should be dropped after calling this function. 2. Every control flow path would check and drop the reference to `cfile`, except the patched one. 3. Existing callers would not handle refcount update of `cfile` if -ENOMEM is returned. To fix the bug, an extra goto label "out" is added, to make sure that the cleanup logic would always be respected. As the problem is caused by the allocation failure of `vars`, the cleanup logic between label "finished" and "out" can be safely ignored. According to the definition of function `is_replayable_error`, the error code of "-ENOMEM" is not recoverable. Therefore, the replay logic also gets ignored. Signed-off-by: Shuhao Fu Acked-by: Paulo Alcantara (Red Hat) Cc: stable@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman --- fs/smb/client/smb2inode.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/smb/client/smb2inode.c b/fs/smb/client/smb2inode.c index e1078a1decdf..0cc80f472432 100644 --- a/fs/smb/client/smb2inode.c +++ b/fs/smb/client/smb2inode.c @@ -206,8 +206,10 @@ replay_again: server = cifs_pick_channel(ses); vars = kzalloc(sizeof(*vars), GFP_ATOMIC); - if (vars == NULL) - return -ENOMEM; + if (vars == NULL) { + rc = -ENOMEM; + goto out; + } rqst = &vars->rqst[0]; rsp_iov = &vars->rsp_iov[0]; @@ -828,6 +830,7 @@ finished: smb2_should_replay(tcon, &retries, &cur_sleep)) goto replay_again; +out: if (cfile) cifsFileInfo_put(cfile); From 3c332109173396b03567cf968db96f19eff135e9 Mon Sep 17 00:00:00 2001 From: Fabio Porcedda Date: Fri, 22 Aug 2025 11:13:24 +0200 Subject: [PATCH 67/76] net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions commit e81a7f65288c7e2cfb7e7890f648e099fd885ab3 upstream. Add the following Telit Cinterion LE910C4-WWX new compositions: 0x1034: tty (AT) + tty (AT) + rmnet T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 8 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1034 Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x1037: tty (diag) + tty (Telit custom) + tty (AT) + tty (AT) + rmnet T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 15 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1037 Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x1038: tty (Telit custom) + tty (AT) + tty (AT) + rmnet T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 9 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1038 Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Cc: stable@vger.kernel.org Signed-off-by: Fabio Porcedda Link: https://patch.msgid.link/20250822091324.39558-1-Fabio.Porcedda@telit.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman --- drivers/net/usb/qmi_wwan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 3976bc4295dd..eba755b584a4 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1363,6 +1363,9 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x2357, 0x0201, 4)}, /* TP-LINK HSUPA Modem MA180 */ {QMI_FIXED_INTF(0x2357, 0x9000, 4)}, /* TP-LINK MA260 */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1031, 3)}, /* Telit LE910C1-EUX */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x1034, 2)}, /* Telit LE910C4-WWX */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x1037, 4)}, /* Telit LE910C4-WWX */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x1038, 3)}, /* Telit LE910C4-WWX */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x103a, 0)}, /* Telit LE910C4-WWX */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1040, 2)}, /* Telit LE922A */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1050, 2)}, /* Telit FN980 */ From 1424f6132fc8a41c8d21bef1696109c46979be5d Mon Sep 17 00:00:00 2001 From: Steve French Date: Sat, 23 Aug 2025 21:15:59 -0500 Subject: [PATCH 68/76] smb3 client: fix return code mapping of remap_file_range commit 0e08fa789d39aa01923e3ba144bd808291895c3c upstream. We were returning -EOPNOTSUPP for various remap_file_range cases but for some of these the copy_file_range_syscall() requires -EINVAL to be returned (e.g. where source and target file ranges overlap when source and target are the same file). This fixes xfstest generic/157 which was expecting EINVAL for that (and also e.g. for when the src offset is beyond end of file). Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman --- fs/smb/client/cifsfs.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c index a1ab95f382d5..2744d5580d19 100644 --- a/fs/smb/client/cifsfs.c +++ b/fs/smb/client/cifsfs.c @@ -1371,6 +1371,20 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off, netfs_resize_file(&target_cifsi->netfs, new_size); fscache_resize_cookie(cifs_inode_cookie(target_inode), new_size); + } else if (rc == -EOPNOTSUPP) { + /* + * copy_file_range syscall man page indicates EINVAL + * is returned e.g when "fd_in and fd_out refer to the + * same file and the source and target ranges overlap." + * Test generic/157 was what showed these cases where + * we need to remap EOPNOTSUPP to EINVAL + */ + if (off >= src_inode->i_size) { + rc = -EINVAL; + } else if (src_inode == target_inode) { + if (off + len > destoff) + rc = -EINVAL; + } } } From 4e86e5ba325e9938f824a185801ff945727f6de8 Mon Sep 17 00:00:00 2001 From: James Jones Date: Mon, 11 Aug 2025 15:00:16 -0700 Subject: [PATCH 69/76] drm/nouveau/disp: Always accept linear modifier commit e2fe0c54fb7401e6ecd3c10348519ab9e23bd639 upstream. On some chipsets, which block-linear modifiers are supported is format-specific. However, linear modifiers are always be supported. The prior modifier filtering logic was not accounting for the linear case. Cc: stable@vger.kernel.org Fixes: c586f30bf74c ("drm/nouveau/kms: Add format mod prop to base/ovly/nvdisp") Signed-off-by: James Jones Link: https://lore.kernel.org/r/20250811220017.1337-3-jajones@nvidia.com Signed-off-by: Danilo Krummrich Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c index 7a2cceaee6e9..1199dfc1194c 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c @@ -663,6 +663,10 @@ static bool nv50_plane_format_mod_supported(struct drm_plane *plane, struct nouveau_drm *drm = nouveau_drm(plane->dev); uint8_t i; + /* All chipsets can display all formats in linear layout */ + if (modifier == DRM_FORMAT_MOD_LINEAR) + return true; + if (drm->client.device.info.chipset < 0xc0) { const struct drm_format_info *info = drm_format_info(format); const uint8_t kind = (modifier >> 12) & 0xff; From ea687c003663d1698131b3c75e8a1286becd52b4 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 27 Aug 2025 17:21:49 +0000 Subject: [PATCH 70/76] net: rose: fix a typo in rose_clear_routes() commit 1cc8a5b534e5f9b5e129e54ee2e63c9f5da4f39a upstream. syzbot crashed in rose_clear_routes(), after a recent patch typo. KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 0 UID: 0 PID: 10591 Comm: syz.3.1856 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:rose_clear_routes net/rose/rose_route.c:565 [inline] RIP: 0010:rose_rt_ioctl+0x162/0x1250 net/rose/rose_route.c:760 rose_ioctl+0x3ce/0x8b0 net/rose/af_rose.c:1381 sock_do_ioctl+0xd9/0x300 net/socket.c:1238 sock_ioctl+0x576/0x790 net/socket.c:1359 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:598 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: da9c9c877597 ("net: rose: include node references in rose_neigh refcount") Reported-by: syzbot+2eb8d1719f7cfcfa6840@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68af3e29.a70a0220.3cafd4.002e.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Cc: Takamitsu Iwai Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20250827172149.5359-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman --- net/rose/rose_route.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c index 6acbb795c506..28746ae5a258 100644 --- a/net/rose/rose_route.c +++ b/net/rose/rose_route.c @@ -562,7 +562,7 @@ static int rose_clear_routes(void) rose_node = rose_node->next; if (!t->loopback) { - for (i = 0; i < rose_node->count; i++) + for (i = 0; i < t->count; i++) rose_neigh_put(t->neighbour[i]); rose_remove_node(t); } From 1b3ccc39480722fc0c11537858a47e0c8cc9ad3c Mon Sep 17 00:00:00 2001 From: Chris Mi Date: Wed, 15 Jan 2025 13:39:06 +0200 Subject: [PATCH 71/76] net/mlx5: SF, Fix add port error handling commit 2011a2a18ef00b5b8e4b753acbe6451a8c5f2260 upstream. If failed to add SF, error handling doesn't delete the SF from the SF table. But the hw resources are deleted. So when unload driver, hw resources will be deleted again. Firmware will report syndrome 0x68def3 which means "SF is not allocated can not deallocate". Fix it by delete SF from SF table if failed to add SF. Fixes: 2597ee190b4e ("net/mlx5: Call mlx5_sf_id_erase() once in mlx5_sf_dealloc()") Signed-off-by: Chris Mi Reviewed-by: Shay Drori Reviewed-by: Jacob Keller Signed-off-by: Tariq Toukan Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman --- drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c index c9089f2ec5f2..d5b2b6cfc8d2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -280,6 +280,7 @@ static int mlx5_sf_add(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, return 0; esw_err: + mlx5_sf_function_id_erase(table, sf); mlx5_sf_free(table, sf); return err; } From 5a809b34250cef465df5d4371b75e27e93fc8eb4 Mon Sep 17 00:00:00 2001 From: Hamish Martin Date: Wed, 25 Oct 2023 16:55:13 +1300 Subject: [PATCH 72/76] HID: mcp2221: Don't set bus speed on every transfer commit 02a46753601a24e1673d9c28173121055e8e6cc9 upstream. Since the initial commit of this driver the I2C bus speed has been reconfigured for every single transfer. This is despite the fact that we never change the speed and it is never "lost" by the chip. Upon investigation we find that what was really happening was that the setting of the bus speed had the side effect of cancelling a previous failed command if there was one, thereby freeing the bus. This is the part that was actually required to keep the bus operational in the face of failed commands. Instead of always setting the speed, we now correctly cancel any failed commands as they are detected. This means we can just set the bus speed at probe time and remove the previous speed sets on each transfer. This has the effect of improving performance and reducing the number of commands required to complete transfers. Signed-off-by: Hamish Martin Signed-off-by: Jiri Kosina Fixes: 67a95c21463d ("HID: mcp2221: add usb to i2c-smbus host bridge") [romain.sioen@microchip.com: backport to stable, up to 6.8. Add "Fixes" tag] Signed-off-by: Romain Sioen Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-mcp2221.c | 41 ++++++++++++++++++++++++++------------- 1 file changed, 27 insertions(+), 14 deletions(-) diff --git a/drivers/hid/hid-mcp2221.c b/drivers/hid/hid-mcp2221.c index c5bfca8ac5e6..ce4e8e6b3d86 100644 --- a/drivers/hid/hid-mcp2221.c +++ b/drivers/hid/hid-mcp2221.c @@ -187,6 +187,25 @@ static int mcp_cancel_last_cmd(struct mcp2221 *mcp) return mcp_send_data_req_status(mcp, mcp->txbuf, 8); } +/* Check if the last command succeeded or failed and return the result. + * If the command did fail, cancel that command which will free the i2c bus. + */ +static int mcp_chk_last_cmd_status_free_bus(struct mcp2221 *mcp) +{ + int ret; + + ret = mcp_chk_last_cmd_status(mcp); + if (ret) { + /* The last command was a failure. + * Send a cancel which will also free the bus. + */ + usleep_range(980, 1000); + mcp_cancel_last_cmd(mcp); + } + + return ret; +} + static int mcp_set_i2c_speed(struct mcp2221 *mcp) { int ret; @@ -241,7 +260,7 @@ static int mcp_i2c_write(struct mcp2221 *mcp, usleep_range(980, 1000); if (last_status) { - ret = mcp_chk_last_cmd_status(mcp); + ret = mcp_chk_last_cmd_status_free_bus(mcp); if (ret) return ret; } @@ -308,7 +327,7 @@ static int mcp_i2c_smbus_read(struct mcp2221 *mcp, if (ret) return ret; - ret = mcp_chk_last_cmd_status(mcp); + ret = mcp_chk_last_cmd_status_free_bus(mcp); if (ret) return ret; @@ -328,11 +347,6 @@ static int mcp_i2c_xfer(struct i2c_adapter *adapter, mutex_lock(&mcp->lock); - /* Setting speed before every transaction is required for mcp2221 */ - ret = mcp_set_i2c_speed(mcp); - if (ret) - goto exit; - if (num == 1) { if (msgs->flags & I2C_M_RD) { ret = mcp_i2c_smbus_read(mcp, msgs, MCP2221_I2C_RD_DATA, @@ -417,9 +431,7 @@ static int mcp_smbus_write(struct mcp2221 *mcp, u16 addr, if (last_status) { usleep_range(980, 1000); - ret = mcp_chk_last_cmd_status(mcp); - if (ret) - return ret; + ret = mcp_chk_last_cmd_status_free_bus(mcp); } return ret; @@ -437,10 +449,6 @@ static int mcp_smbus_xfer(struct i2c_adapter *adapter, u16 addr, mutex_lock(&mcp->lock); - ret = mcp_set_i2c_speed(mcp); - if (ret) - goto exit; - switch (size) { case I2C_SMBUS_QUICK: @@ -1152,6 +1160,11 @@ static int mcp2221_probe(struct hid_device *hdev, if (i2c_clk_freq < 50) i2c_clk_freq = 50; mcp->cur_i2c_clk_div = (12000000 / (i2c_clk_freq * 1000)) - 3; + ret = mcp_set_i2c_speed(mcp); + if (ret) { + hid_err(hdev, "can't set i2c speed: %d\n", ret); + return ret; + } mcp->adapter.owner = THIS_MODULE; mcp->adapter.class = I2C_CLASS_HWMON; From 6c9552de7f7e632abea46f0a2662074a6db966ba Mon Sep 17 00:00:00 2001 From: Hamish Martin Date: Wed, 25 Oct 2023 16:55:14 +1300 Subject: [PATCH 73/76] HID: mcp2221: Handle reads greater than 60 bytes commit 2682468671aa0b4d52ae09779b48212a80d71b91 upstream. When a user requests more than 60 bytes of data the MCP2221 must chunk the data in chunks up to 60 bytes long (see command/response code 0x40 in the datasheet). In order to signal that the device has more data the (undocumented) byte at byte index 2 of the Get I2C Data response uses the value 0x54. This contrasts with the case for the final data chunk where the value returned is 0x55 (MCP2221_I2C_READ_COMPL). The fact that 0x55 was not returned in the response was interpreted by the driver as a failure meaning that all reads of more than 60 bytes would fail. Add support for reads that are split over multiple chunks by looking for the response code indicating that more data is expected and continuing the read as the code intended. Some timing delays are required to ensure the chip has time to refill its FIFO as data is read in from the I2C bus. This timing has been tested in my system when configured for bus speeds of 50KHz, 100KHz, and 400KHz and operates well. Signed-off-by: Hamish Martin Signed-off-by: Jiri Kosina Fixes: 67a95c21463d0 ("HID: mcp2221: add usb to i2c-smbus host bridge") [romain.sioen@microchip.com: backport to stable, up to 6.8. Add "Fixes" tag] Signed-off-by: Romain Sioen Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-mcp2221.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/drivers/hid/hid-mcp2221.c b/drivers/hid/hid-mcp2221.c index ce4e8e6b3d86..a985301a4135 100644 --- a/drivers/hid/hid-mcp2221.c +++ b/drivers/hid/hid-mcp2221.c @@ -49,6 +49,7 @@ enum { MCP2221_I2C_MASK_ADDR_NACK = 0x40, MCP2221_I2C_WRADDRL_SEND = 0x21, MCP2221_I2C_ADDR_NACK = 0x25, + MCP2221_I2C_READ_PARTIAL = 0x54, MCP2221_I2C_READ_COMPL = 0x55, MCP2221_ALT_F_NOT_GPIOV = 0xEE, MCP2221_ALT_F_NOT_GPIOD = 0xEF, @@ -297,6 +298,7 @@ static int mcp_i2c_smbus_read(struct mcp2221 *mcp, { int ret; u16 total_len; + int retries = 0; mcp->txbuf[0] = type; if (msg) { @@ -320,20 +322,31 @@ static int mcp_i2c_smbus_read(struct mcp2221 *mcp, mcp->rxbuf_idx = 0; do { + /* Wait for the data to be read by the device */ + usleep_range(980, 1000); + memset(mcp->txbuf, 0, 4); mcp->txbuf[0] = MCP2221_I2C_GET_DATA; ret = mcp_send_data_req_status(mcp, mcp->txbuf, 1); - if (ret) - return ret; - - ret = mcp_chk_last_cmd_status_free_bus(mcp); - if (ret) - return ret; - - usleep_range(980, 1000); + if (ret) { + if (retries < 5) { + /* The data wasn't ready to read. + * Wait a bit longer and try again. + */ + usleep_range(90, 100); + retries++; + } else { + return ret; + } + } else { + retries = 0; + } } while (mcp->rxbuf_idx < total_len); + usleep_range(980, 1000); + ret = mcp_chk_last_cmd_status_free_bus(mcp); + return ret; } @@ -799,7 +812,8 @@ static int mcp2221_raw_event(struct hid_device *hdev, mcp->status = -EIO; break; } - if (data[2] == MCP2221_I2C_READ_COMPL) { + if (data[2] == MCP2221_I2C_READ_COMPL || + data[2] == MCP2221_I2C_READ_PARTIAL) { buf = mcp->rxbuf; memcpy(&buf[mcp->rxbuf_idx], &data[4], data[3]); mcp->rxbuf_idx = mcp->rxbuf_idx + data[3]; From 0699faf7041357dc4eb132d097295ce06cee77ac Mon Sep 17 00:00:00 2001 From: Imre Deak Date: Thu, 28 Aug 2025 20:49:29 +0300 Subject: [PATCH 74/76] Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS" This reverts commit 65e46aeaf84aa88539bcff6b8077e05fbd0700da which is commit a40c5d727b8111b5db424a1e43e14a1dcce1e77f upstream. The upstream commit a40c5d727b8111b5db424a1e43e14a1dcce1e77f ("drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS") the reverted commit backported causes a regression, on one eDP panel at least resulting in display flickering, described in detail at the Link: below. The issue fixed by the upstream commit will need a different solution, revert the backport for now. Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: Sasha Levin Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14558 Signed-off-by: Imre Deak Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/display/drm_dp_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/display/drm_dp_helper.c b/drivers/gpu/drm/display/drm_dp_helper.c index 772d8e662278..851f0baf9460 100644 --- a/drivers/gpu/drm/display/drm_dp_helper.c +++ b/drivers/gpu/drm/display/drm_dp_helper.c @@ -663,7 +663,7 @@ ssize_t drm_dp_dpcd_read(struct drm_dp_aux *aux, unsigned int offset, * monitor doesn't power down exactly after the throw away read. */ if (!aux->is_remote) { - ret = drm_dp_dpcd_probe(aux, DP_LANE0_1_STATUS); + ret = drm_dp_dpcd_probe(aux, DP_DPCD_REV); if (ret < 0) return ret; } From d3cc7476b89fb45b7e00874f4f56f6b928467c60 Mon Sep 17 00:00:00 2001 From: Eric Sandeen Date: Fri, 22 Aug 2025 12:55:56 -0500 Subject: [PATCH 75/76] xfs: do not propagate ENODATA disk errors into xattr code commit ae668cd567a6a7622bc813ee0bb61c42bed61ba7 upstream. ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code; namely, that the requested attribute name could not be found. However, a medium error from disk may also return ENODATA. At best, this medium error may escape to userspace as "attribute not found" when in fact it's an IO (disk) error. At worst, we may oops in xfs_attr_leaf_get() when we do: error = xfs_attr_leaf_hasname(args, &bp); if (error == -ENOATTR) { xfs_trans_brelse(args->trans, bp); return error; } because an ENODATA/ENOATTR error from disk leaves us with a null bp, and the xfs_trans_brelse will then null-deref it. As discussed on the list, we really need to modify the lower level IO functions to trap all disk errors and ensure that we don't let unique errors like this leak up into higher xfs functions - many like this should be remapped to EIO. However, this patch directly addresses a reported bug in the xattr code, and should be safe to backport to stable kernels. A larger-scope patch to handle more unique errors at lower levels can follow later. (Note, prior to 07120f1abdff we did not oops, but we did return the wrong error code to userspace.) Signed-off-by: Eric Sandeen Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines") Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Darrick J. Wong Signed-off-by: Carlos Maiolino [ Adjust context: removed metadata health tracking calls ] Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman --- fs/xfs/libxfs/xfs_attr_remote.c | 7 +++++++ fs/xfs/libxfs/xfs_da_btree.c | 6 ++++++ 2 files changed, 13 insertions(+) diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c index 54de405cbab5..4d369876487b 100644 --- a/fs/xfs/libxfs/xfs_attr_remote.c +++ b/fs/xfs/libxfs/xfs_attr_remote.c @@ -418,6 +418,13 @@ xfs_attr_rmtval_get( dblkcnt = XFS_FSB_TO_BB(mp, map[i].br_blockcount); error = xfs_buf_read(mp->m_ddev_targp, dblkno, dblkcnt, 0, &bp, &xfs_attr3_rmt_buf_ops); + /* + * ENODATA from disk implies a disk medium failure; + * ENODATA for xattrs means attribute not found, so + * disambiguate that here. + */ + if (error == -ENODATA) + error = -EIO; if (error) return error; diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index 28bbfc31039c..1efd45076ee2 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2649,6 +2649,12 @@ xfs_da_read_buf( error = xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, 0, &bp, ops); + /* + * ENODATA from disk implies a disk medium failure; ENODATA for + * xattrs means attribute not found, so disambiguate that here. + */ + if (error == -ENODATA && whichfork == XFS_ATTR_FORK) + error = -EIO; if (error) goto out_free; From 355bd0b51d2fc0a2c9d0789c77269e4476637428 Mon Sep 17 00:00:00 2001 From: Greg Kroah-Hartman Date: Thu, 4 Sep 2025 15:30:29 +0200 Subject: [PATCH 76/76] Linux 6.6.104 Link: https://lore.kernel.org/r/20250902131935.107897242@linuxfoundation.org Tested-by: Brett A C Sheffield Tested-by: Jon Hunter Tested-by: Florian Fainelli Tested-by: Linux Kernel Functional Testing Tested-by: Ron Economos Tested-by: Mark Brown Tested-by: Peter Schneider Signed-off-by: Greg Kroah-Hartman --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 9b288ccccd64..ae57f816375e 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 6 PATCHLEVEL = 6 -SUBLEVEL = 103 +SUBLEVEL = 104 EXTRAVERSION = NAME = Pinguïn Aangedreven