linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 10:31:46 +09:00

Author	SHA1	Message	Date
Vasily Khoruzhick	0baafd476c	ACPICA: Implement ACPI_WARNING_ONCE and ACPI_ERROR_ONCE [ Upstream commit 632b746b108e3c62e0795072d00ed597371c738a ] ACPICA commit 2ad4e6e7c4118f4cdfcad321c930b836cec77406 In some cases it is not practical nor useful to nag user about some firmware errors that they cannot fix. Add a macro that will print a warning or error only once to be used in these cases. Link: https://github.com/acpica/acpica/commit/2ad4e6e7 Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Stable-dep-of: c82c507126c9 ("ACPICA: executer/exsystem: Don't nag user about every Stall() violating the spec") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:38 +02:00
Avraham Stern	47b1ce7203	wifi: iwlwifi: mvm: increase the time between ranging measurements [ Upstream commit 3a7ee94559dfd640604d0265739e86dec73b64e8 ] The algo running in fw may take a little longer than 5 milliseconds, (e.g. measurement on 80MHz while associated). Increase the minimum time between measurements to 7 milliseconds. Fixes: `830aa3e7d1` ("iwlwifi: mvm: add support for range request command version 13") Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20240729201718.d3f3c26e00d9.I09e951290e8a3d73f147b88166fd9a678d1d69ed@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:38 +02:00
Ping-Ke Shih	aafca50e71	wifi: mac80211: don't use rate mask for offchannel TX either [ Upstream commit e7a7ef9a0742dbd0818d5b15fba2c5313ace765b ] Like the commit ab9177d83c04 ("wifi: mac80211: don't use rate mask for scanning"), ignore incorrect settings to avoid no supported rate warning reported by syzbot. The syzbot did bisect and found cause is commit `9df66d5b9f` ("cfg80211: fix default HE tx bitrate mask in 2G band"), which however corrects bitmask of HE MCS and recognizes correctly settings of empty legacy rate plus HE MCS rate instead of returning -EINVAL. As suggestions [1], follow the change of SCAN TX to consider this case of offchannel TX as well. [1] https://lore.kernel.org/linux-wireless/6ab2dc9c3afe753ca6fdcdd1421e7a1f47e87b84.camel@sipsolutions.net/T/#m2ac2a6d2be06a37c9c47a3d8a44b4f647ed4f024 Reported-by: syzbot+8dd98a9e98ee28dc484a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-wireless/000000000000fdef8706191a3f7b@google.com/ Fixes: `9df66d5b9f` ("cfg80211: fix default HE tx bitrate mask in 2G band") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20240729074816.20323-1-pkshih@realtek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:38 +02:00
Jing Zhang	24f30b34ff	drivers/perf: Fix ali_drw_pmu driver interrupt status clearing [ Upstream commit a3dd920977dccc453c550260c4b7605b280b79c3 ] The alibaba_uncore_pmu driver forgot to clear all interrupt status in the interrupt processing function. After the PMU counter overflow interrupt occurred, an interrupt storm occurred, causing the system to hang. Therefore, clear the correct interrupt status in the interrupt handling function to fix it. Fixes: `cf7b61073e` ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC") Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1724297611-20686-1-git-send-email-renyu.zj@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:38 +02:00
Andre Przywara	92dbc74464	kselftest/arm64: signal: fix/refactor SVE vector length enumeration [ Upstream commit 5225b6562b9a7dc808d5a1e465aaf5e2ebb220cd ] Currently a number of SVE/SME related tests have almost identical functions to enumerate all supported vector lengths. However over time the copy&pasted code has diverged, allowing some bugs to creep in: - fake_sigreturn_sme_change_vl reports a failure, not a SKIP if only one vector length is supported (but the SVE version is fine) - fake_sigreturn_sme_change_vl tries to set the SVE vector length, not the SME one (but the other SME tests are fine) - za_no_regs keeps iterating forever if only one vector length is supported (but za_regs is correct) Since those bugs seem to be mostly copy&paste ones, let's consolidate the enumeration loop into one shared function, and just call that from each test. That should fix the above bugs, and prevent similar issues from happening again. Fixes: `4963aeb35a` ("kselftest/arm64: signal: Add SME signal handling tests") Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240821164401.3598545-1-andre.przywara@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:37 +02:00
Mark Brown	17f8c83212	kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA [ Upstream commit `a7db82f18c` ] The current signal handling tests for SME do not account for the fact that unlike SVE all SME vector lengths are optional so we can't guarantee that we will encounter the minimum possible VL, they will hang enumerating VLs on such systems. Abort enumeration when we find the lowest VL in the newly added ssve_za_regs test. Fixes: `bc69da5ff0` ("kselftest/arm64: Verify simultaneous SSVE and ZA context generation") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230131-arm64-kselftest-sig-sme-no-128-v1-2-d47c13dc8e1e@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: 5225b6562b9a ("kselftest/arm64: signal: fix/refactor SVE vector length enumeration") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:37 +02:00
Mark Brown	2505532410	kselftest/arm64: Verify simultaneous SSVE and ZA context generation [ Upstream commit `bc69da5ff0` ] Add a test that generates SSVE and ZA context in a single signal frame to ensure that nothing is going wrong in that case for any reason. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230117-arm64-test-ssve-za-v1-2-203c00150154@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: 5225b6562b9a ("kselftest/arm64: signal: fix/refactor SVE vector length enumeration") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:37 +02:00
Mark Brown	3363e9a4dd	kselftest/arm64: Don't pass headers to the compiler as source [ Upstream commit `a884f7970e` ] The signal Makefile rules pass all the dependencies for each executable, including headers, to the compiler which GCC is happy enough with but clang rejects: clang --target=aarch64-none-linux-gnu -fintegrated-as -Wall -O2 -g -I/home/broonie/git/linux/tools/testing/selftests/ -isystem /home/broonie/git/linux/usr/include -D_GNU_SOURCE -std=gnu99 -I. test_signals.c test_signals_utils.c testcases/testcases.c signals.S testcases/fake_sigreturn_bad_magic.c test_signals.h test_signals_utils.h testcases/testcases.h -o testcases/fake_sigreturn_bad_magic clang: error: cannot specify -o when generating multiple output files This happens because clang gets confused about what to do with the header files, failing to identify them as source. This is not amazing behaviour on clang's part and should ideally be fixed but even if that happens we'd still need a new clang release so let's instead rework the Makefile so we use variables for the lists of header and source files, allowing us to only pass the source files to the compiler and keep clang happy. As a bonus the resulting Makefile is a bit easier to read. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20230111-arm64-kselftest-clang-v1-3-89c69d377727@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: 5225b6562b9a ("kselftest/arm64: signal: fix/refactor SVE vector length enumeration") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:37 +02:00
Olaf Hering	5fa2f2dbf0	mount: handle OOM on mnt_warn_timestamp_expiry [ Upstream commit 4bcda1eaf184e308f07f9c61d3a535f9ce477ce8 ] If no page could be allocated, an error pointer was used as format string in pr_warn. Rearrange the code to return early in case of OOM. Also add a check for the return value of d_path. Fixes: `f8b92ba67c` ("mount: Add mount warning for impending timestamp expiry") Signed-off-by: Olaf Hering <olaf@aepfle.de> Link: https://lore.kernel.org/r/20240730085856.32385-1-olaf@aepfle.de [brauner: rewrite commit and commit message] Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:37 +02:00
Andy Shevchenko	880c18add0	fs/namespace: fnic: Switch to use %ptTd [ Upstream commit `74e60b8b2f` ] Use %ptTd instead of open-coded variant to print contents of time64_t type in human readable form. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Stable-dep-of: 4bcda1eaf184 ("mount: handle OOM on mnt_warn_timestamp_expiry") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:37 +02:00
Andrew Jones	db2556e538	RISC-V: KVM: Fix sbiret init before forwarding to userspace [ Upstream commit 6b7b282e6baea06ba65b55ae7d38326ceb79cebf ] When forwarding SBI calls to userspace ensure sbiret.error is initialized to SBI_ERR_NOT_SUPPORTED first, in case userspace neglects to set it to anything. If userspace neglects it then we can't be sure it did anything else either, so we just report it didn't do or try anything. Just init sbiret.value to zero, which is the preferred value to return when nothing special is specified. KVM was already initializing both sbiret.error and sbiret.value, but the values used appear to come from a copy+paste of the __sbi_ecall() implementation, i.e. a0 and a1, which don't apply prior to the call being executed, nor at all when forwarding to userspace. Fixes: `dea8ee31a0` ("RISC-V: KVM: Add SBI v0.1 support") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240807154943.150540-2-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:37 +02:00
Dmitry Kandybka	161a8becd0	wifi: rtw88: remove CPT execution branch never used [ Upstream commit 77c977327dfaa9ae2e154964cdb89ceb5c7b7cf1 ] In 'rtw_coex_action_bt_a2dp_pan', 'wl_cpt_test' and 'bt_cpt_test' are hardcoded to false, so corresponding 'table_case' and 'tdma_case' assignments are never met. Also 'rtw_coex_set_rf_para(rtwdev, chip->wl_rf_para_rx[1])' is never executed. Assuming that CPT was never fully implemented, remove lookalike leftovers. Compile tested only. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `76f631cb40` ("rtw88: coex: update the mechanism for A2DP + PAN") Signed-off-by: Dmitry Kandybka <d.kandybka@gmail.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20240809085310.10512-1-d.kandybka@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:36 +02:00
Yanteng Si	175407d614	net: stmmac: dwmac-loongson: Init ref and PTP clocks rate [ Upstream commit c70f3163681381c15686bdd2fe56bf4af9b8aaaa ] Reference and PTP clocks rate of the Loongson GMAC devices is 125MHz. (So is in the GNET devices which support is about to be added.) Set the respective plat_stmmacenet_data field up in accordance with that so to have the coalesce command and timestamping work correctly. Fixes: `30bba69d7d` ("stmmac: pci: Add dwmac support for Loongson") Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Yinggang Gu <guyinggang@loongson.cn> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Tested-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:36 +02:00
Toke Høiland-Jørgensen	7b5e333a11	wifi: ath9k: Remove error checks when creating debugfs entries [ Upstream commit f6ffe7f0184792c2f99aca6ae5b916683973d7d3 ] We should not be checking the return values from debugfs creation at all: the debugfs functions are designed to handle errors of previously called functions and just transparently abort the creation of debugfs entries when debugfs is disabled. If we check the return value and abort driver initialisation, we break the driver if debugfs is disabled (such as when booting with debugfs=off). Earlier versions of ath9k accidentally did the right thing by checking the return value, but only for NULL, not for IS_ERR(). This was "fixed" by the two commits referenced below, breaking ath9k with debugfs=off starting from the 6.6 kernel (as reported in the Bugzilla linked below). Restore functionality by just getting rid of the return value check entirely. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219122 Fixes: `1e4134610d` ("wifi: ath9k: use IS_ERR() with debugfs_create_dir()") Fixes: `6edb4ba6fb` ("wifi: ath9k: fix parameter check in ath9k_init_debug()") Reported-by: Daniel Tobias <dan.g.tob@gmail.com> Tested-by: Daniel Tobias <dan.g.tob@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://patch.msgid.link/20240805110225.19690-1-toke@toke.dk Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:36 +02:00
Minjie Du	4738809fee	wifi: ath9k: fix parameter check in ath9k_init_debug() [ Upstream commit `6edb4ba6fb` ] Make IS_ERR() judge the debugfs_create_dir() function return in ath9k_init_debug() Signed-off-by: Minjie Du <duminjie@vivo.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20230712114740.13226-1-duminjie@vivo.com Stable-dep-of: f6ffe7f01847 ("wifi: ath9k: Remove error checks when creating debugfs entries") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:36 +02:00
Aleksandr Mishin	f4ea89abbe	ACPI: PMIC: Remove unneeded check in tps68470_pmic_opregion_probe() [ Upstream commit 07442c46abad1d50ac82af5e0f9c5de2732c4592 ] In tps68470_pmic_opregion_probe() pointer 'dev' is compared to NULL which is useless. Fix this issue by removing unneeded check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `e13452ac37` ("ACPI / PMIC: Add TI PMIC TPS68470 operation region driver") Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://patch.msgid.link/20240730225339.13165-1-amishin@t-argos.ru [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:36 +02:00
Helge Deller	df95378d40	crypto: xor - fix template benchmarking [ Upstream commit ab9a244c396aae4aaa34b2399b82fc15ec2df8c1 ] Commit `c055e3eae0` ("crypto: xor - use ktime for template benchmarking") switched from using jiffies to ktime-based performance benchmarking. This works nicely on machines which have a fine-grained ktime() clocksource as e.g. x86 machines with TSC. But other machines, e.g. my 4-way HP PARISC server, don't have such fine-grained clocksources, which is why it seems that 800 xor loops take zero seconds, which then shows up in the logs as: xor: measuring software checksum speed 8regs : -1018167296 MB/sec 8regs_prefetch : -1018167296 MB/sec 32regs : -1018167296 MB/sec 32regs_prefetch : -1018167296 MB/sec Fix this with some small modifications to the existing code to improve the algorithm to always produce correct results without introducing major delays for architectures with a fine-grained ktime() clocksource: a) Delay start of the timing until ktime() just advanced. On machines with a fast ktime() this should be just one additional ktime() call. b) Count the number of loops. Run at minimum 800 loops and finish earliest when the ktime() counter has progressed. With that the throughput can now be calculated more accurately under all conditions. Fixes: `c055e3eae0` ("crypto: xor - use ktime for template benchmarking") Signed-off-by: Helge Deller <deller@gmx.de> Tested-by: John David Anglin <dave.anglin@bell.net> v2: - clean up coding style (noticed & suggested by Herbert Xu) - rephrased & fixed typo in commit message Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:36 +02:00
Dmitry Antipov	7887ad1199	wifi: rtw88: always wait for both firmware loading attempts [ Upstream commit 0e735a4c6137262bcefe45bb52fde7b1f5fc6c4d ] In 'rtw_wait_firmware_completion()', always wait for both (regular and wowlan) firmware loading attempts. Otherwise if 'rtw_usb_intf_init()' has failed in 'rtw_usb_probe()', 'rtw_usb_disconnect()' may issue 'ieee80211_free_hw()' when one of 'rtw_load_firmware_cb()' (usually the wowlan one) is still in progress, causing UAF detected by KASAN. Fixes: `c8e5695eae` ("rtw88: load wowlan firmware if wowlan is supported") Reported-by: syzbot+6c6c08700f9480c41fe3@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6c6c08700f9480c41fe3 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20240726114657.25396-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:36 +02:00
Shubhrajyoti Datta	7d77159149	EDAC/synopsys: Fix error injection on Zynq UltraScale+ [ Upstream commit 35e6dbfe1846caeafabb49b7575adb36b0aa2269 ] The Zynq UltraScale+ MPSoC DDR has a disjoint memory from 2GB to 32GB. The DDR host interface has a contiguous memory so while injecting errors, the driver should remove the hole else the injection fails as the address translation is incorrect. Introduce a get_mem_info() function pointer and set it for Zynq UltraScale+ platform to return host address. Fixes: `1a81361f75` ("EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller") Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240711100656.31376-1-shubhrajyoti.datta@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:35 +02:00
Serge Semin	a8d9917193	EDAC/synopsys: Fix ECC status and IRQ control race condition [ Upstream commit 591c946675d88dcc0ae9ff54be9d5caaee8ce1e3 ] The race condition around the ECCCLR register access happens in the IRQ disable method called in the device remove() procedure and in the ECC IRQ handler: 1. Enable IRQ: a. ECCCLR = EN_CE \| EN_UE 2. Disable IRQ: a. ECCCLR = 0 3. IRQ handler: a. ECCCLR = CLR_CE \| CLR_CE_CNT \| CLR_CE \| CLR_CE_CNT b. ECCCLR = 0 c. ECCCLR = EN_CE \| EN_UE So if the IRQ disabling procedure is called concurrently with the IRQ handler method the IRQ might be actually left enabled due to the statement 3c. The root cause of the problem is that ECCCLR register (which since v3.10a has been called as ECCCTL) has intermixed ECC status data clear flags and the IRQ enable/disable flags. Thus the IRQ disabling (clear EN flags) and handling (write 1 to clear ECC status data) procedures must be serialised around the ECCCTL register modification to prevent the race. So fix the problem described above by adding the spin-lock around the ECCCLR modifications and preventing the IRQ-handler from modifying the IRQs enable flags (there is no point in disabling the IRQ and then re-enabling it again within a single IRQ handler call, see the statements 3a/3b and 3c above). Fixes: `f7824ded41` ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR") Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240222181324.28242-2-fancer.lancer@gmail.com Stable-dep-of: 35e6dbfe1846 ("EDAC/synopsys: Fix error injection on Zynq UltraScale+") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-10-17 15:20:35 +02:00
Greg Kroah-Hartman	aa4cd140bb	Linux 6.1.112 Link: https://lore.kernel.org/r/20240927121719.897851549@linuxfoundation.org Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Allen Pais <apais@linux.microsoft.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: kernelci.org bot <bot@kernelci.org> Tested-by: Pavel Machek (CIP) <pavel@denx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Edward Adam Davis	ba6269e187	USB: usbtmc: prevent kernel-usb-infoleak commit 625fa77151f00c1bd00d34d60d6f2e710b3f9aad upstream. The syzbot reported a kernel-usb-infoleak in usbtmc_write, we need to clear the structure before filling fields. Fixes: `4ddc645f40` ("usb: usbtmc: Add ioctl for vendor specific write") Reported-and-tested-by: syzbot+9d34f80f841e948c3fdb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9d34f80f841e948c3fdb Signed-off-by: Edward Adam Davis <eadavis@qq.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/tencent_9649AA6EC56EDECCA8A7D106C792D1C66B06@qq.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Junhao Xie	c74796ff4f	USB: serial: pl2303: add device id for Macrosilicon MS3020 commit 7d47d22444bb7dc1b6d768904a22070ef35e1fc0 upstream. Add the device id for the Macrosilicon MS3020 which is a PL2303HXN based device. Signed-off-by: Junhao Xie <bigfoot@classfun.cn> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Tony Luck	a20eea14a6	x86/mm: Switch to new Intel CPU model defines commit 2eda374e883ad297bd9fe575a16c1dc850346075 upstream. New CPU #defines encode vendor and family as well as model. [ dhansen: vertically align 0's in invlpg_miss_ids[] ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181518.41946-1-tony.luck%40intel.com [ Ricardo: I used the old match macro X86_MATCH_INTEL_FAM6_MODEL() instead of X86_MATCH_VFM() as in the upstream commit. I also kept the ALDERLAKE_N name instead of ATOM_GRACEMONT. Both refer to the same CPU model. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Sumeet Pawnikar	ee8adcb4c0	powercap: RAPL: fix invalid initialization for pl4_supported field commit `d05b5e0baf` upstream. The current initialization of the struct x86_cpu_id via pl4_support_ids[] is partial and wrong. It is initializing "stepping" field with "X86_FEATURE_ANY" instead of "feature" field. Use X86_MATCH_INTEL_FAM6_MODEL macro instead of initializing each field of the struct x86_cpu_id for pl4_supported list of CPUs. This X86_MATCH_INTEL_FAM6_MODEL macro internally uses another macro X86_MATCH_VENDOR_FAM_MODEL_FEATURE for X86 based CPU matching with appropriate initialized values. Reported-by: Dave Hansen <dave.hansen@intel.com> Link: https://lore.kernel.org/lkml/28ead36b-2d9e-1a36-6f4e-04684e420260@intel.com Fixes: `eb52bc2ae5` ("powercap: RAPL: Add Power Limit4 support for Meteor Lake SoC") Fixes: `b08b95cf30` ("powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P") Fixes: `5157559069` ("powercap: RAPL: Add Power Limit4 support for RaptorLake") Fixes: `1cc5b9a411` ("powercap: Add Power Limit4 support for Alder Lake SoC") Fixes: `8365a898fe` ("powercap: Add Power Limit4 support") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> [ Ricardo: I removed METEORLAKE and METEORLAKE_L from pl4_support_ids as they are not included in v6.1. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Filipe Manana	563df8b411	btrfs: calculate the right space for delayed refs when updating global reserve commit `f8f210dc84` upstream. When updating the global block reserve, we account for the 6 items needed by an unlink operation and the 6 delayed references for each one of those items. However the calculation for the delayed references is not correct in case we have the free space tree enabled, as in that case we need to touch the free space tree as well and therefore need twice the number of bytes. So use the btrfs_calc_delayed_ref_bytes() helper to calculate the number of bytes need for the delayed references at btrfs_update_global_block_rsv(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> [Diogo: this patch has been cherry-picked from the original commit; conflicts included lack of a define (picked from commit `5630e2bcfe`) and lack of btrfs_calc_delayed_ref_bytes (picked from commit `0e55a54502`) - changed const struct -> struct for compatibility.] Signed-off-by: Diogo Jahchan Koike <djahchankoike@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Matthieu Baerts (NGI0)	2626cbee1f	selftests: mptcp: join: restrict fullmesh endp on 1st sf commit 49ac6f05ace5bb0070c68a0193aa05d3c25d4c83 upstream. A new endpoint using the IP of the initial subflow has been recently added to increase the code coverage. But it breaks the test when using old kernels not having commit `86e39e0448` ("mptcp: keep track of local endpoint still available for each msk"), e.g. on v5.15. Similar to commit `d4c81bbb86` ("selftests: mptcp: join: support local endpoint being tracked or not"), it is possible to add the new endpoint conditionally, by checking if "mptcp_pm_subflow_check_next" is present in kallsyms: this is not directly linked to the commit introducing this symbol but for the parent one which is linked anyway. So we can know in advance what will be the expected behaviour, and add the new endpoint only when it makes sense to do so. Fixes: 4878f9f8421f ("selftests: mptcp: join: validate fullmesh endp on 1st sf") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-1-8f124aa9156d@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ Conflicts in mptcp_join.sh, because the 'run_tests' helper has been modified in multiple commits that are not in this version, e.g. commit `e571fb09c8` ("selftests: mptcp: add speed env var"). The conflict was in the context, the new lines can still be added at the same place. ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Marc Kleine-Budde	0ba8b599c3	can: mcp251xfd: move mcp251xfd_timestamp_start()/stop() into mcp251xfd_chip_start/stop() commit a7801540f325d104de5065850a003f1d9bdc6ad3 upstream. The mcp251xfd wakes up from Low Power or Sleep Mode when SPI activity is detected. To avoid this, make sure that the timestamp worker is stopped before shutting down the chip. Split the starting of the timestamp worker out of mcp251xfd_timestamp_init() into the separate function mcp251xfd_timestamp_start(). Call mcp251xfd_timestamp_init() before mcp251xfd_chip_start(), move mcp251xfd_timestamp_start() to mcp251xfd_chip_start(). In this way, mcp251xfd_timestamp_stop() can be called unconditionally by mcp251xfd_chip_stop(). Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Marc Kleine-Budde	88047c4b2d	can: mcp251xfd: properly indent labels commit 51b2a721612236335ddec4f3fb5f59e72a204f3a upstream. To fix the coding style, remove the whitespace in front of labels. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Hagar Hemdan	672c19165f	gpio: prevent potential speculation leaks in gpio_device_get_desc() commit d795848ecce24a75dfd46481aee066ae6fe39775 upstream. Userspace may trigger a speculative read of an address outside the gpio descriptor array. Users can do that by calling gpio_ioctl() with an offset out of range. Offset is copied from user and then used as an array index to get the gpio descriptor without sanitization in gpio_device_get_desc(). This change ensures that the offset is sanitized by using array_index_nospec() to mitigate any possibility of speculative information leaks. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Link: https://lore.kernel.org/r/20240523085332.1801-1-hagarhem@amazon.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Hugo SIMELIERE <hsimeliere.opensource@witekio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Kent Gibson	5c3a421c1f	gpiolib: cdev: Ignore reconfiguration without direction commit b440396387418fe2feaacd41ca16080e7a8bc9ad upstream. linereq_set_config() behaves badly when direction is not set. The configuration validation is borrowed from linereq_create(), where, to verify the intent of the user, the direction must be set to in order to effect a change to the electrical configuration of a line. But, when applied to reconfiguration, that validation does not allow for the unset direction case, making it possible to clear flags set previously without specifying the line direction. Adding to the inconsistency, those changes are not immediately applied by linereq_set_config(), but will take effect when the line value is next get or set. For example, by requesting a configuration with no flags set, an output line with GPIO_V2_LINE_FLAG_ACTIVE_LOW and GPIO_V2_LINE_FLAG_OPEN_DRAIN set could have those flags cleared, inverting the sense of the line and changing the line drive to push-pull on the next line value set. Skip the reconfiguration of lines for which the direction is not set, and only reconfigure the lines for which direction is set. Fixes: `a54756cb24` ("gpiolib: cdev: support GPIO_V2_LINE_SET_CONFIG_IOCTL") Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240626052925.174272-3-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Ping-Ke Shih	e388656a85	Revert "wifi: cfg80211: check wiphy mutex is held for wdev mutex" This reverts commit `19d13ec00a` which is commmit 1474bc87fe57deac726cc10203f73daa6c3212f7 upstream. The reverted commit is based on implementation of wiphy locking that isn't planned to redo on a stable kernel, so revert it to avoid warning: WARNING: CPU: 0 PID: 9 at net/wireless/core.h:231 disconnect_work+0xb8/0x144 [cfg80211] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.51-00141-ga1649b6f8ed6 #7 Hardware name: Freescale i.MX6 SoloX (Device Tree) Workqueue: events disconnect_work [cfg80211] unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from __warn+0x70/0x1c0 __warn from warn_slowpath_fmt+0x16c/0x294 warn_slowpath_fmt from disconnect_work+0xb8/0x144 [cfg80211] disconnect_work [cfg80211] from process_one_work+0x204/0x620 process_one_work from worker_thread+0x1b0/0x474 worker_thread from kthread+0x10c/0x12c kthread from ret_from_fork+0x14/0x24 Reported-by: petter@technux.se Closes: https://lore.kernel.org/linux-wireless/9e98937d781c990615ef27ee0c858ff9@technux.se/T/#t Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Pablo Neira Ayuso	ddeead4761	netfilter: nf_tables: missing iterator type in lookup walk commit efefd4f00c967d00ad7abe092554ffbb70c1a793 upstream. Add missing decorator type to lookup expression and tighten WARN_ON_ONCE check in pipapo to spot earlier that this is unset. Fixes: 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Pablo Neira Ayuso	52735a010f	netfilter: nft_set_pipapo: walk over current view on netlink dump commit 29b359cf6d95fd60730533f7f10464e95bd17c73 upstream. The generation mask can be updated while netlink dump is in progress. The pipapo set backend walk iterator cannot rely on it to infer what view of the datastructure is to be used. Add notation to specify if user wants to read/update the set. Based on patch from Florian Westphal. Fixes: `2b84e215f8` ("netfilter: nft_set_pipapo: .walk does not deal with generations") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Dan Carpenter	8a64f87e74	netfilter: nft_socket: Fix a NULL vs IS_ERR() bug in nft_socket_cgroup_subtree_level() commit 7052622fccb1efb850c6b55de477f65d03525a30 upstream. The cgroup_get_from_path() function never returns NULL, it returns error pointers. Update the error handling to match. Fixes: 7f3287db6543 ("netfilter: nft_socket: make cgroupsv2 matching work with namespaces") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/bbc0c4e0-05cc-4f44-8797-2f4b3920a820@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Florian Westphal	ace0db36b4	netfilter: nft_socket: make cgroupsv2 matching work with namespaces commit 7f3287db654395f9c5ddd246325ff7889f550286 upstream. When running in container environmment, /sys/fs/cgroup/ might not be the real root node of the sk-attached cgroup. Example: In container: % stat /sys//fs/cgroup/ Device: 0,21 Inode: 2214 .. % stat /sys/fs/cgroup/foo Device: 0,21 Inode: 2264 .. The expectation would be for: nft add rule .. socket cgroupv2 level 1 "foo" counter to match traffic from a process that got added to "foo" via "echo $pid > /sys/fs/cgroup/foo/cgroup.procs". However, 'level 3' is needed to make this work. Seen from initial namespace, the complete hierarchy is: % stat /sys/fs/cgroup/system.slice/docker-.../foo Device: 0,21 Inode: 2264 .. i.e. hierarchy is 0 1 2 3 / -> system.slice -> docker-1... -> foo ... but the container doesn't know that its "/" is the "docker-1.." cgroup. Current code will retrieve the 'system.slice' cgroup node and store its kn->id in the destination register, so compare with 2264 ("foo" cgroup id) will not match. Fetch "/" cgroup from ->init() and add its level to the level we try to extract. cgroup root-level is 0 for the init-namespace or the level of the ancestor that is exposed as the cgroup root inside the container. In the above case, cgrp->level of "/" resolved in the container is 2 (docker-1...scope/) and request for 'level 1' will get adjusted to fetch the actual level (3). v2: use CONFIG_SOCK_CGROUP_DATA, eval function depends on it. (kernel test robot) Cc: cgroups@vger.kernel.org Fixes: `e0bb96db96` ("netfilter: nft_socket: add support for cgroupsv2") Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Dave Chinner	5899daf1d8	xfs: journal geometry is not properly bounds checked [ Upstream commit `f1e1765aad` ] If the journal geometry results in a sector or log stripe unit validation problem, it indicates that we cannot set the log up to safely write to the the journal. In these cases, we must abort the mount because the corruption needs external intervention to resolve. Similarly, a journal that is too large cannot be written to safely, either, so we shouldn't allow those geometries to mount, either. If the log is too small, we risk having transaction reservations overruning the available log space and the system hanging waiting for space it can never provide. This is purely a runtime hang issue, not a corruption issue as per the first cases listed above. We abort mounts of the log is too small for V5 filesystems, but we must allow v4 filesystems to mount because, historically, there was no log size validity checking and so some systems may still be out there with undersized logs. The problem is that on V4 filesystems, when we discover a log geometry problem, we skip all the remaining checks and then allow the log to continue mounting. This mean that if one of the log size checks fails, we skip the log stripe unit check. i.e. we allow the mount because a "non-fatal" geometry is violated, and then fail to check the hard fail geometries that should fail the mount. Move all these fatal checks to the superblock verifier, and add a new check for the two log sector size geometry variables having the same values. This will prevent any attempt to mount a log that has invalid or inconsistent geometries long before we attempt to mount the log. However, for the minimum log size checks, we can only do that once we've setup up the log and calculated all the iclog sizes and roundoffs. Hence this needs to remain in the log mount code after the log has been initialised. It is also the only case where we should allow a v4 filesystem to continue running, so leave that handling in place, too. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	68e6efe0d4	xfs: set bnobt/cntbt numrecs correctly when formatting new AGs [ Upstream commit `8e698ee72c` ] Through generic/300, I discovered that mkfs.xfs creates corrupt filesystems when given these parameters: Filesystems formatted with --unsupported are not supported!! meta-data=/dev/sda isize=512 agcount=8, agsize=16352 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=1 = reflink=1 bigtime=1 inobtcount=1 nrext64=1 data = bsize=4096 blocks=130816, imaxpct=25 = sunit=32 swidth=128 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=8192, version=2 = sectsz=512 sunit=32 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 = rgcount=0 rgsize=0 blks Discarding blocks...Done. Phase 1 - find and verify superblock... - reporting progress in intervals of 15 minutes Phase 2 - using internal log - zero log... - 16:30:50: zeroing log - 16320 of 16320 blocks done - scan filesystem freespace and inode maps... agf_freeblks 25, counted 0 in ag 4 sb_fdblocks 8823, counted 8798 The root cause of this problem is the numrecs handling in xfs_freesp_init_recs, which is used to initialize a new AG. Prior to calling the function, we set up the new bnobt block with numrecs == 1 and rely on _freesp_init_recs to format that new record. If the last record created has a blockcount of zero, then it sets numrecs = 0. That last bit isn't correct if the AG contains the log, the start of the log is not immediately after the initial blocks due to stripe alignment, and the end of the log is perfectly aligned with the end of the AG. For this case, we actually formatted a single bnobt record to handle the free space before the start of the (stripe aligned) log, and incremented arec to try to format a second record. That second record turned out to be unnecessary, so what we really want is to leave numrecs at 1. The numrecs handling itself is overly complicated because a different function sets numrecs == 1. Change the bnobt creation code to start with numrecs set to zero and only increment it after successfully formatting a free space extent into the btree block. Fixes: `f327a00745` ("xfs: account for log space when formatting new AGs") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	af871df651	xfs: fix reloading entire unlinked bucket lists [ Upstream commit `537c013b14` ] During review of the patcheset that provided reloading of the incore iunlink list, Dave made a few suggestions, and I updated the copy in my dev tree. Unfortunately, I then got distracted by ... who even knows what ... and forgot to backport those changes from my dev tree to my release candidate branch. I then sent multiple pull requests with stale patches, and that's what was merged into -rc3. So. This patch re-adds the use of an unlocked iunlink list check to determine if we want to allocate the resources to recreate the incore list. Since lost iunlinked inodes are supposed to be rare, this change helps us avoid paying the transaction and AGF locking costs every time we open any inode. This also re-adds the shutdowns on failure, and re-applies the restructuring of the inner loop in xfs_inode_reload_unlinked_bucket, and re-adds a requested comment about the quotachecking code. Retain the original RVB tag from Dave since there's no code change from the last submission. Fixes: `68b957f64f` ("xfs: load uncached unlinked inodes into memory on demand") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	62ca591045	xfs: make inode unlinked bucket recovery work with quotacheck [ Upstream commit `49813a21ed` ] Teach quotacheck to reload the unlinked inode lists when walking the inode table. This requires extra state handling, since it's possible that a reloaded inode will get inactivated before quotacheck tries to scan it; in this case, we need to ensure that the reloaded inode does not have dquots attached when it is freed. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	e9d1551f80	xfs: reload entire unlinked bucket lists [ Upstream commit `83771c50e4` ] The previous patch to reload unrecovered unlinked inodes when adding a newly created inode to the unlinked list is missing a key piece of functionality. It doesn't handle the case that someone calls xfs_iget on an inode that is not the last item in the incore list. For example, if at mount time the ondisk iunlink bucket looks like this: AGI -> 7 -> 22 -> 3 -> NULL None of these three inodes are cached in memory. Now let's say that someone tries to open inode 3 by handle. We need to walk the list to make sure that inodes 7 and 22 get loaded cold, and that the i_prev_unlinked of inode 3 gets set to 22. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Darrick J. Wong	8ffd3ae7a0	xfs: use i_prev_unlinked to distinguish inodes that are not on the unlinked list [ Upstream commit `f12b96683d` ] Alter the definition of i_prev_unlinked slightly to make it more obvious when an inode with 0 link count is not part of the iunlink bucket lists rooted in the AGI. This distinction is necessary because it is not sufficient to check inode.i_nlink to decide if an inode is on the unlinked list. Updates to i_nlink can happen while holding only ILOCK_EXCL, but updates to an inode's position in the AGI unlinked list (which happen after the nlink update) requires both ILOCK_EXCL and the AGI buffer lock. The next few patches will make it possible to reload an entire unlinked bucket list when we're walking the inode table or performing handle operations and need more than the ability to iget the last inode in the chain. The upcoming directory repair code also needs to be able to make this distinction to decide if a zero link count directory should be moved to the orphanage or allowed to inactivate. An upcoming enhancement to the online AGI fsck code will need this distinction to check and rebuild the AGI unlinked buckets. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Shiyang Ruan	8e2147f37f	xfs: correct calculation for agend and blockcount [ Upstream commit `3c90c01e49` ] The agend should be "start + length - 1", then, blockcount should be "end + 1 - start". Correct 2 calculation mistakes. Also, rename "agend" to "range_agend" because it's not the end of the AG per se; it's the end of the dead region within an AG's agblock space. Fixes: `5cf32f63b0` ("xfs: fix the calculation for "end" and "length"") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Dave Chinner	d931b6c6a9	xfs: fix unlink vs cluster buffer instantiation race [ Upstream commit 348a1983cf4cf5099fc398438a968443af4c9f65 ] Luis has been reporting an assert failure when freeing an inode cluster during inode inactivation for a while. The assert looks like: XFS: Assertion failed: bp->b_flags & XBF_DONE, file: fs/xfs/xfs_trans_buf.c, line: 241 ------------[ cut here ]------------ kernel BUG at fs/xfs/xfs_message.c:102! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 4 PID: 73 Comm: kworker/4:1 Not tainted 6.10.0-rc1 #4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Workqueue: xfs-inodegc/loop5 xfs_inodegc_worker [xfs] RIP: 0010:assfail (fs/xfs/xfs_message.c:102) xfs RSP: 0018:ffff88810188f7f0 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff88816e748250 RCX: 1ffffffff844b0e7 RDX: 0000000000000004 RSI: ffff88810188f558 RDI: ffffffffc2431fa0 RBP: 1ffff11020311f01 R08: 0000000042431f9f R09: ffffed1020311e9b R10: ffff88810188f4df R11: ffffffffac725d70 R12: ffff88817a3f4000 R13: ffff88812182f000 R14: ffff88810188f998 R15: ffffffffc2423f80 FS: 0000000000000000(0000) GS:ffff8881c8400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055fe9d0f109c CR3: 000000014426c002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:241 (discriminator 1)) xfs xfs_imap_to_bp (fs/xfs/xfs_trans.h:210 fs/xfs/libxfs/xfs_inode_buf.c:138) xfs xfs_inode_item_precommit (fs/xfs/xfs_inode_item.c:145) xfs xfs_trans_run_precommits (fs/xfs/xfs_trans.c:931) xfs __xfs_trans_commit (fs/xfs/xfs_trans.c:966) xfs xfs_inactive_ifree (fs/xfs/xfs_inode.c:1811) xfs xfs_inactive (fs/xfs/xfs_inode.c:2013) xfs xfs_inodegc_worker (fs/xfs/xfs_icache.c:1841 fs/xfs/xfs_icache.c:1886) xfs process_one_work (kernel/workqueue.c:3231) worker_thread (kernel/workqueue.c:3306 (discriminator 2) kernel/workqueue.c:3393 (discriminator 2)) kthread (kernel/kthread.c:389) ret_from_fork (arch/x86/kernel/process.c:147) ret_from_fork_asm (arch/x86/entry/entry_64.S:257) </TASK> And occurs when the the inode precommit handlers is attempt to look up the inode cluster buffer to attach the inode for writeback. The trail of logic that I can reconstruct is as follows. 1. the inode is clean when inodegc runs, so it is not attached to a cluster buffer when precommit runs. 2. #1 implies the inode cluster buffer may be clean and not pinned by dirty inodes when inodegc runs. 3. #2 implies that the inode cluster buffer can be reclaimed by memory pressure at any time. 4. The assert failure implies that the cluster buffer was attached to the transaction, but not marked done. It had been accessed earlier in the transaction, but not marked done. 5. #4 implies the cluster buffer has been invalidated (i.e. marked stale). 6. #5 implies that the inode cluster buffer was instantiated uninitialised in the transaction in xfs_ifree_cluster(), which only instantiates the buffers to invalidate them and never marks them as done. Given factors 1-3, this issue is highly dependent on timing and environmental factors. Hence the issue can be very difficult to reproduce in some situations, but highly reliable in others. Luis has an environment where it can be reproduced easily by g/531 but, OTOH, I've reproduced it only once in ~2000 cycles of g/531. I think the fix is to have xfs_ifree_cluster() set the XBF_DONE flag on the cluster buffers, even though they may not be initialised. The reasons why I think this is safe are: 1. A buffer cache lookup hit on a XBF_STALE buffer will clear the XBF_DONE flag. Hence all future users of the buffer know they have to re-initialise the contents before use and mark it done themselves. 2. xfs_trans_binval() sets the XFS_BLI_STALE flag, which means the buffer remains locked until the journal commit completes and the buffer is unpinned. Hence once marked XBF_STALE/XFS_BLI_STALE by xfs_ifree_cluster(), the only context that can access the freed buffer is the currently running transaction. 3. #2 implies that future buffer lookups in the currently running transaction will hit the transaction match code and not the buffer cache. Hence XBF_STALE and XFS_BLI_STALE will not be cleared unless the transaction initialises and logs the buffer with valid contents again. At which point, the buffer will be marked marked XBF_DONE again, so having XBF_DONE already set on the stale buffer is a moot point. 4. #2 also implies that any concurrent access to that cluster buffer will block waiting on the buffer lock until the inode cluster has been fully freed and is no longer an active inode cluster buffer. 5. #4 + #1 means that any future user of the disk range of that buffer will always see the range of disk blocks covered by the cluster buffer as not done, and hence must initialise the contents themselves. 6. Setting XBF_DONE in xfs_ifree_cluster() then means the unlinked inode precommit code will see a XBF_DONE buffer from the transaction match as it expects. It can then attach the stale but newly dirtied inode to the stale but newly dirtied cluster buffer without unexpected failures. The stale buffer will then sail through the journal and do the right thing with the attached stale inode during unpin. Hence the fix is just one line of extra code. The explanation of why we have to set XBF_DONE in xfs_ifree_cluster, OTOH, is long and complex.... Fixes: `82842fee6e` ("xfs: fix AGF vs inode cluster buffer deadlock") Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Darrick J. Wong	1486aeb788	xfs: fix negative array access in xfs_getbmap [ Upstream commit `1bba82fe1a` ] In commit `8ee81ed581`, Ye Bin complained about an ASSERT in the bmapx code that trips if we encounter a delalloc extent after flushing the pagecache to disk. The ioctl code does not hold MMAPLOCK so it's entirely possible that a racing write page fault can create a delalloc extent after the file has been flushed. The proposed solution was to replace the assertion with an early return that avoids filling out the bmap recordset with a delalloc entry if the caller didn't ask for it. At the time, I recall thinking that the forward logic sounded ok, but felt hesitant because I suspected that changing this code would cause something /else/ to burst loose due to some other subtlety. syzbot of course found that subtlety. If all the extent mappings found after the flush are delalloc mappings, we'll reach the end of the data fork without ever incrementing bmv->bmv_entries. This is new, since before we'd have emitted the delalloc mappings even though the caller didn't ask for them. Once we reach the end, we'll try to set BMV_OF_LAST on the -1st entry (because bmv_entries is zero) and go corrupt something else in memory. Yay. I really dislike all these stupid patches that fiddle around with debug code and break things that otherwise worked well enough. Nobody was complaining that calling XFS_IOC_BMAPX without BMV_IF_DELALLOC would return BMV_OF_DELALLOC records, and now we've gone from "weird behavior that nobody cared about" to "bad behavior that must be addressed immediately". Maybe I'll just ignore anything from Huawei from now on for my own sake. Reported-by: syzbot+c103d3808a0de5faaf80@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-xfs/20230412024907.GP360889@frogsfrogsfrogs/ Fixes: `8ee81ed581` ("xfs: fix BUG_ON in xfs_getbmap()") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Darrick J. Wong	4790c167cc	xfs: load uncached unlinked inodes into memory on demand [ Upstream commit `68b957f64f` ] shrikanth hegde reports that filesystems fail shortly after mount with the following failure: WARNING: CPU: 56 PID: 12450 at fs/xfs/xfs_inode.c:1839 xfs_iunlink_lookup+0x58/0x80 [xfs] This of course is the WARN_ON_ONCE in xfs_iunlink_lookup: ip = radix_tree_lookup(&pag->pag_ici_root, agino); if (WARN_ON_ONCE(!ip \|\| !ip->i_ino)) { ... } >From diagnostic data collected by the bug reporters, it would appear that we cleanly mounted a filesystem that contained unlinked inodes. Unlinked inodes are only processed as a final step of log recovery, which means that clean mounts do not process the unlinked list at all. Prior to the introduction of the incore unlinked lists, this wasn't a problem because the unlink code would (very expensively) traverse the entire ondisk metadata iunlink chain to keep things up to date. However, the incore unlinked list code complains when it realizes that it is out of sync with the ondisk metadata and shuts down the fs, which is bad. Ritesh proposed to solve this problem by unconditionally parsing the unlinked lists at mount time, but this imposes a mount time cost for every filesystem to catch something that should be very infrequent. Instead, let's target the places where we can encounter a next_unlinked pointer that refers to an inode that is not in cache, and load it into cache. Note: This patch does not address the problem of iget loading an inode from the middle of the iunlink list and needing to set i_prev_unlinked correctly. Reported-by: shrikanth hegde <sshegde@linux.vnet.ibm.com> Triaged-by: Ritesh Harjani <ritesh.list@gmail.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Shiyang Ruan	0cc1922687	xfs: fix the calculation for "end" and "length" [ Upstream commit `5cf32f63b0` ] The value of "end" should be "start + length - 1". Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Dave Chinner	4427e3d362	xfs: remove WARN when dquot cache insertion fails [ Upstream commit `4b827b3f30` ] It just creates unnecessary bot noise these days. Reported-by: syzbot+6ae213503fb12e87934f@syzkaller.appspotmail.com Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:52 +02:00
Long Li	e8c6533404	xfs: fix ag count overflow during growfs [ Upstream commit `c3b880acad` ] I found a corruption during growfs: XFS (loop0): Internal error agbno >= mp->m_sb.sb_agblocks at line 3661 of file fs/xfs/libxfs/xfs_alloc.c. Caller __xfs_free_extent+0x28e/0x3c0 CPU: 0 PID: 573 Comm: xfs_growfs Not tainted 6.3.0-rc7-next-20230420-00001-gda8c95746257 Call Trace: <TASK> dump_stack_lvl+0x50/0x70 xfs_corruption_error+0x134/0x150 __xfs_free_extent+0x2c1/0x3c0 xfs_ag_extend_space+0x291/0x3e0 xfs_growfs_data+0xd72/0xe90 xfs_file_ioctl+0x5f9/0x14a0 __x64_sys_ioctl+0x13e/0x1c0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd XFS (loop0): Corruption detected. Unmount and run xfs_repair XFS (loop0): Internal error xfs_trans_cancel at line 1097 of file fs/xfs/xfs_trans.c. Caller xfs_growfs_data+0x691/0xe90 CPU: 0 PID: 573 Comm: xfs_growfs Not tainted 6.3.0-rc7-next-20230420-00001-gda8c95746257 Call Trace: <TASK> dump_stack_lvl+0x50/0x70 xfs_error_report+0x93/0xc0 xfs_trans_cancel+0x2c0/0x350 xfs_growfs_data+0x691/0xe90 xfs_file_ioctl+0x5f9/0x14a0 __x64_sys_ioctl+0x13e/0x1c0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f2d86706577 The bug can be reproduced with the following sequence: # truncate -s 1073741824 xfs_test.img # mkfs.xfs -f -b size=1024 -d agcount=4 xfs_test.img # truncate -s 2305843009213693952 xfs_test.img # mount -o loop xfs_test.img /mnt/test # xfs_growfs -D 1125899907891200 /mnt/test The root cause is that during growfs, user space passed in a large value of newblcoks to xfs_growfs_data_private(), due to current sb_agblocks is too small, new AG count will exceed UINT_MAX. Because of AG number type is unsigned int and it would overflow, that caused nagcount much smaller than the actual value. During AG extent space, delta blocks in xfs_resizefs_init_new_ags() will much larger than the actual value due to incorrect nagcount, even exceed UINT_MAX. This will cause corruption and be detected in __xfs_free_extent. Fix it by growing the filesystem to up to the maximally allowed AGs and not return EINVAL when new AG count overflow. Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:52 +02:00
Dave Chinner	02f44e7ff6	xfs: collect errors from inodegc for unlinked inode recovery [ Upstream commit `d4d12c02bf` ] Unlinked list recovery requires errors removing the inode the from the unlinked list get fed back to the main recovery loop. Now that we offload the unlinking to the inodegc work, we don't get errors being fed back when we trip over a corruption that prevents the inode from being removed from the unlinked list. This means we never clear the corrupt unlinked list bucket, resulting in runtime operations eventually tripping over it and shutting down. Fix this by collecting inodegc worker errors and feed them back to the flush caller. This is largely best effort - the only context that really cares is log recovery, and it only flushes a single inode at a time so we don't need complex synchronised handling. Essentially the inodegc workers will capture the first error that occurs and the next flush will gather them and clear them. The flush itself will only report the first gathered error. In the cases where callers can return errors, propagate the collected inodegc flush error up the error handling chain. In the case of inode unlinked list recovery, there are several superfluous calls to flush queued unlinked inodes - xlog_recover_iunlink_bucket() guarantees that it has flushed the inodegc and collected errors before it returns. Hence nothing in the calling path needs to run a flush, even when an error is returned. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:52 +02:00

1 2 3 4 5 ...

1157466 Commits