linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
Felix Fietkau	07b3af42d8	net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links Using the QDMA tx scheduler to throttle tx to line speed works fine for switch ports, but apparently caused a regression on non-switch ports. Based on a number of tests, it seems that this throttling can be safely dropped without re-introducing the issues on switch ports that the tx scheduling changes resolved. Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ Fixes: `f63959c7ee` ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") Reported-by: Frank Wunderlich <frank-w@public-files.de> Reported-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20230324140404.95745-1-nbd@nbd.name Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-28 23:23:50 -07:00
Filipe Manana	2280d425ba	btrfs: ignore fiemap path cache when there are multiple paths for a node During fiemap, when walking backreferences to determine if a b+tree node/leaf is shared, we may find a tree block (leaf or node) for which two parents were added to the references ulist. This happens if we get for example one direct ref (shared tree block ref) and one indirect ref (non-shared tree block ref) for the tree block at the current level, which can happen during relocation. In that case the fiemap path cache can not be used since it's meant for a single path, with one tree block at each possible level, so having multiple references for a tree block at any level may result in getting the level counter exceed BTRFS_MAX_LEVEL and eventually trigger the warning: WARN_ON_ONCE(level >= BTRFS_MAX_LEVEL) at lookup_backref_shared_cache() and at store_backref_shared_cache(). This is harmless since the code ignores any level >= BTRFS_MAX_LEVEL, the warning is there just to catch any unexpected case like the one described above. However if a user finds this it may be scary and get reported. So just ignore the path cache once we find a tree block for which there are more than one reference, which is the less common case, and update the cache with the sharedness check result for all levels below the level for which we found multiple references. Reported-by: Jarno Pelkonen <jarno.pelkonen@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAKv8qLmDNAGJGCtsevxx_VZ_YOvvs1L83iEJkTgyA4joJertng@mail.gmail.com/ Fixes: `12a824dc67` ("btrfs: speedup checking for extent sharedness during fiemap") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-29 01:16:23 +02:00
Linus Torvalds	fcd476ea6a	Merge tag 'urgent-rcu.2023.03.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU fix from Paul McKenney: "This brings the rcu_torture_read event trace into line with the new trace tools by replacing this event trace's __field() with the corresponding __array(). Without this, the new trace tools will fail when presented wtih an rcu_torture_read event trace, which is a regression from the viewpoint of trace tools users" Link: https://lore.kernel.org/all/20230320133650.5388a05e@gandalf.local.home/ * tag 'urgent-rcu.2023.03.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: rcu: Fix rcu_torture_read ftrace event	2023-03-28 13:28:52 -07:00
Linus Torvalds	756c1a0593	Merge tag 'linux-kselftest-fixes-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "One single fix for sigaltstack test -Wuninitialized warning found when building with clang" * tag 'linux-kselftest-fixes-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: sigaltstack: fix -Wuninitialized	2023-03-28 13:14:47 -07:00
Rafael J. Wysocki	b57841fb0b	thermal: core: Drop excessive lockdep_assert_held() calls The lockdep_assert_held() calls added to cooling_device_stats_setup() and cooling_device_stats_destroy() by commit `790930f442` ("thermal: core: Introduce thermal_cooling_device_update()") trigger false-positive lockdep reports in code paths that are not subject to race conditions (before cooling device registration and after cooling device removal). For this reason, remove the lockdep_assert_held() calls from both cooling_device_stats_setup() and cooling_device_stats_destroy() and add one to thermal_cooling_device_stats_reinit() that has to be called under the cdev lock. Fixes: `790930f442` ("thermal: core: Introduce thermal_cooling_device_update()") Link: https://lore.kernel.org/linux-acpi/ZCIDTLFt27Ei7+V6@ideak-desk.fi.intel.com Reported-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2023-03-28 20:49:47 +02:00
Linus Torvalds	05c24161f4	Merge tag 's390-6.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix an error handling issue with PTRACE_GET_LAST_BREAK request so that -EFAULT is returned if put_user() fails, instead of ignoring it - Fix a build race for the modules_prepare target when CONFIG_EXPOLINE_EXTERN is enabled by reintroducing the dependence on scripts - Fix a memory leak in vfio_ap device driver - Add missing earlyclobber annotations to __clear_user() inline assembly to prevent incorrect register allocation * tag 's390-6.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ptrace: fix PTRACE_GET_LAST_BREAK error handling s390: reintroduce expoline dependence to scripts s390/vfio-ap: fix memory leak in vfio_ap device driver s390/uaccess: add missing earlyclobber annotations to __clear_user()	2023-03-28 10:43:04 -07:00
Jakob Koschel	e9a1cc2e4c	ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg() The code implicitly assumes that the list iterator finds a correct handle. If 'vsi_handle' is not found the 'old_agg_vsi_info' was pointing to an bogus memory location. For safety a separate list iterator variable should be used to make the != NULL check on 'old_agg_vsi_info' correct under any circumstances. Additionally Linus proposed to avoid any use of the list iterator variable after the loop, in the attempt to move the list iterator variable declaration into the macro to avoid any potential misuse after the loop. Using it in a pointer comparison after the loop is undefined behavior and should be omitted if possible [1]. Fixes: `37c592062b` ("ice: remove the VSI info from previous agg") Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel <jkl820.git@gmail.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-03-28 09:48:49 -07:00
Junfeng Guo	29486b6df3	ice: add profile conflict check for AVF FDIR Add profile conflict check while adding some FDIR rules to avoid unexpected flow behavior, rules may have conflict including: IPv4 <---> {IPv4_UDP, IPv4_TCP, IPv4_SCTP} IPv6 <---> {IPv6_UDP, IPv6_TCP, IPv6_SCTP} For example, when we create an FDIR rule for IPv4, this rule will work on packets including IPv4, IPv4_UDP, IPv4_TCP and IPv4_SCTP. But if we then create an FDIR rule for IPv4_UDP and then destroy it, the first FDIR rule for IPv4 cannot work on pkt IPv4_UDP then. To prevent this unexpected behavior, we add restriction in software when creating FDIR rules by adding necessary profile conflict check. Fixes: `1f7ea1cd6a` ("ice: Enable FDIR Configure for AVF") Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-03-28 09:48:41 -07:00
Brett Creeley	d94dbdc4e0	ice: Fix ice_cfg_rdma_fltr() to only update relevant fields The current implementation causes ice_vsi_update() to update all VSI fields based on the cached VSI context. This also assumes that the ICE_AQ_VSI_PROP_Q_OPT_VALID bit is set. This can cause problems if the VSI context is not correctly synced by the driver. Fix this by only updating the fields that correspond to ICE_AQ_VSI_PROP_Q_OPT_VALID. Also, make sure to save the updated result in the cached VSI context on success. Fixes: `348048e724` ("ice: Implement iidc operations") Co-developed-by: Robert Malz <robertx.malz@intel.com> Signed-off-by: Robert Malz <robertx.malz@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-03-28 09:43:12 -07:00
Jesse Brandeburg	66ceaa4c45	ice: fix W=1 headers mismatch make modules W=1 returns: .../ice/ice_txrx_lib.c:448: warning: Function parameter or member 'first_idx' not described in 'ice_finalize_xdp_rx' .../ice/ice_txrx.c:948: warning: Function parameter or member 'ntc' not described in 'ice_get_rx_buf' .../ice/ice_txrx.c:1038: warning: Excess function parameter 'rx_buf' description in 'ice_construct_skb' Fix these warnings by adding and deleting the deviant arguments. Fixes: `2fba7dc515` ("ice: Add support for XDP multi-buffer on Rx side") Fixes: `d7956d81f1` ("ice: Pull out next_to_clean bump out of ice_put_rx_buf()") CC: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-03-28 09:42:05 -07:00
Marek Szyprowski	f91bf3272a	iommu/exynos: Fix set_platform_dma_ops() callback There are some subtle differences between release_device() and set_platform_dma_ops() callbacks, so separate those two callbacks. Device links should be removed only in release_device(), because they were created in probe_device() on purpose and they are needed for proper Exynos IOMMU driver operation. While fixing this, remove the conditional code as it is not really needed. Reported-by: Jason Gunthorpe <jgg@ziepe.ca> Fixes: `189d496b48` ("iommu/exynos: Add missing set_platform_dma_ops callback") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org> Link: https://lore.kernel.org/r/20230315232514.1046589-1-m.szyprowski@samsung.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-03-28 15:33:50 +02:00
Kornel Dulęba	b26cd9325b	pinctrl: amd: Disable and mask interrupts on resume This fixes a similar problem to the one observed in: commit `4e5a04be88` ("pinctrl: amd: disable and mask interrupts on probe"). On some systems, during suspend/resume cycle firmware leaves an interrupt enabled on a pin that is not used by the kernel. This confuses the AMD pinctrl driver and causes spurious interrupts. The driver already has logic to detect if a pin is used by the kernel. Leverage it to re-initialize interrupt fields of a pin only if it's not used by us. Cc: stable@vger.kernel.org Fixes: `dbad75dd1f` ("pinctrl: add AMD GPIO driver support.") Signed-off-by: Kornel Dulęba <korneld@chromium.org> Link: https://lore.kernel.org/r/20230320093259.845178-1-korneld@chromium.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2023-03-28 15:20:18 +02:00
Jens Axboe	005308f7bd	io_uring/poll: clear single/double poll flags on poll arming Unless we have at least one entry queued, then don't call into io_poll_remove_entries(). Normally this isn't possible, but if we retry poll then we can have ->nr_entries cleared again as we're setting it up. If this happens for a poll retry, then we'll still have at least REQ_F_SINGLE_POLL set. io_poll_remove_entries() then thinks it has entries to remove. Clear REQ_F_SINGLE_POLL and REQ_F_DOUBLE_POLL unconditionally when arming a poll request. Fixes: `c16bda3759` ("io_uring/poll: allow some retries for poll triggering spuriously") Cc: stable@vger.kernel.org Reported-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-03-28 07:09:01 -06:00
Paolo Abeni	917fd7d6cd	Merge branch 'xen-netback-fix-issue-introduced-recently' Juergen Gross says: ==================== xen/netback: fix issue introduced recently The fix for XSA-423 introduced a bug which resulted in loss of network connection in some configurations. The first patch is fixing the issue, while the second one is removing a test which isn't needed. ==================== Link: https://lore.kernel.org/r/20230327083646.18690-1-jgross@suse.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 14:16:42 +02:00
Juergen Gross	8fb8ebf948	xen/netback: remove not needed test in xenvif_tx_build_gops() The tests for the number of grant mapping or copy operations reaching the array size of the operations buffer at the end of the main loop in xenvif_tx_build_gops() isn't needed. The loop can handle at maximum MAX_PENDING_REQS transfer requests, as XEN_RING_NR_UNCONSUMED_REQUESTS() is taking unsent responses into consideration, too. Remove the tests. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 14:16:40 +02:00
Juergen Gross	05310f31ca	xen/netback: don't do grant copy across page boundary Fix xenvif_get_requests() not to do grant copy operations across local page boundaries. This requires to double the maximum number of copy operations per queue, as each copy could now be split into 2. Make sure that struct xenvif_tx_cb doesn't grow too large. Cc: stable@vger.kernel.org Fixes: `ad7f402ae4` ("xen/netback: Ensure protocol headers don't fall in the non-linear area") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 14:16:40 +02:00
Wolfram Sang	f22c993f31	smsc911x: avoid PHY being resumed when interface is not up SMSC911x doesn't need mdiobus suspend/resume, that's why it sets 'mac_managed_pm'. However, setting it needs to be moved from init to probe, so mdiobus PM functions will really never be called (e.g. when the interface is not up yet during suspend/resume). Fixes: `3ce9f2bef7` ("net: smsc911x: Stop and start PHY during suspend and resume") Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230327083138.6044-1-wsa+renesas@sang-engineering.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 13:39:47 +02:00
Jens Axboe	fd72761894	powerpc: Don't try to copy PPR for task with NULL pt_regs powerpc sets up PF_KTHREAD and PF_IO_WORKER with a NULL pt_regs, which from my (arguably very short) checking is not commonly done for other archs. This is fine, except when PF_IO_WORKER's have been created and the task does something that causes a coredump to be generated. Then we get this crash: Kernel attempted to read user page (160) - exploit attempt? (uid: 1000) BUG: Kernel NULL pointer dereference on read at 0x00000160 Faulting instruction address: 0xc0000000000c3a60 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=32 NUMA pSeries Modules linked in: bochs drm_vram_helper drm_kms_helper xts binfmt_misc ecb ctr syscopyarea sysfillrect cbc sysimgblt drm_ttm_helper aes_generic ttm sg libaes evdev joydev virtio_balloon vmx_crypto gf128mul drm dm_mod fuse loop configfs drm_panel_orientation_quirks ip_tables x_tables autofs4 hid_generic usbhid hid xhci_pci xhci_hcd usbcore usb_common sd_mod CPU: 1 PID: 1982 Comm: ppc-crash Not tainted 6.3.0-rc2+ #88 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c0000000000c3a60 LR: c000000000039944 CTR: c0000000000398e0 REGS: c0000000041833b0 TRAP: 0300 Not tainted (6.3.0-rc2+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88082828 XER: 200400f8 ... NIP memcpy_power7+0x200/0x7d0 LR ppr_get+0x64/0xb0 Call Trace: ppr_get+0x40/0xb0 (unreliable) __regset_get+0x180/0x1f0 regset_get_alloc+0x64/0x90 elf_core_dump+0xb98/0x1b60 do_coredump+0x1c34/0x24a0 get_signal+0x71c/0x1410 do_notify_resume+0x140/0x6f0 interrupt_exit_user_prepare_main+0x29c/0x320 interrupt_exit_user_prepare+0x6c/0xa0 interrupt_return_srr_user+0x8/0x138 Because ppr_get() is trying to copy from a PF_IO_WORKER with a NULL pt_regs. Check for a valid pt_regs in both ppc_get/ppr_set, and return an error if not set. The actual error value doesn't seem to be important here, so just pick -EINVAL. Fixes: `fa439810cc` ("powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Jens Axboe <axboe@kernel.dk> [mpe: Trim oops in change log, add Fixes & Cc stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/d9f63344-fe7c-56ae-b420-4a1a04a2ae4c@kernel.dk	2023-03-28 22:11:41 +11:00
Benjamin Gray	1abce0580b	powerpc/64s: Fix __pte_needs_flush() false positive warning Userspace PROT_NONE ptes set _PAGE_PRIVILEGED, triggering a false positive debug assertion that __pte_flags_need_flush() is not called on a kernel mapping. Detect when it is a userspace PROT_NONE page by checking the required bits of PAGE_NONE are set, and none of the RWX bits are set. pte_protnone() is insufficient here because it always returns 0 when CONFIG_NUMA_BALANCING=n. Fixes: `b11931e9ad` ("powerpc/64s: add pte_needs_flush and huge_pmd_needs_flush") Cc: stable@vger.kernel.org # v6.1+ Reported-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230302225947.81083-1-bgray@linux.ibm.com	2023-03-28 21:37:23 +11:00
Paolo Abeni	b4c66d755e	Merge branch 'net-mvpp2-rss-fixes' Sven Auhagen says: ==================== net: mvpp2: rss fixes This patch series fixes up some rss problems in the mvpp2 driver. The classifier is missing some fragmentation flags, the parser has the QinQ headers switched and the PPPoE Layer 4 detecion is not working correctly. This is leading to no or bad rss for the default settings. ==================== Link: https://lore.kernel.org/r/20230325163903.ofefgus43x66as7i@Svens-MacBookPro.local Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 11:34:12 +02:00
Sven Auhagen	031a416c21	net: mvpp2: parser fix PPPoE In PPPoE add all IPv4 header option length to the parser and adjust the L3 and L4 offset accordingly. Currently the L4 match does not work with PPPoE and all packets are matched as L3 IP4 OPT. Fixes: `3f518509de` ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 11:34:01 +02:00
Sven Auhagen	a587a84813	net: mvpp2: parser fix QinQ The mvpp2 parser entry for QinQ has the inner and outer VLAN in the wrong order. Fix the problem by swapping them. Fixes: `3f518509de` ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 11:34:01 +02:00
Sven Auhagen	9a251cae51	net: mvpp2: classifier flow fix fragmentation flags Add missing IP Fragmentation Flag. Fixes: `f9358e12a0` ("net: mvpp2: split ingress traffic into multiple flows") Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-03-28 11:34:01 +02:00
Jakub Kicinski	ebd3b82634	Merge tag 'linux-can-fixes-for-6.3-20230327' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2023-03-27 Oleksij Rempel and Hillf Danton contribute a patch for the CAN J1939 protocol that prevents a potential deadlock in j1939_sk_errqueue(). Ivan Orlov fixes an uninit-value in the CAN BCM protocol in the bcm_tx_setup() function. * tag 'linux-can-fixes-for-6.3-20230327' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: bcm: bcm_tx_setup(): fix KMSAN uninit-value in vfs_write can: j1939: prevent deadlock by moving j1939_sk_errqueue() ==================== Link: https://lore.kernel.org/r/20230327124807.1157134-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-27 19:47:43 -07:00
Lukas Bulwahn	4f7702ab6f	MAINTAINERS: remove the linux-nfc@lists.01.org list Some MAINTAINERS sections mention to mail patches to the list linux-nfc@lists.01.org. Probably due to changes on Intel's 01.org website and servers, the list server lists.01.org/ml01.01.org is simply gone. Considering emails recorded on lore.kernel.org, only a handful of emails where sent to the linux-nfc@lists.01.org list, and they are usually also sent to the netdev mailing list as well, where they are then picked up. So, there is no big benefit in restoring the linux-nfc elsewhere. Remove all occurrences of the linux-nfc@lists.01.org list in MAINTAINERS. Suggested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/all/CAKXUXMzggxQ43DUZZRkPMGdo5WkzgA=i14ySJUFw4kZfE5ZaZA@mail.gmail.com/ Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230324081613.32000-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-27 19:32:48 -07:00
Sean Anderson	ac9bba3ff1	net: fman: Add myself as a reviewer I've read through or reworked a good portion of this driver. Add myself as a reviewer. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://lore.kernel.org/r/20230323145957.2999211-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-27 19:30:27 -07:00
Juraj Pecigos	1231363aec	nvme-pci: mark Lexar NM760 as IGNORE_DEV_SUBNQN A system with more than one of these SSDs will only have one usable. The kernel fails to detect more than one nvme device due to duplicate cntlids. before: [ 9.395229] nvme 0000:01:00.0: platform quirk: setting simple suspend [ 9.395262] nvme nvme0: pci function 0000:01:00.0 [ 9.395282] nvme 0000:03:00.0: platform quirk: setting simple suspend [ 9.395305] nvme nvme1: pci function 0000:03:00.0 [ 9.409873] nvme nvme0: Duplicate cntlid 1 with nvme1, subsys nqn.2022-07.com.siliconmotion:nvm-subsystem-sn- , rejecting [ 9.409982] nvme nvme0: Removing after probe failure status: -22 [ 9.427487] nvme nvme1: allocated 64 MiB host memory buffer. [ 9.445088] nvme nvme1: 16/0/0 default/read/poll queues [ 9.449898] nvme nvme1: Ignoring bogus Namespace Identifiers after: [ 1.161890] nvme 0000:01:00.0: platform quirk: setting simple suspend [ 1.162660] nvme nvme0: pci function 0000:01:00.0 [ 1.162684] nvme 0000:03:00.0: platform quirk: setting simple suspend [ 1.162707] nvme nvme1: pci function 0000:03:00.0 [ 1.191354] nvme nvme0: allocated 64 MiB host memory buffer. [ 1.193378] nvme nvme1: allocated 64 MiB host memory buffer. [ 1.211044] nvme nvme1: 16/0/0 default/read/poll queues [ 1.211080] nvme nvme0: 16/0/0 default/read/poll queues [ 1.216145] nvme nvme0: Ignoring bogus Namespace Identifiers [ 1.216261] nvme nvme1: Ignoring bogus Namespace Identifiers Adding the NVME_QUIRK_IGNORE_DEV_SUBNQN quirk to resolves the issue. Signed-off-by: Juraj Pecigos <kernel@juraj.dev> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2023-03-28 10:09:11 +09:00
Filipe Manana	2d82a40aa7	btrfs: fix deadlock when aborting transaction during relocation with scrub Before relocating a block group we pause scrub, then do the relocation and then unpause scrub. The relocation process requires starting and committing a transaction, and if we have a failure in the critical section of the transaction commit path (transaction state >= TRANS_STATE_COMMIT_START), we will deadlock if there is a paused scrub. That results in stack traces like the following: [42.479] BTRFS info (device sdc): relocating block group 53876686848 flags metadata\|raid6 [42.936] BTRFS warning (device sdc): Skipping commit of aborted transaction. [42.936] ------------[ cut here ]------------ [42.936] BTRFS: Transaction aborted (error -28) [42.936] WARNING: CPU: 11 PID: 346822 at fs/btrfs/transaction.c:1977 btrfs_commit_transaction+0xcc8/0xeb0 [btrfs] [42.936] Modules linked in: dm_flakey dm_mod loop btrfs (...) [42.936] CPU: 11 PID: 346822 Comm: btrfs Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [42.936] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [42.936] RIP: 0010:btrfs_commit_transaction+0xcc8/0xeb0 [btrfs] [42.936] Code: ff ff 45 8b (...) [42.936] RSP: 0018:ffffb58649633b48 EFLAGS: 00010282 [42.936] RAX: 0000000000000000 RBX: ffff8be6ef4d5bd8 RCX: 0000000000000000 [42.936] RDX: 0000000000000002 RSI: ffffffffb35e7782 RDI: 00000000ffffffff [42.936] RBP: ffff8be6ef4d5c98 R08: 0000000000000000 R09: ffffb586496339e8 [42.936] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8be6d38c7c00 [42.936] R13: 00000000ffffffe4 R14: ffff8be6c268c000 R15: ffff8be6ef4d5cf0 [42.936] FS: 00007f381a82b340(0000) GS:ffff8beddfcc0000(0000) knlGS:0000000000000000 [42.936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [42.936] CR2: 00007f1e35fb7638 CR3: 0000000117680006 CR4: 0000000000370ee0 [42.936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [42.936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [42.936] Call Trace: [42.936] <TASK> [42.936] ? start_transaction+0xcb/0x610 [btrfs] [42.936] prepare_to_relocate+0x111/0x1a0 [btrfs] [42.936] relocate_block_group+0x57/0x5d0 [btrfs] [42.936] ? btrfs_wait_nocow_writers+0x25/0xb0 [btrfs] [42.936] btrfs_relocate_block_group+0x248/0x3c0 [btrfs] [42.936] ? __pfx_autoremove_wake_function+0x10/0x10 [42.936] btrfs_relocate_chunk+0x3b/0x150 [btrfs] [42.936] btrfs_balance+0x8ff/0x11d0 [btrfs] [42.936] ? __kmem_cache_alloc_node+0x14a/0x410 [42.936] btrfs_ioctl+0x2334/0x32c0 [btrfs] [42.937] ? mod_objcg_state+0xd2/0x360 [42.937] ? refill_obj_stock+0xb0/0x160 [42.937] ? seq_release+0x25/0x30 [42.937] ? __rseq_handle_notify_resume+0x3b5/0x4b0 [42.937] ? percpu_counter_add_batch+0x2e/0xa0 [42.937] ? __x64_sys_ioctl+0x88/0xc0 [42.937] __x64_sys_ioctl+0x88/0xc0 [42.937] do_syscall_64+0x38/0x90 [42.937] entry_SYSCALL_64_after_hwframe+0x72/0xdc [42.937] RIP: 0033:0x7f381a6ffe9b [42.937] Code: 00 48 89 44 24 (...) [42.937] RSP: 002b:00007ffd45ecf060 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [42.937] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f381a6ffe9b [42.937] RDX: 00007ffd45ecf150 RSI: 00000000c4009420 RDI: 0000000000000003 [42.937] RBP: 0000000000000003 R08: 0000000000000013 R09: 0000000000000000 [42.937] R10: 00007f381a60c878 R11: 0000000000000246 R12: 00007ffd45ed0423 [42.937] R13: 00007ffd45ecf150 R14: 0000000000000000 R15: 00007ffd45ecf148 [42.937] </TASK> [42.937] ---[ end trace 0000000000000000 ]--- [42.937] BTRFS: error (device sdc: state A) in cleanup_transaction:1977: errno=-28 No space left [59.196] INFO: task btrfs:346772 blocked for more than 120 seconds. [59.196] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.196] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.196] task:btrfs state:D stack:0 pid:346772 ppid:1 flags:0x00004002 [59.196] Call Trace: [59.196] <TASK> [59.196] __schedule+0x392/0xa70 [59.196] ? __pv_queued_spin_lock_slowpath+0x165/0x370 [59.196] schedule+0x5d/0xd0 [59.196] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.197] ? __pfx_autoremove_wake_function+0x10/0x10 [59.197] scrub_pause_off+0x21/0x50 [btrfs] [59.197] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.197] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.198] ? __pfx_autoremove_wake_function+0x10/0x10 [59.198] scrub_stripe+0x20d/0x740 [btrfs] [59.198] scrub_chunk+0xc4/0x130 [btrfs] [59.198] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.198] ? __pfx_autoremove_wake_function+0x10/0x10 [59.198] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.199] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.199] ? _copy_from_user+0x7b/0x80 [59.199] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.199] ? refill_stock+0x33/0x50 [59.199] ? should_failslab+0xa/0x20 [59.199] ? kmem_cache_alloc_node+0x151/0x460 [59.199] ? alloc_io_context+0x1b/0x80 [59.199] ? preempt_count_add+0x70/0xa0 [59.199] ? __x64_sys_ioctl+0x88/0xc0 [59.199] __x64_sys_ioctl+0x88/0xc0 [59.199] do_syscall_64+0x38/0x90 [59.199] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.199] RIP: 0033:0x7f82ffaffe9b [59.199] RSP: 002b:00007f82ff9fcc50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.199] RAX: ffffffffffffffda RBX: 000055b191e36310 RCX: 00007f82ffaffe9b [59.199] RDX: 000055b191e36310 RSI: 00000000c400941b RDI: 0000000000000003 [59.199] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.199] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82ff9fd640 [59.199] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.199] </TASK> [59.199] INFO: task btrfs:346773 blocked for more than 120 seconds. [59.200] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.201] task:btrfs state:D stack:0 pid:346773 ppid:1 flags:0x00004002 [59.201] Call Trace: [59.201] <TASK> [59.201] __schedule+0x392/0xa70 [59.201] ? __pv_queued_spin_lock_slowpath+0x165/0x370 [59.201] schedule+0x5d/0xd0 [59.201] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.201] ? __pfx_autoremove_wake_function+0x10/0x10 [59.201] scrub_pause_off+0x21/0x50 [btrfs] [59.202] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.202] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.202] ? __pfx_autoremove_wake_function+0x10/0x10 [59.202] scrub_stripe+0x20d/0x740 [btrfs] [59.202] scrub_chunk+0xc4/0x130 [btrfs] [59.203] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.203] ? __pfx_autoremove_wake_function+0x10/0x10 [59.203] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.203] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.203] ? _copy_from_user+0x7b/0x80 [59.203] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.204] ? should_failslab+0xa/0x20 [59.204] ? kmem_cache_alloc_node+0x151/0x460 [59.204] ? alloc_io_context+0x1b/0x80 [59.204] ? preempt_count_add+0x70/0xa0 [59.204] ? __x64_sys_ioctl+0x88/0xc0 [59.204] __x64_sys_ioctl+0x88/0xc0 [59.204] do_syscall_64+0x38/0x90 [59.204] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.204] RIP: 0033:0x7f82ffaffe9b [59.204] RSP: 002b:00007f82ff1fbc50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.204] RAX: ffffffffffffffda RBX: 000055b191e36790 RCX: 00007f82ffaffe9b [59.204] RDX: 000055b191e36790 RSI: 00000000c400941b RDI: 0000000000000003 [59.204] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.204] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82ff1fc640 [59.204] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.204] </TASK> [59.204] INFO: task btrfs:346774 blocked for more than 120 seconds. [59.205] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.205] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.206] task:btrfs state:D stack:0 pid:346774 ppid:1 flags:0x00004002 [59.206] Call Trace: [59.206] <TASK> [59.206] __schedule+0x392/0xa70 [59.206] schedule+0x5d/0xd0 [59.206] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.206] ? __pfx_autoremove_wake_function+0x10/0x10 [59.206] scrub_pause_off+0x21/0x50 [btrfs] [59.207] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.207] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.207] ? __pfx_autoremove_wake_function+0x10/0x10 [59.207] scrub_stripe+0x20d/0x740 [btrfs] [59.208] scrub_chunk+0xc4/0x130 [btrfs] [59.208] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.208] ? __mutex_unlock_slowpath.isra.0+0x9a/0x120 [59.208] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.208] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.209] ? _copy_from_user+0x7b/0x80 [59.209] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.209] ? should_failslab+0xa/0x20 [59.209] ? kmem_cache_alloc_node+0x151/0x460 [59.209] ? alloc_io_context+0x1b/0x80 [59.209] ? preempt_count_add+0x70/0xa0 [59.209] ? __x64_sys_ioctl+0x88/0xc0 [59.209] __x64_sys_ioctl+0x88/0xc0 [59.209] do_syscall_64+0x38/0x90 [59.209] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.209] RIP: 0033:0x7f82ffaffe9b [59.209] RSP: 002b:00007f82fe9fac50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.209] RAX: ffffffffffffffda RBX: 000055b191e36c10 RCX: 00007f82ffaffe9b [59.209] RDX: 000055b191e36c10 RSI: 00000000c400941b RDI: 0000000000000003 [59.209] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.209] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82fe9fb640 [59.209] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.209] </TASK> [59.209] INFO: task btrfs:346775 blocked for more than 120 seconds. [59.210] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.210] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.211] task:btrfs state:D stack:0 pid:346775 ppid:1 flags:0x00004002 [59.211] Call Trace: [59.211] <TASK> [59.211] __schedule+0x392/0xa70 [59.211] schedule+0x5d/0xd0 [59.211] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.211] ? __pfx_autoremove_wake_function+0x10/0x10 [59.211] scrub_pause_off+0x21/0x50 [btrfs] [59.212] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.212] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.212] ? __pfx_autoremove_wake_function+0x10/0x10 [59.212] scrub_stripe+0x20d/0x740 [btrfs] [59.213] scrub_chunk+0xc4/0x130 [btrfs] [59.213] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.213] ? __mutex_unlock_slowpath.isra.0+0x9a/0x120 [59.213] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.213] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.214] ? _copy_from_user+0x7b/0x80 [59.214] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.214] ? should_failslab+0xa/0x20 [59.214] ? kmem_cache_alloc_node+0x151/0x460 [59.214] ? alloc_io_context+0x1b/0x80 [59.214] ? preempt_count_add+0x70/0xa0 [59.214] ? __x64_sys_ioctl+0x88/0xc0 [59.214] __x64_sys_ioctl+0x88/0xc0 [59.214] do_syscall_64+0x38/0x90 [59.214] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.214] RIP: 0033:0x7f82ffaffe9b [59.214] RSP: 002b:00007f82fe1f9c50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.214] RAX: ffffffffffffffda RBX: 000055b191e37090 RCX: 00007f82ffaffe9b [59.214] RDX: 000055b191e37090 RSI: 00000000c400941b RDI: 0000000000000003 [59.214] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.214] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82fe1fa640 [59.214] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.214] </TASK> [59.214] INFO: task btrfs:346776 blocked for more than 120 seconds. [59.215] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.216] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.217] task:btrfs state:D stack:0 pid:346776 ppid:1 flags:0x00004002 [59.217] Call Trace: [59.217] <TASK> [59.217] __schedule+0x392/0xa70 [59.217] ? __pv_queued_spin_lock_slowpath+0x165/0x370 [59.217] schedule+0x5d/0xd0 [59.217] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.217] ? __pfx_autoremove_wake_function+0x10/0x10 [59.217] scrub_pause_off+0x21/0x50 [btrfs] [59.217] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.217] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.218] ? __pfx_autoremove_wake_function+0x10/0x10 [59.218] scrub_stripe+0x20d/0x740 [btrfs] [59.218] scrub_chunk+0xc4/0x130 [btrfs] [59.218] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.219] ? __pfx_autoremove_wake_function+0x10/0x10 [59.219] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.219] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.219] ? _copy_from_user+0x7b/0x80 [59.219] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.219] ? should_failslab+0xa/0x20 [59.219] ? kmem_cache_alloc_node+0x151/0x460 [59.219] ? alloc_io_context+0x1b/0x80 [59.219] ? preempt_count_add+0x70/0xa0 [59.219] ? __x64_sys_ioctl+0x88/0xc0 [59.219] __x64_sys_ioctl+0x88/0xc0 [59.219] do_syscall_64+0x38/0x90 [59.219] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.219] RIP: 0033:0x7f82ffaffe9b [59.219] RSP: 002b:00007f82fd9f8c50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.219] RAX: ffffffffffffffda RBX: 000055b191e37510 RCX: 00007f82ffaffe9b [59.219] RDX: 000055b191e37510 RSI: 00000000c400941b RDI: 0000000000000003 [59.219] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.219] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82fd9f9640 [59.219] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.219] </TASK> [59.219] INFO: task btrfs:346822 blocked for more than 120 seconds. [59.220] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.221] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.222] task:btrfs state:D stack:0 pid:346822 ppid:1 flags:0x00004002 [59.222] Call Trace: [59.222] <TASK> [59.222] __schedule+0x392/0xa70 [59.222] schedule+0x5d/0xd0 [59.222] btrfs_scrub_cancel+0x91/0x100 [btrfs] [59.222] ? __pfx_autoremove_wake_function+0x10/0x10 [59.222] btrfs_commit_transaction+0x572/0xeb0 [btrfs] [59.223] ? start_transaction+0xcb/0x610 [btrfs] [59.223] prepare_to_relocate+0x111/0x1a0 [btrfs] [59.223] relocate_block_group+0x57/0x5d0 [btrfs] [59.223] ? btrfs_wait_nocow_writers+0x25/0xb0 [btrfs] [59.223] btrfs_relocate_block_group+0x248/0x3c0 [btrfs] [59.224] ? __pfx_autoremove_wake_function+0x10/0x10 [59.224] btrfs_relocate_chunk+0x3b/0x150 [btrfs] [59.224] btrfs_balance+0x8ff/0x11d0 [btrfs] [59.224] ? __kmem_cache_alloc_node+0x14a/0x410 [59.224] btrfs_ioctl+0x2334/0x32c0 [btrfs] [59.225] ? mod_objcg_state+0xd2/0x360 [59.225] ? refill_obj_stock+0xb0/0x160 [59.225] ? seq_release+0x25/0x30 [59.225] ? __rseq_handle_notify_resume+0x3b5/0x4b0 [59.225] ? percpu_counter_add_batch+0x2e/0xa0 [59.225] ? __x64_sys_ioctl+0x88/0xc0 [59.225] __x64_sys_ioctl+0x88/0xc0 [59.225] do_syscall_64+0x38/0x90 [59.225] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.225] RIP: 0033:0x7f381a6ffe9b [59.225] RSP: 002b:00007ffd45ecf060 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.225] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f381a6ffe9b [59.225] RDX: 00007ffd45ecf150 RSI: 00000000c4009420 RDI: 0000000000000003 [59.225] RBP: 0000000000000003 R08: 0000000000000013 R09: 0000000000000000 [59.225] R10: 00007f381a60c878 R11: 0000000000000246 R12: 00007ffd45ed0423 [59.225] R13: 00007ffd45ecf150 R14: 0000000000000000 R15: 00007ffd45ecf148 [59.225] </TASK> What happens is the following: 1) A scrub is running, so fs_info->scrubs_running is 1; 2) Task A starts block group relocation, and at btrfs_relocate_chunk() it pauses scrub by calling btrfs_scrub_pause(). That increments fs_info->scrub_pause_req from 0 to 1 and waits for the scrub task to pause (for fs_info->scrubs_paused to be == to fs_info->scrubs_running); 3) The scrub task pauses at scrub_pause_off(), waiting for fs_info->scrub_pause_req to decrease to 0; 4) Task A then enters btrfs_relocate_block_group(), and down that call chain we start a transaction and then attempt to commit it; 5) When task A calls btrfs_commit_transaction(), it either will do the commit itself or wait for some other task that already started the commit of the transaction - it doesn't matter which case; 6) The transaction commit enters state TRANS_STATE_COMMIT_START; 7) An error happens during the transaction commit, like -ENOSPC when running delayed refs or delayed items for example; 8) This results in calling transaction.c:cleanup_transaction(), where we call btrfs_scrub_cancel(), incrementing fs_info->scrub_cancel_req from 0 to 1, and blocking this task waiting for fs_info->scrubs_running to decrease to 0; 9) From this point on, both the transaction commit and the scrub task hang forever: 1) The transaction commit is waiting for fs_info->scrubs_running to be decreased to 0; 2) The scrub task is at scrub_pause_off() waiting for fs_info->scrub_pause_req to decrease to 0 - so it can not proceed to stop the scrub and decrement fs_info->scrubs_running from 0 to 1. Therefore resulting in a deadlock. Fix this by having cleanup_transaction(), called if a transaction commit fails, not call btrfs_scrub_cancel() if relocation is in progress, and having btrfs_relocate_block_group() call btrfs_scrub_cancel() instead if the relocation failed and a transaction abort happened. This was triggered with btrfs/061 from fstests. Fixes: `55e3a601c8` ("btrfs: Fix data checksum error cause by replace with io-load.") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-28 00:47:00 +02:00
Anand Jain	50d281fc43	btrfs: scan device in non-exclusive mode This fixes mkfs/mount/check failures due to race with systemd-udevd scan. During the device scan initiated by systemd-udevd, other user space EXCL operations such as mkfs, mount, or check may get blocked and result in a "Device or resource busy" error. This is because the device scan process opens the device with the EXCL flag in the kernel. Two reports were received: - btrfs/179 test case, where the fsck command failed with the -EBUSY error - LTP pwritev03 test case, where mkfs.vfs failed with the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem on the device. In both cases, fsck and mkfs (respectively) were racing with a systemd-udevd device scan, and systemd-udevd won, resulting in the -EBUSY error for fsck and mkfs. Reproducing the problem has been difficult because there is a very small window during which these userspace threads can race to acquire the exclusive device open. Even on the system where the problem was observed, the problem occurrences were anywhere between 10 to 400 iterations and chances of reproducing decreases with debug printk()s. However, an exclusive device open is unnecessary for the scan process, as there are no write operations on the device during scan. Furthermore, during the mount process, the superblock is re-read in the below function call chain: btrfs_mount_root btrfs_open_devices open_fs_devices btrfs_open_one_device btrfs_get_bdev_and_sb So, to fix this issue, removes the FMODE_EXCL flag from the scan operation, and add a comment. The case where mkfs may still write to the device and a scan is running, the btrfs signature is not written at that time so scan will not recognize such device. Reported-by: Sherry Yang <sherry.yang@oracle.com> Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-28 00:46:56 +02:00
Filipe Manana	2f1a6be12a	btrfs: fix race between quota disable and quota assign ioctls The quota assign ioctl can currently run in parallel with a quota disable ioctl call. The assign ioctl uses the quota root, while the disable ioctl frees that root, and therefore we can have a use-after-free triggered in the assign ioctl, leading to a trace like the following when KASAN is enabled: [672.723][T736] BUG: KASAN: slab-use-after-free in btrfs_search_slot+0x2962/0x2db0 [672.723][T736] Read of size 8 at addr ffff888022ec0208 by task btrfs_search_sl/27736 [672.724][T736] [672.725][T736] CPU: 1 PID: 27736 Comm: btrfs_search_sl Not tainted 6.3.0-rc3 #37 [672.723][T736] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [672.727][T736] Call Trace: [672.728][T736] <TASK> [672.728][T736] dump_stack_lvl+0xd9/0x150 [672.725][T736] print_report+0xc1/0x5e0 [672.720][T736] ? __virt_addr_valid+0x61/0x2e0 [672.727][T736] ? __phys_addr+0xc9/0x150 [672.725][T736] ? btrfs_search_slot+0x2962/0x2db0 [672.722][T736] kasan_report+0xc0/0xf0 [672.729][T736] ? btrfs_search_slot+0x2962/0x2db0 [672.724][T736] btrfs_search_slot+0x2962/0x2db0 [672.723][T736] ? fs_reclaim_acquire+0xba/0x160 [672.722][T736] ? split_leaf+0x13d0/0x13d0 [672.726][T736] ? rcu_is_watching+0x12/0xb0 [672.723][T736] ? kmem_cache_alloc+0x338/0x3c0 [672.722][T736] update_qgroup_status_item+0xf7/0x320 [672.724][T736] ? add_qgroup_rb+0x3d0/0x3d0 [672.739][T736] ? do_raw_spin_lock+0x12d/0x2b0 [672.730][T736] ? spin_bug+0x1d0/0x1d0 [672.737][T736] btrfs_run_qgroups+0x5de/0x840 [672.730][T736] ? btrfs_qgroup_rescan_worker+0xa70/0xa70 [672.738][T736] ? __del_qgroup_relation+0x4ba/0xe00 [672.738][T736] btrfs_ioctl+0x3d58/0x5d80 [672.735][T736] ? tomoyo_path_number_perm+0x16a/0x550 [672.737][T736] ? tomoyo_execute_permission+0x4a0/0x4a0 [672.731][T736] ? btrfs_ioctl_get_supported_features+0x50/0x50 [672.737][T736] ? __sanitizer_cov_trace_switch+0x54/0x90 [672.734][T736] ? do_vfs_ioctl+0x132/0x1660 [672.730][T736] ? vfs_fileattr_set+0xc40/0xc40 [672.730][T736] ? _raw_spin_unlock_irq+0x2e/0x50 [672.732][T736] ? sigprocmask+0xf2/0x340 [672.737][T736] ? __fget_files+0x26a/0x480 [672.732][T736] ? bpf_lsm_file_ioctl+0x9/0x10 [672.738][T736] ? btrfs_ioctl_get_supported_features+0x50/0x50 [672.736][T736] __x64_sys_ioctl+0x198/0x210 [672.736][T736] do_syscall_64+0x39/0xb0 [672.731][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.739][T736] RIP: 0033:0x4556ad [672.742][T736] </TASK> [672.743][T736] [672.748][T736] Allocated by task 27677: [672.743][T736] kasan_save_stack+0x22/0x40 [672.741][T736] kasan_set_track+0x25/0x30 [672.741][T736] __kasan_kmalloc+0xa4/0xb0 [672.749][T736] btrfs_alloc_root+0x48/0x90 [672.746][T736] btrfs_create_tree+0x146/0xa20 [672.744][T736] btrfs_quota_enable+0x461/0x1d20 [672.743][T736] btrfs_ioctl+0x4a1c/0x5d80 [672.747][T736] __x64_sys_ioctl+0x198/0x210 [672.749][T736] do_syscall_64+0x39/0xb0 [672.744][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.756][T736] [672.757][T736] Freed by task 27677: [672.759][T736] kasan_save_stack+0x22/0x40 [672.759][T736] kasan_set_track+0x25/0x30 [672.756][T736] kasan_save_free_info+0x2e/0x50 [672.751][T736] ____kasan_slab_free+0x162/0x1c0 [672.758][T736] slab_free_freelist_hook+0x89/0x1c0 [672.752][T736] __kmem_cache_free+0xaf/0x2e0 [672.752][T736] btrfs_put_root+0x1ff/0x2b0 [672.759][T736] btrfs_quota_disable+0x80a/0xbc0 [672.752][T736] btrfs_ioctl+0x3e5f/0x5d80 [672.756][T736] __x64_sys_ioctl+0x198/0x210 [672.753][T736] do_syscall_64+0x39/0xb0 [672.765][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.769][T736] [672.768][T736] The buggy address belongs to the object at ffff888022ec0000 [672.768][T736] which belongs to the cache kmalloc-4k of size 4096 [672.769][T736] The buggy address is located 520 bytes inside of [672.769][T736] freed 4096-byte region [ffff888022ec0000, ffff888022ec1000) [672.760][T736] [672.764][T736] The buggy address belongs to the physical page: [672.761][T736] page:ffffea00008bb000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x22ec0 [672.766][T736] head:ffffea00008bb000 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [672.779][T736] flags: 0xfff00000010200(slab\|head\|node=0\|zone=1\|lastcpupid=0x7ff) [672.770][T736] raw: 00fff00000010200 ffff888012842140 ffffea000054ba00 dead000000000002 [672.770][T736] raw: 0000000000000000 0000000000040004 00000001ffffffff 0000000000000000 [672.771][T736] page dumped because: kasan: bad access detected [672.778][T736] page_owner tracks the page as allocated [672.777][T736] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd2040(__GFP_IO\|__GFP_NOWARN\|__GFP_NORETRY\|__GFP_COMP\|__GFP_NOMEMALLOC), pid 88 [672.779][T736] get_page_from_freelist+0x119c/0x2d50 [672.779][T736] __alloc_pages+0x1cb/0x4a0 [672.776][T736] alloc_pages+0x1aa/0x270 [672.773][T736] allocate_slab+0x260/0x390 [672.771][T736] ___slab_alloc+0xa9a/0x13e0 [672.778][T736] __slab_alloc.constprop.0+0x56/0xb0 [672.771][T736] __kmem_cache_alloc_node+0x136/0x320 [672.789][T736] __kmalloc+0x4e/0x1a0 [672.783][T736] tomoyo_realpath_from_path+0xc3/0x600 [672.781][T736] tomoyo_path_perm+0x22f/0x420 [672.782][T736] tomoyo_path_unlink+0x92/0xd0 [672.780][T736] security_path_unlink+0xdb/0x150 [672.788][T736] do_unlinkat+0x377/0x680 [672.788][T736] __x64_sys_unlink+0xca/0x110 [672.789][T736] do_syscall_64+0x39/0xb0 [672.783][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.784][T736] page last free stack trace: [672.787][T736] free_pcp_prepare+0x4e5/0x920 [672.787][T736] free_unref_page+0x1d/0x4e0 [672.784][T736] __unfreeze_partials+0x17c/0x1a0 [672.797][T736] qlist_free_all+0x6a/0x180 [672.796][T736] kasan_quarantine_reduce+0x189/0x1d0 [672.797][T736] __kasan_slab_alloc+0x64/0x90 [672.793][T736] kmem_cache_alloc+0x17c/0x3c0 [672.799][T736] getname_flags.part.0+0x50/0x4e0 [672.799][T736] getname_flags+0x9e/0xe0 [672.792][T736] vfs_fstatat+0x77/0xb0 [672.791][T736] __do_sys_newlstat+0x84/0x100 [672.798][T736] do_syscall_64+0x39/0xb0 [672.796][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.790][T736] [672.791][T736] Memory state around the buggy address: [672.799][T736] ffff888022ec0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.805][T736] ffff888022ec0180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.802][T736] >ffff888022ec0200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.809][T736] ^ [672.809][T736] ffff888022ec0280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.809][T736] ffff888022ec0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fix this by having the qgroup assign ioctl take the qgroup ioctl mutex before calling btrfs_run_qgroups(), which is what all qgroup ioctls should call. Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAFcO6XN3VD8ogmHwqRk4kbiwtpUSNySu2VAxN8waEPciCHJvMA@mail.gmail.com/ CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-28 00:46:53 +02:00
Alyssa Ross	bb430b6942	loop: LOOP_CONFIGURE: send uevents for partitions LOOP_CONFIGURE is, as far as I understand it, supposed to be a way to combine LOOP_SET_FD and LOOP_SET_STATUS64 into a single syscall. When using LOOP_SET_FD+LOOP_SET_STATUS64, a single uevent would be sent for each partition found on the loop device after the second ioctl(), but when using LOOP_CONFIGURE, no such uevent was being sent. In the old setup, uevents are disabled for LOOP_SET_FD, but not for LOOP_SET_STATUS64. This makes sense, as it prevents uevents being sent for a partially configured device during LOOP_SET_FD - they're only sent at the end of LOOP_SET_STATUS64. But for LOOP_CONFIGURE, uevents were disabled for the entire operation, so that final notification was never issued. To fix this, reduce the critical section to exclude the loop_reread_partitions() call, which causes the uevents to be issued, to after uevents are re-enabled, matching the behaviour of the LOOP_SET_FD+LOOP_SET_STATUS64 combination. I noticed this because Busybox's losetup program recently changed from using LOOP_SET_FD+LOOP_SET_STATUS64 to LOOP_CONFIGURE, and this broke my setup, for which I want a notification from the kernel any time a new partition becomes available. Signed-off-by: Alyssa Ross <hi@alyssa.is> [hch: reduced the critical section] Signed-off-by: Christoph Hellwig <hch@lst.de> Fixes: `3448914e8c` ("loop: Add LOOP_CONFIGURE ioctl") Link: https://lore.kernel.org/r/20230320125430.55367-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-03-27 13:27:06 -06:00
Linus Torvalds	3a93e40326	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "RISC-V: - Fix VM hang in case of timer delta being zero ARM: - MMU fixes: - Read the MMU notifier seq before dropping the mmap lock to guard against reading a potentially stale VMA - Disable interrupts when walking user page tables to protect against the page table being freed - Read the MTE permissions for the VMA within the mmap lock critical section, avoiding the use of a potentally stale VMA pointer - vPMU fixes: - Return the sum of the current perf event value and PMC snapshot for reads from userspace - Don't save the value of guest writes to PMCR_EL0.{C,P}, which could otherwise lead to userspace erroneously resetting the vPMU during VM save/restore" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: riscv/kvm: Fix VM hang in case of timer delta being zero. KVM: arm64: Check for kvm_vma_mte_allowed in the critical section KVM: arm64: Disable interrupts while walking userspace PTs KVM: arm64: Retry fault if vma_lookup() results become invalid KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value	2023-03-27 12:22:45 -07:00
Rafael J. Wysocki	c56610a869	ACPI: bus: Rework system-level device notification handling For ACPI drivers that provide a ->notify() callback and set ACPI_DRIVER_ALL_NOTIFY_EVENTS in their flags, that callback can be invoked while either the ->add() or the ->remove() callback is running without any synchronization at the bus type level which is counter to the common-sense expectation that notification handling should only be enabled when the driver is actually bound to the device. As a result, if the driver is not careful enough, it's ->notify() callback may crash when it is invoked too early or too late [1]. This issue has been amplified by commit `d6fb6ee182` ("ACPI: bus: Drop driver member of struct acpi_device") that made acpi_bus_notify() check for the presence of the driver and its ->notify() callback directly instead of using an extra driver pointer that was only set and cleared by the bus type code, but it was present before that commit although it was harder to reproduce then. It can be addressed by using the observation that acpi_device_install_notify_handler() can be modified to install the handler for all types of events when ACPI_DRIVER_ALL_NOTIFY_EVENTS is set in the driver flags, in which case acpi_bus_notify() will not need to invoke the driver's ->notify() callback any more and that callback will only be invoked after acpi_device_install_notify_handler() has run and before acpi_device_remove_notify_handler() runs, which implies the correct ordering with respect to the other ACPI driver callbacks. Modify the code accordingly and while at it, drop two redundant local variables from acpi_bus_notify() and turn its description comment into a proper kerneldoc one. Fixes: `d6fb6ee182` ("ACPI: bus: Drop driver member of struct acpi_device") Link: https://lore.kernel.org/linux-acpi/9f6cba7a8a57e5a687c934e8e406e28c.squirrel@mail.panix.com # [1] Reported-by: Pierre Asselin <pa@panix.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Pierre Asselin <pa@panix.com>	2023-03-27 19:56:03 +02:00
Linus Torvalds	91fe204578	Merge tag 'platform-drivers-x86-v6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - Intel tpmi/vsec fixes - think-lmi fixes - two other small fixes / hw-id additions * tag 'platform-drivers-x86-v6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/surface: aggregator: Add missing fwnode_handle_put() platform/x86: think-lmi: Add possible_values for ThinkStation platform/x86: think-lmi: only display possible_values if available platform/x86: think-lmi: use correct possible_values delimiters platform/x86: think-lmi: add missing type attribute platform/x86 (gigabyte-wmi): Add support for A320M-S2H V2 platform/x86/intel: tpmi: Revise the comment of intel_vsec_add_aux platform/x86/intel: tpmi: Fix double free in tpmi_create_device() platform/x86/intel: vsec: Fix a memory leak in intel_vsec_add_aux	2023-03-27 09:45:39 -07:00
Linus Torvalds	fc5d1a9233	Merge tag 'mtd/fixes-for-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fixes from Miquel Raynal: "Raw NAND controller driver fixes: - meson: - Invalidate cache on polling ECC bit - Initialize struct with zeroes - nandsim: Artificially prevent sequential page reads ECC engine driver fixes: - mxic-ecc: Fix mxic_ecc_data_xfer_wait_for_completion() when irq is used Binging fixes: - jedec,spi-nor: Document CPOL/CPHA support" * tag 'mtd/fixes-for-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: meson: invalidate cache on polling ECC bit mtd: rawnand: nandsim: Artificially prevent sequential page reads dt-bindings: mtd: jedec,spi-nor: Document CPOL/CPHA support mtd: nand: mxic-ecc: Fix mxic_ecc_data_xfer_wait_for_completion() when irq is used mtd: rawnand: meson: initialize struct with zeroes	2023-03-27 09:41:17 -07:00
Heiko Carstens	f9bbf25e7b	s390/ptrace: fix PTRACE_GET_LAST_BREAK error handling Return -EFAULT if put_user() for the PTRACE_GET_LAST_BREAK request fails, instead of silently ignoring it. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2023-03-27 17:23:08 +02:00
Jiri Slaby (SUSE)	7bb2107e63	s390: reintroduce expoline dependence to scripts Expolines depend on scripts/basic/fixdep. And build of expolines can now race with the fixdep build: make[1]: * Deleting file 'arch/s390/lib/expoline/expoline.o' /bin/sh: line 1: scripts/basic/fixdep: Permission denied make[1]: * [../scripts/Makefile.build:385: arch/s390/lib/expoline/expoline.o] Error 126 make: *** [../arch/s390/Makefile:166: expoline_prepare] Error 2 The dependence was removed in the below Fixes: commit. So reintroduce the dependence on scripts. Fixes: `a0b0987a78` ("s390/nospec: remove unneeded header includes") Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: stable@vger.kernel.org Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20230316112809.7903-1-jirislaby@kernel.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2023-03-27 17:23:08 +02:00
Tony Krowiak	8f8cf76758	s390/vfio-ap: fix memory leak in vfio_ap device driver The device release callback function invoked to release the matrix device uses the dev_get_drvdata(device *dev) function to retrieve the pointer to the vfio_matrix_dev object in order to free its storage. The problem is, this object is not stored as drvdata with the device; since the kfree function will accept a NULL pointer, the memory for the vfio_matrix_dev object is never freed. Since the device being released is contained within the vfio_matrix_dev object, the container_of macro will be used to retrieve its pointer. Fixes: `1fde573413` ("s390: vfio-ap: base implementation of VFIO AP device driver") Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Link: https://lore.kernel.org/r/20230320150447.34557-1-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2023-03-27 17:23:08 +02:00
Heiko Carstens	89aba4c26f	s390/uaccess: add missing earlyclobber annotations to __clear_user() Add missing earlyclobber annotation to size, to, and tmp2 operands of the __clear_user() inline assembly since they are modified or written to before the last usage of all input operands. This can lead to incorrect register allocation for the inline assembly. Fixes: `6c2a9e6df6` ("[S390] Use alternative user-copy operations for new hardware.") Reported-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/ Cc: stable@vger.kernel.org Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2023-03-27 17:23:08 +02:00
Paolo Bonzini	9e347ba030	Merge tag 'kvm-riscv-fixes-6.3-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv fixes for 6.3, take #1 - Fix VM hang in case of timer delta being zero	2023-03-27 10:04:07 -04:00
Paolo Bonzini	8607daa214	Merge tag 'kvmarm-fixes-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.3, part #2 Fixes for a rather interesting set of bugs relating to the MMU: - Read the MMU notifier seq before dropping the mmap lock to guard against reading a potentially stale VMA - Disable interrupts when walking user page tables to protect against the page table being freed - Read the MTE permissions for the VMA within the mmap lock critical section, avoiding the use of a potentally stale VMA pointer Additionally, some fixes targeting the vPMU: - Return the sum of the current perf event value and PMC snapshot for reads from userspace - Don't save the value of guest writes to PMCR_EL0.{C,P}, which could otherwise lead to userspace erroneously resetting the vPMU during VM save/restore	2023-03-27 10:03:16 -04:00
Thomas Weißschuh	441d901fbf	platform/x86: gigabyte-wmi: add support for B650 AORUS ELITE AX This has been reported as working. Suggested-by: got3nks <got3nks@users.noreply.github.com> Link: https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/15#issuecomment-1483942966 Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20230327-gigabyte-wmi-b650-elite-ax-v1-1-d4d645c21d0b@weissschuh.net Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2023-03-27 15:20:54 +02:00
Ivan Orlov	2b4c99f7d9	can: bcm: bcm_tx_setup(): fix KMSAN uninit-value in vfs_write Syzkaller reported the following issue: ===================================================== BUG: KMSAN: uninit-value in aio_rw_done fs/aio.c:1520 [inline] BUG: KMSAN: uninit-value in aio_write+0x899/0x950 fs/aio.c:1600 aio_rw_done fs/aio.c:1520 [inline] aio_write+0x899/0x950 fs/aio.c:1600 io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019 __do_sys_io_submit fs/aio.c:2078 [inline] __se_sys_io_submit+0x293/0x770 fs/aio.c:2048 __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Uninit was created at: slab_post_alloc_hook mm/slab.h:766 [inline] slab_alloc_node mm/slub.c:3452 [inline] __kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491 __do_kmalloc_node mm/slab_common.c:967 [inline] __kmalloc+0x11d/0x3b0 mm/slab_common.c:981 kmalloc_array include/linux/slab.h:636 [inline] bcm_tx_setup+0x80e/0x29d0 net/can/bcm.c:930 bcm_sendmsg+0x3a2/0xce0 net/can/bcm.c:1351 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] sock_write_iter+0x495/0x5e0 net/socket.c:1108 call_write_iter include/linux/fs.h:2189 [inline] aio_write+0x63a/0x950 fs/aio.c:1600 io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019 __do_sys_io_submit fs/aio.c:2078 [inline] __se_sys_io_submit+0x293/0x770 fs/aio.c:2048 __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd CPU: 1 PID: 5034 Comm: syz-executor350 Not tainted 6.2.0-rc6-syzkaller-80422-geda666ff2276 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023 ===================================================== We can follow the call chain and find that 'bcm_tx_setup' function calls 'memcpy_from_msg' to copy some content to the newly allocated frame of 'op->frames'. After that the 'len' field of copied structure being compared with some constant value (64 or 8). However, if 'memcpy_from_msg' returns an error, we will compare some uninitialized memory. This triggers 'uninit-value' issue. This patch will add 'memcpy_from_msg' possible errors processing to avoid uninit-value issue. Tested via syzkaller Reported-by: syzbot+c9bfd85eca611ebf5db1@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=47f897f8ad958bbde5790ebf389b5e7e0a345089 Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> Fixes: `6f3b911d5f` ("can: bcm: add support for CAN FD frames") Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230314120445.12407-1-ivan.orlov0322@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2023-03-27 14:40:45 +02:00
Rajvi Jingar	fb5755100a	platform/x86/intel/pmc: Alder Lake PCH slp_s0_residency fix For platforms with Alder Lake PCH (Alder Lake S and Raptor Lake S) the slp_s0_residency attribute has been reporting the wrong value. Unlike other platforms, ADL PCH does not have a counter for the time that the SLP_S0 signal was asserted. Instead, firmware uses the aggregate of the Low Power Mode (LPM) substate counters as the S0ix value. Since the LPM counters run at a different frequency, this lead to misreporting of the S0ix time. Add a check for Alder Lake PCH and adjust the frequency accordingly when display slp_s0_residency. Fixes: `bbab31101f` ("platform/x86/intel: pmc/core: Add Alderlake support to pmc core driver") Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Rajneesh Bhardwaj <irenic.rajneesh@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20230320212029.3154407-1-david.e.box@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2023-03-27 13:34:34 +02:00
Chris Wilson	5c95b2d5d4	drm/i915/perf: Drop wakeref on GuC RC error If we fail to adjust the GuC run-control on opening the perf stream, make sure we unwind the wakeref just taken. v2: Retain old goto label names (Ashutosh) v3: Drop bitfield boolean Fixes: `01e7427467` ("drm/i915/guc: Support OA when Wa_16011777198 is enabled") Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-2-umesh.nerlige.ramappa@intel.com (cherry picked from commit `2810ac6c75`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2023-03-27 13:22:24 +03:00
Ville Syrjälä	3413881e1e	drm/i915/dpt: Treat the DPT BO as a framebuffer Currently i915_gem_object_is_framebuffer() doesn't treat the BO containing the framebuffer's DPT as a framebuffer itself. This means eg. that the shrinker can evict the DPT BO while leaving the actual FB BO bound, when the DPT is allocated from regular shmem. That causes an immediate oops during hibernate as we try to rewrite the PTEs inside the already evicted DPT obj. TODO: presumably this might also be the reason for the DPT related display faults under heavy memory pressure, but I'm still not sure how that would happen as the object should be pinned by intel_dpt_pin() while in active use by the display engine... Cc: stable@vger.kernel.org Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Imre Deak <imre.deak@intel.com> Fixes: `0dc987b699` ("drm/i915/display: Add smem fallback allocation for dpt") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320090522.9909-2-ville.syrjala@linux.intel.com Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> (cherry picked from commit `779cb5ba64`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2023-03-27 13:22:19 +03:00
Chris Wilson	d032ca43f2	drm/i915/gem: Flush lmem contents after construction i915_gem_object_create_lmem_from_data() lacks the flush of the data written to lmem to ensure the object is marked as dirty and the writes flushed to the backing store. Once created, we can immediately release the obj->mm.mapping caching of the vmap. Fixes: `7acbbc7cf4` ("drm/i915/guc: put all guc objects in lmem when available") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.16+ Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230316165918.13074-1-nirmoy.das@intel.com (cherry picked from commit `e2ee10474c`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2023-03-27 13:22:14 +03:00
Imre Deak	38c5830194	drm/i915/tc: Fix the ICL PHY ownership check in TC-cold state The commit renaming icl_tc_phy_is_in_safe_mode() to icl_tc_phy_take_ownership() didn't flip the function's return value accordingly, fix this up. This didn't cause an actual problem besides state check errors, since the function is only used during HW readout. Cc: José Roberto de Souza <jose.souza@intel.com> Fixes: `f53979d68a` ("drm/i915/display/tc: Rename safe_mode functions ownership") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230316131724.359612-4-imre.deak@intel.com (cherry picked from commit `f2c7959dda`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2023-03-27 13:22:09 +03:00
Ville Syrjälä	a2b6e99d8a	drm/i915: Disable DC states for all commits Keeping DC states enabled is incompatible with the _noarm()/_arm() split we use for writing pipe/plane registers. When DC5 and PSR are enabled, all pipe/plane registers effectively become self-arming on account of DC5 exit arming the update, and PSR exit latching it. What probably saves us most of the time is that (with PIPE_MISC[21]=0) all pipe register writes themselves trigger PSR exit, and then we don't re-enter PSR until the idle frame count has elapsed. So it may be that the PSR exit happens already before we've updated the state too much. Also the PSR1 panel (at least on this KBL) seems to discard the first frame we trasmit, presumably still scanning out from its internal framebuffer at that point. So only the second frame we transmit is actually visible. But I suppose that could also be panel specific behaviour. I haven't checked out how other PSR panels behave, nor did I bother to check what the eDP spec has to say about this. And since this really is all about DC states, let's switch from the MODESET domain to the DC_OFF domain. Functionally they are 100% identical. We should probably remove the MODESET domain... And for good measure let's toss in an assert to the place where we do the _noarm() register writes to make sure DC states are in fact off. v2: Just use intel_display_power_is_enabled() (Imre) Cc: <stable@vger.kernel.org> #v5.17+ Cc: Manasi Navare <navaremanasi@google.com> Cc: Drew Davenport <ddavenport@chromium.org> Cc: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Fixes: `d13dde4495` ("drm/i915: Split pipe+output CSC programming to noarm+arm pair") Fixes: `f8a005eb89` ("drm/i915: Optimize icl+ universal plane programming") Fixes: `890b6ec4a5` ("drm/i915: Split skl+ plane update into noarm+arm pair") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320183532.17727-1-ville.syrjala@linux.intel.com (cherry picked from commit `41b4c7fe72`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2023-03-27 13:22:04 +03:00
Ville Syrjälä	4d4e766f8b	drm/i915: Workaround ICL CSC_MODE sticky arming Unlike SKL/GLK the ICL CSC unit suffers from a new issue where CSC_MODE arming is sticky. That is, once armed it remains armed causing the CSC coeff/offset registers to become effectively self-arming. CSC coeff/offset registers writes no longer disarm the CSC, but fortunately register read still do. So we can use that to disarm the CSC unit once the registers for the current frame have been latched. This avoid s the self-arming behaviour from persisting into the next frame's .color_commit_noarm() call. Cc: <stable@vger.kernel.org> #v5.19+ Cc: Manasi Navare <navaremanasi@google.com> Cc: Drew Davenport <ddavenport@chromium.org> Cc: Imre Deak <imre.deak@intel.com> Cc: Jouni Högander <jouni.hogander@intel.com> Fixes: `d13dde4495` ("drm/i915: Split pipe+output CSC programming to noarm+arm pair") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320095438.17328-5-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com> (cherry picked from commit `92736f1b45`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2023-03-27 13:21:58 +03:00

1 2 3 4 5 ...

1170595 Commits