linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 03:15:31 +09:00

Author	SHA1	Message	Date
Edward Adam Davis	7bd4e5367d	media: mc: Clear minor number before put device [ Upstream commit 8cfc8cec1b4da88a47c243a11f384baefd092a50 ] The device minor should not be cleared after the device is released. Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time") Cc: stable@vger.kernel.org Reported-by: syzbot+031d0cfd7c362817963f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f Tested-by: syzbot+031d0cfd7c362817963f@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> [ moved clear_bit from media_devnode_release callback to media_devnode_unregister before put_device ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:57 +02:00
Phillip Lougher	8c7aad7675	Squashfs: reject negative file sizes in squashfs_read_inode() [ Upstream commit 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b ] Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs. This warning is ultimately caused because the underlying Squashfs file system returns a file with a negative file size. This commit checks for a negative file size and returns EINVAL. [phillip@squashfs.org.uk: only need to check 64 bit quantity] Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk Fixes: `6545b246a2` ("Squashfs: inode operations") Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: syzbot+f754e01116421e9754b9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/ Cc: Amir Goldstein <amir73il@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:57 +02:00
Phillip Lougher	f5a1b04e5d	Squashfs: add additional inode sanity checking [ Upstream commit 9ee94bfbe930a1b39df53fa2d7b31141b780eb5a ] Patch series "Squashfs: performance improvement and a sanity check". This patchset adds an additional sanity check when reading regular file inodes, and adds support for SEEK_DATA/SEEK_HOLE lseek() whence values. This patch (of 2): Add an additional sanity check when reading regular file inodes. A regular file if the file size is an exact multiple of the filesystem block size cannot have a fragment. This is because by definition a fragment block stores tailends which are not a whole block in size. Link: https://lkml.kernel.org/r/20250923220652.568416-1-phillip@squashfs.org.uk Link: https://lkml.kernel.org/r/20250923220652.568416-2-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 9f1c14c1de1b ("Squashfs: reject negative file sizes in squashfs_read_inode()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:57 +02:00
Nathan Chancellor	edb6425f59	lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older commit 2f13daee2a72bb962f5fd356c3a263a6f16da965 upstream. After commit 6f110a5e4f99 ("Disable SLUB_TINY for build testing"), which causes CONFIG_KASAN to be enabled in allmodconfig again, arm64 allmodconfig builds with clang-17 and older show an instance of -Wframe-larger-than (which breaks the build with CONFIG_WERROR=y): lib/crypto/curve25519-hacl64.c:757:6: error: stack frame size (2336) exceeds limit (2048) in 'curve25519_generic' [-Werror,-Wframe-larger-than] 757 \| void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE], \| ^ When KASAN is disabled, the stack usage is roughly quartered: lib/crypto/curve25519-hacl64.c:757:6: error: stack frame size (608) exceeds limit (128) in 'curve25519_generic' [-Werror,-Wframe-larger-than] 757 \| void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE], \| ^ Using '-Rpass-analysis=stack-frame-layout' shows the following variables and many, many 8-byte spills when KASAN is enabled: Offset: [SP-144], Type: Variable, Align: 8, Size: 40 Offset: [SP-464], Type: Variable, Align: 8, Size: 320 Offset: [SP-784], Type: Variable, Align: 8, Size: 320 Offset: [SP-864], Type: Variable, Align: 32, Size: 80 Offset: [SP-896], Type: Variable, Align: 32, Size: 32 Offset: [SP-1016], Type: Variable, Align: 8, Size: 120 When KASAN is disabled, there are still spills but not at many and the variables list is smaller: Offset: [SP-192], Type: Variable, Align: 32, Size: 80 Offset: [SP-224], Type: Variable, Align: 32, Size: 32 Offset: [SP-344], Type: Variable, Align: 8, Size: 120 Disable KASAN for this file when using clang-17 or older to avoid blowing out the stack, clearing up the warning. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250609-curve25519-hacl64-disable-kasan-clang-v1-1-08ea0ac5ccff@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:57 +02:00
Jan Kara	abdfc4704e	ext4: free orphan info with kvfree commit 971843c511c3c2f6eda96c6b03442913bfee6148 upstream. Orphan info is now getting allocated with kvmalloc_array(). Free it with kvfree() instead of kfree() to avoid complaints from mm. Reported-by: Chris Mason <clm@meta.com> Fixes: 0a6ce20c1564 ("ext4: verify orphan file size is not too big") Cc: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Message-ID: <20251007134936.7291-2-jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:57 +02:00
Huacai Chen	f775f821de	ACPICA: Allow to skip Global Lock initialization commit feb8ae81b2378b75a99c81d315602ac8918ed382 upstream. Introduce acpi_gbl_use_global_lock, which allows to skip the Global Lock initialization. This is useful for systems without Global Lock (such as loong_arch), so as to avoid error messages during boot phase: ACPI Error: Could not enable global_lock event (20240827/evxfevnt-182) ACPI Error: No response from Global Lock hardware, disabling lock (20240827/evglock-59) Link: https://github.com/acpica/acpica/commit/463cb0fe Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Deepanshu Kartikey	720a66fdaa	ext4: validate ea_ino and size in check_xattrs commit 44d2a72f4d64655f906ba47a5e108733f59e6f28 upstream. During xattr block validation, check_xattrs() processes xattr entries without validating that entries claiming to use EA inodes have non-zero sizes. Corrupted filesystems may contain xattr entries where e_value_size is zero but e_value_inum is non-zero, indicating invalid xattr data. Add validation in check_xattrs() to detect this corruption pattern early and return -EFSCORRUPTED, preventing invalid xattr entries from causing issues throughout the ext4 codebase. Cc: stable@kernel.org Suggested-by: Theodore Ts'o <tytso@mit.edu> Reported-by: syzbot+4c9d23743a2409b80293@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=4c9d23743a2409b80293 Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Message-ID: <20250923133245.1091761-1-kartikey406@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Ahmet Eray Karadag	79ea7f3e11	ext4: guard against EA inode refcount underflow in xattr update commit 57295e835408d8d425bef58da5253465db3d6888 upstream. syzkaller found a path where ext4_xattr_inode_update_ref() reads an EA inode refcount that is already <= 0 and then applies ref_change (often -1). That lets the refcount underflow and we proceed with a bogus value, triggering errors like: EXT4-fs error: EA inode <n> ref underflow: ref_count=-1 ref_change=-1 EXT4-fs warning: ea_inode dec ref err=-117 Make the invariant explicit: if the current refcount is non-positive, treat this as on-disk corruption, emit ext4_error_inode(), and fail the operation with -EFSCORRUPTED instead of updating the refcount. Delete the WARN_ONCE() as negative refcounts are now impossible; keep error reporting in ext4_error_inode(). This prevents the underflow and the follow-on orphan/cleanup churn. Reported-by: syzbot+0be4f339a8218d2a5bb1@syzkaller.appspotmail.com Fixes: https://syzbot.org/bug?extid=0be4f339a8218d2a5bb1 Cc: stable@kernel.org Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com> Message-ID: <20250920021342.45575-1-eraykrdg1@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Zhang Yi	9e642ab8e5	ext4: fix an off-by-one issue during moving extents commit 12e803c8827d049ae8f2c743ef66ab87ae898375 upstream. During the movement of a written extent, mext_page_mkuptodate() is called to read data in the range [from, to) into the page cache and to update the corresponding buffers. Therefore, we should not wait on any buffer whose start offset is >= 'to'. Otherwise, it will return -EIO and fail the extents movement. $ for i in `seq 3 -1 0`; \ do xfs_io -fs -c "pwrite -b 1024 $((i * 1024)) 1024" /mnt/foo; \ done $ umount /mnt && mount /dev/pmem1s /mnt # drop cache $ e4defrag /mnt/foo e4defrag 1.47.0 (5-Feb-2023) ext4 defragmentation for /mnt/foo [1/1]/mnt/foo: 0% [ NG ] Success: [0/1] Cc: stable@kernel.org Fixes: a40759fb16ae ("ext4: remove array of buffer_heads from mext_page_mkuptodate()") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Message-ID: <20250912105841.1886799-1-yi.zhang@huaweicloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Ojaswin Mujoo	d1e681c0bb	ext4: correctly handle queries for metadata mappings commit 46c22a8bb4cb03211da1100d7ee4a2005bf77c70 upstream. Currently, our handling of metadata is _ambiguous_ in some scenarios, that is, we end up returning unknown if the range only covers the mapping partially. For example, in the following case: $ xfs_io -c fsmap -d 0: 254:16 [0..7]: static fs metadata 8 1: 254:16 [8..15]: special 102:1 8 2: 254:16 [16..5127]: special 102:2 5112 3: 254:16 [5128..5255]: special 102:3 128 4: 254:16 [5256..5383]: special 102:4 128 5: 254:16 [5384..70919]: inodes 65536 6: 254:16 [70920..70967]: unknown 48 ... $ xfs_io -c fsmap -d 24 33 0: 254:16 [24..39]: unknown 16 <--- incomplete reporting $ xfs_io -c fsmap -d 24 33 (With patch) 0: 254:16 [16..5127]: special 102:2 5112 This is because earlier in ext4_getfsmap_meta_helper, we end up ignoring any extent that starts before our queried range, but overlaps it. While the man page [1] is a bit ambiguous on this, this fix makes the output make more sense since we are anyways returning an "unknown" extent. This is also consistent to how XFS does it: $ xfs_io -c fsmap -d ... 6: 254:16 [104..127]: free space 24 7: 254:16 [128..191]: inodes 64 ... $ xfs_io -c fsmap -d 137 150 0: 254:16 [128..191]: inodes 64 <-- full extent returned [1] https://man7.org/linux/man-pages/man2/ioctl_getfsmap.2.html Reported-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: stable@kernel.org Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Message-ID: <023f37e35ee280cd9baac0296cbadcbe10995cab.1757058211.git.ojaswin@linux.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Yongjian Sun	871b6894a3	ext4: increase i_disksize to offset + len in ext4_update_disksize_before_punch() commit 9d80eaa1a1d37539224982b76c9ceeee736510b9 upstream. After running a stress test combined with fault injection, we performed fsck -a followed by fsck -fn on the filesystem image. During the second pass, fsck -fn reported: Inode 131512, end of extent exceeds allowed value (logical block 405, physical block 1180540, len 2) This inode was not in the orphan list. Analysis revealed the following call chain that leads to the inconsistency: ext4_da_write_end() //does not update i_disksize ext4_punch_hole() //truncate folio, keep size ext4_page_mkwrite() ext4_block_page_mkwrite() ext4_block_write_begin() ext4_get_block() //insert written extent without update i_disksize journal commit echo 1 > /sys/block/xxx/device/delete da-write path updates i_size but does not update i_disksize. Then ext4_punch_hole truncates the da-folio yet still leaves i_disksize unchanged(in the ext4_update_disksize_before_punch function, the condition offset + len < size is met). Then ext4_page_mkwrite sees ext4_nonda_switch return 1 and takes the nodioread_nolock path, the folio about to be written has just been punched out, and it’s offset sits beyond the current i_disksize. This may result in a written extent being inserted, but again does not update i_disksize. If the journal gets committed and then the block device is yanked, we might run into this. It should be noted that replacing ext4_punch_hole with ext4_zero_range in the call sequence may also trigger this issue, as neither will update i_disksize under these circumstances. To fix this, we can modify ext4_update_disksize_before_punch to increase i_disksize to min(i_size, offset + len) when both i_size and (offset + len) are greater than i_disksize. Cc: stable@kernel.org Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Baokun Li <libaokun1@huawei.com> Message-ID: <20250911133024.1841027-1-sunyongjian@huaweicloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Jan Kara	304fc34ff6	ext4: verify orphan file size is not too big commit 0a6ce20c156442a4ce2a404747bb0fb05d54eeb3 upstream. In principle orphan file can be arbitrarily large. However orphan replay needs to traverse it all and we also pin all its buffers in memory. Thus filesystems with absurdly large orphan files can lead to big amounts of memory consumed. Limit orphan file size to a sane value and also use kvmalloc() for allocating array of block descriptor structures to avoid large order allocations for sane but large orphan files. Reported-by: syzbot+0b92850d68d9b12934f5@syzkaller.appspotmail.com Fixes: `02f310fcf4` ("ext4: Speedup ext4 orphan inode handling") Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Message-ID: <20250909112206.10459-2-jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Olga Kornievskaia	e7e0e3eae0	nfsd: nfserr_jukebox in nlm_fopen should lead to a retry commit a082e4b4d08a4a0e656d90c2c05da85f23e6d0c9 upstream. When v3 NLM request finds a conflicting delegation, it triggers a delegation recall and nfsd_open fails with EAGAIN. nfsd_open then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead of returning nlm_failed for when there is a conflicting delegation, drop this NLM request so that the client retries. Once delegation is recalled and if a local lock is claimed, a retry would lead to nfsd returning a nlm_lck_blocked error or a successful nlm lock. Fixes: `d343fce148` ("[PATCH] knfsd: Allow lockd to drop replies as appropriate") Cc: stable@vger.kernel.org # v6.6 Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
Thorsten Blum	925ed83efb	NFSD: Fix destination buffer size in nfsd4_ssc_setup_dul() commit ab1c282c010c4f327bd7addc3c0035fd8e3c1721 upstream. Commit `5304877936` ("NFSD: Fix strncpy() fortify warning") replaced strncpy(,, sizeof(..)) with strlcpy(,, sizeof(..) - 1), but strlcpy() already guaranteed NUL-termination of the destination buffer and subtracting one byte potentially truncated the source string. The incorrect size was then carried over in commit `72f78ae00a` ("NFSD: move from strlcpy with unused retval to strscpy") when switching from strlcpy() to strscpy(). Fix this off-by-one error by using the full size of the destination buffer again. Cc: stable@vger.kernel.org Fixes: `5304877936` ("NFSD: Fix strncpy() fortify warning") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:56 +02:00
SeongJae Park	677ebfe5d0	mm/damon/vaddr: do not repeat pte_offset_map_lock() until success commit b93af2cc8e036754c0d9970d9ddc47f43cc94b9f upstream. DAMON's virtual address space operation set implementation (vaddr) calls pte_offset_map_lock() inside the page table walk callback function. This is for reading and writing page table accessed bits. If pte_offset_map_lock() fails, it retries by returning the page table walk callback function with ACTION_AGAIN. pte_offset_map_lock() can continuously fail if the target is a pmd migration entry, though. Hence it could cause an infinite page table walk if the migration cannot be done until the page table walk is finished. This indeed caused a soft lockup when CPU hotplugging and DAMON were running in parallel. Avoid the infinite loop by simply not retrying the page table walk. DAMON is promising only a best-effort accuracy, so missing access to such pages is no problem. Link: https://lkml.kernel.org/r/20250930004410.55228-1-sj@kernel.org Fixes: `7780d04046` ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Xinyu Zheng <zhengxinyu6@huawei.com> Closes: https://lore.kernel.org/20250918030029.2652607-1-zhengxinyu6@huawei.com Acked-by: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> [6.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Li RongQing	bb5ef60ee8	mm/hugetlb: early exit from hugetlb_pages_alloc_boot() when max_huge_pages=0 commit b322e88b3d553e85b4e15779491c70022783faa4 upstream. Optimize hugetlb_pages_alloc_boot() to return immediately when max_huge_pages is 0, avoiding unnecessary CPU cycles and the below log message when hugepages aren't configured in the kernel command line. [ 3.702280] HugeTLB: allocation took 0ms with hugepage_allocation_threads=32 Link: https://lkml.kernel.org/r/20250814102333.4428-1-lirongqing@baidu.com Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Tested-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Jane Chu <jane.chu@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Thadeu Lima de Souza Cascardo	81a6d6011a	mm/page_alloc: only set ALLOC_HIGHATOMIC for __GPF_HIGH allocations commit 6a204d4b14c99232e05d35305c27ebce1c009840 upstream. Commit `524c48072e` ("mm/page_alloc: rename ALLOC_HIGH to ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH, which implies ALLOC_MIN_RESERVE, is going to be used instead of __GFP_ATOMIC for high atomic reserves. Commit `eb2e2b425c` ("mm/page_alloc: explicitly record high-order atomic allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such allocations of order higher than 0. It still used __GFP_ATOMIC, though. Then, commit `1ebbb21811` ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves") just turned that check for !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to test for __GFP_HIGH. This leads to high atomic reserves being added for high-order GFP_NOWAIT allocations and others that clear __GFP_DIRECT_RECLAIM, which is unexpected. Later, those reserves lead to 0-order allocations going to the slow path and starting reclaim. From /proc/pagetypeinfo, without the patch: Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0 Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0 With the patch: Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Link: https://lkml.kernel.org/r/20250814172245.1259625-1-cascardo@igalia.com Fixes: `1ebbb21811` ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Tested-by: Helen Koike <koike@igalia.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: NeilBrown <neilb@suse.de> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Nick Morrow	69336589df	wifi: mt76: mt7921u: Add VID/PID for Netgear A7500 commit fc6627ca8a5f811b601aea74e934cf8a048c88ac upstream. Add VID/PID 0846/9065 for Netgear A7500. Reported-by: Autumn Dececco <autumndececco@gmail.com> Tested-by: Autumn Dececco <autumndececco@gmail.com> Signed-off-by: Nick Morrow <morrownr@gmail.com> Cc: stable@vger.kernel.org Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/80bacfd6-6073-4ce5-be32-ae9580832337@gmail.com Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Muhammad Usama Anjum	d0ca2f9fbb	wifi: ath11k: HAL SRNG: don't deinitialize and re-initialize again commit 32be3ca4cf78b309dfe7ba52fe2d7cc3c23c5634 upstream. Don't deinitialize and reinitialize the HAL helpers. The dma memory is deallocated and there is high possibility that we'll not be able to get the same memory allocated from dma when there is high memory pressure. Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6 Fixes: `d5c65159f2` ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Cc: stable@vger.kernel.org Cc: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://patch.msgid.link/20250722053121.1145001-1-usama.anjum@collabora.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Matthieu Baerts (NGI0)	3454c79780	selftests: mptcp: join: validate C-flag + def limit commit 008385efd05e04d8dff299382df2e8be0f91d8a0 upstream. The previous commit adds an exception for the C-flag case. The 'mptcp_join.sh' selftest is extended to validate this case. In this subtest, there is a typical CDN deployment with a client where MPTCP endpoints have been 'automatically' configured: - the server set net.mptcp.allow_join_initial_addr_port=0 - the client has multiple 'subflow' endpoints, and the default limits: not accepting ADD_ADDRs. Without the parent patch, the client is not able to establish new subflows using its 'subflow' endpoints. The parent commit fixes that. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: `df377be387` ("mptcp: add deny_join_id0 in mptcp_options_received") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-2-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Sean Christopherson	1264edbed4	x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases) commit 27b1fd62012dfe9d3eb8ecde344d7aa673695ecf upstream. Filter out the register forms of 0F 01 when determining whether or not to emulate in response to a potential UMIP violation #GP, as SGDT and SIDT only accept memory operands. The register variants of 0F 01 are used to encode instructions for things like VMX and SGX, i.e. not checking the Mod field would cause the kernel to incorrectly emulate on #GP, e.g. due to a CPL violation on VMLAUNCH. Fixes: `1e5db22369` ("x86/umip: Add emulation code for UMIP instructions") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Sean Christopherson	c5bceeb4c5	x86/umip: Check that the instruction opcode is at least two bytes commit 32278c677947ae2f042c9535674a7fff9a245dd3 upstream. When checking for a potential UMIP violation on #GP, verify the decoder found at least two opcode bytes to avoid false positives when the kernel encounters an unknown instruction that starts with 0f. Because the array of opcode.bytes is zero-initialized by insn_init(), peeking at bytes[1] will misinterpret garbage as a potential SLDT or STR instruction, and can incorrectly trigger emulation. E.g. if a VPALIGNR instruction 62 83 c5 05 0f 08 ff vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff hits a #GP, the kernel emulates it as STR and squashes the #GP (and corrupts the userspace code stream). Arguably the check should look for exactly two bytes, but no three byte opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be impossible to get a false positive if the first two opcode bytes match '0f 00' or '0f 01'. Go with a more conservative check with respect to the existing code to minimize the chances of breaking userspace, e.g. due to decoder weirdness. Analyzed by Nick Bray <ncbray@google.com>. Fixes: `1e5db22369` ("x86/umip: Add emulation code for UMIP instructions") Reported-by: Dan Snyder <dansnyder@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Pratyush Yadav	d7760884ee	spi: cadence-quadspi: Flush posted register writes before DAC access commit 1ad55767e77a853c98752ed1e33b68049a243bd7 upstream. cqspi_read_setup() and cqspi_write_setup() program the address width as the last step in the setup. This is likely to be immediately followed by a DAC region read/write. On TI K3 SoCs the DAC region is on a different endpoint from the register region. This means that the order of the two operations is not guaranteed, and they might be reordered at the interconnect level. It is possible that the DAC read/write goes through before the address width update goes through. In this situation if the previous command used a different address width the OSPI command is sent with the wrong number of address bytes, resulting in an invalid command and undefined behavior. Read back the size register to make sure the write gets flushed before accessing the DAC region. Fixes: `1406234105` ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller") CC: stable@vger.kernel.org Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Santhosh Kumar K <s-k6@ti.com> Message-ID: <20250905185958.3575037-3-s-k6@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:55 +02:00
Pratyush Yadav	8bf417e1d3	spi: cadence-quadspi: Flush posted register writes before INDAC access commit 29e0b471ccbd674d20d4bbddea1a51e7105212c5 upstream. cqspi_indirect_read_execute() and cqspi_indirect_write_execute() first set the enable bit on APB region and then start reading/writing to the AHB region. On TI K3 SoCs these regions lie on different endpoints. This means that the order of the two operations is not guaranteed, and they might be reordered at the interconnect level. It is possible for the AHB write to be executed before the APB write to enable the indirect controller, causing the transaction to be invalid and the write erroring out. Read back the APB region write before accessing the AHB region to make sure the write got flushed and the race condition is eliminated. Fixes: `1406234105` ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller") CC: stable@vger.kernel.org Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Santhosh Kumar K <s-k6@ti.com> Message-ID: <20250905185958.3575037-2-s-k6@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Vidya Sagar	eef57e03d5	PCI: tegra194: Handle errors in BPMP response commit f8c9ad46b00453a8c075453f3745f8d263f44834 upstream. The return value from tegra_bpmp_transfer() indicates the success or failure of the IPC transaction with BPMP. If the transaction succeeded, we also need to check the actual command's result code. If we don't have error handling for tegra_bpmp_transfer(), we will set the pcie->ep_state to EP_STATE_ENABLED even when the tegra_bpmp_transfer() command fails. Thus, the pcie->ep_state will get out of sync with reality, and any further PERST# assert + deassert will be a no-op and will not trigger the hardware initialization sequence. This is because pex_ep_event_pex_rst_deassert() checks the current pcie->ep_state, and does nothing if the current state is already EP_STATE_ENABLED. Thus, it is important to have error handling for tegra_bpmp_transfer(), such that the pcie->ep_state can not get out of sync with reality, so that we will try to initialize the hardware not only during the first PERST# assert + deassert, but also during any succeeding PERST# assert + deassert. One example where this fix is needed is when using a rock5b as host. During the initial PERST# assert + deassert (triggered by the bootloader on the rock5b) pex_ep_event_pex_rst_deassert() will get called, but for some unknown reason, the tegra_bpmp_transfer() call to initialize the PHY fails. Once Linux has been loaded on the rock5b, the PCIe driver will once again assert + deassert PERST#. However, without tegra_bpmp_transfer() error handling, this second PERST# assert + deassert will not trigger the hardware initialization sequence. With tegra_bpmp_transfer() error handling, the second PERST# assert + deassert will once again trigger the hardware to be initialized and this time the tegra_bpmp_transfer() succeeds. Fixes: `c57247f940` ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [cassel: improve commit log] Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-8-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Niklas Cassel	315001252a	PCI: tegra194: Fix broken tegra_pcie_ep_raise_msi_irq() commit b640d42a6ac9ba01abe65ec34f7c73aaf6758ab8 upstream. The pci_epc_raise_irq() supplies a MSI or MSI-X interrupt number in range (1-N), as per the pci_epc_raise_irq() kdoc, where N is 32 for MSI. But tegra_pcie_ep_raise_msi_irq() incorrectly uses the interrupt number as the MSI vector. This causes wrong MSI vector to be triggered, leading to the failure of PCI endpoint Kselftest MSI_TEST test case. To fix this issue, convert the interrupt number to MSI vector. Fixes: `c57247f940` ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-6-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Marek Vasut	a1a7a80dbe	PCI: rcar-host: Convert struct rcar_msi mask_lock into raw spinlock commit 5ed35b4d490d8735021cce9b715b62a418310864 upstream. The rcar_msi_irq_unmask() function may be called from a PCI driver request_threaded_irq() function. This triggers kernel/irq/manage.c __setup_irq() which locks raw spinlock &desc->lock descriptor lock and with that descriptor lock held, calls rcar_msi_irq_unmask(). Since the &desc->lock descriptor lock is a raw spinlock, and the rcar_msi .mask_lock is not a raw spinlock, this setup triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y. Use scoped_guard() to simplify the locking. Fixes: `83ed8d4fa6` ("PCI: rcar: Convert to MSI domains") Reported-by: Duy Nguyen <duy.nguyen.rh@renesas.com> Reported-by: Thuan Nguyen <thuan.nguyen-hong@banvien.com.vn> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250909162707.13927-2-marek.vasut+renesas@mailbox.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Marek Vasut	f5770bba83	PCI: rcar-host: Drop PMSR spinlock commit 0a8f173d9dad13930d5888505dc4c4fd6a1d4262 upstream. The pmsr_lock spinlock used to be necessary to synchronize access to the PMSR register, because that access could have been triggered from either config space access in rcar_pcie_config_access() or an exception handler rcar_pcie_aarch32_abort_handler(). The rcar_pcie_aarch32_abort_handler() case is no longer applicable since commit `6e36203bc1` ("PCI: rcar: Use PCI_SET_ERROR_RESPONSE after read which triggered an exception"), which performs more accurate, controlled invocation of the exception, and a fixup. This leaves rcar_pcie_config_access() as the only call site from which rcar_pcie_wakeup() is called. The rcar_pcie_config_access() can only be called from the controller struct pci_ops .read and .write callbacks, and those are serialized in drivers/pci/access.c using raw spinlock 'pci_lock' . It should be noted that CONFIG_PCI_LOCKLESS_CONFIG is never set on this platform. Since the 'pci_lock' is a raw spinlock , and the 'pmsr_lock' is not a raw spinlock, this constellation triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y . Remove the pmsr_lock to fix the locking. Fixes: `a115b1bd3a` ("PCI: rcar: Add L1 link state fix into data abort hook") Reported-by: Duy Nguyen <duy.nguyen.rh@renesas.com> Reported-by: Thuan Nguyen <thuan.nguyen-hong@banvien.com.vn> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250909162707.13927-1-marek.vasut+renesas@mailbox.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Siddharth Vadapalli	608ab627d9	PCI: keystone: Use devm_request_irq() to free "ks-pcie-error-irq" on exit commit e51d05f523e43ce5d2bad957943a2b14f68078cd upstream. Commit under Fixes introduced the IRQ handler for "ks-pcie-error-irq". The interrupt is acquired using "request_irq()" but is never freed if the driver exits due to an error. Although the section in the driver that invokes "request_irq()" has moved around over time, the issue hasn't been addressed until now. Fix this by using "devm_request_irq()" which automatically frees the interrupt if the driver exits. Fixes: `025dd3daed` ("PCI: keystone: Add error IRQ handler") Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/r/3d3a4b52-e343-42f3-9d69-94c259812143@kernel.org Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250912100802.3136121-2-s-vadapalli@ti.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Lukas Wunner	61aeab7178	PCI/AER: Support errors introduced by PCIe r6.0 commit 6633875250b38b18b8638cf01e695de031c71f02 upstream. PCIe r6.0 defined five additional errors in the Uncorrectable Error Status, Mask and Severity Registers (PCIe r7.0 sec 7.8.4.2ff). lspci has been supporting them since commit 144b0911cc0b ("ls-ecaps: extend decode support for more fields for AER CE and UE status"): https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b Amend the AER driver to recognize them as well, instead of logging them as "Unknown Error Bit". Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/21f1875b18d4078c99353378f37dcd6b994f6d4e.1756301211.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Niklas Schnelle	741b783950	PCI/AER: Fix missing uevent on recovery when a reset is requested commit bbf7d0468d0da71d76cc6ec9bc8a224325d07b6b upstream. Since commit `7b42d97e99` ("PCI/ERR: Always report current recovery status for udev") AER uses the result of error_detected() as parameter to pci_uevent_ers(). As pci_uevent_ers() however does not handle PCI_ERS_RESULT_NEED_RESET this results in a missing uevent for the beginning of recovery if drivers request a reset. Fix this by treating PCI_ERS_RESULT_NEED_RESET as beginning recovery. Fixes: `7b42d97e99` ("PCI/ERR: Always report current recovery status for udev") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250807-add_err_uevents-v5-1-adf85b0620b0@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:54 +02:00
Lukas Wunner	a3a52f85a2	PCI/ERR: Fix uevent on failure to recover commit 1cbc5e25fb70e942a7a735a1f3d6dd391afc9b29 upstream. Upon failure to recover from a PCIe error through AER, DPC or EDR, a uevent is sent to inform user space about disconnection of the bridge whose subordinate devices failed to recover. However the bridge itself is not disconnected. Instead, a uevent should be sent for each of the subordinate devices. Only if the "bridge" happens to be a Root Complex Event Collector or Integrated Endpoint does it make sense to send a uevent for it (because there are no subordinate devices). Right now if there is a mix of subordinate devices with and without pci_error_handlers, a BEGIN_RECOVERY event is sent for those with pci_error_handlers but no FAILED_RECOVERY event is ever sent for them afterwards. Fix it. Fixes: `856e1eb9bd` ("PCI/AER: Add uevents in AER and EEH error/resume") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.16+ Link: https://patch.msgid.link/68fc527a380821b5d861dd554d2ce42cb739591c.1755008151.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Niklas Schnelle	36039348bc	PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV commit 05703271c3cdcc0f2a8cf6ebdc45892b8ca83520 upstream. Before disabling SR-IOV via config space accesses to the parent PF, sriov_disable() first removes the PCI devices representing the VFs. Since commit `9d16947b75` ("PCI: Add global pci_lock_rescan_remove()") such removal operations are serialized against concurrent remove and rescan using the pci_rescan_remove_lock. No such locking was ever added in sriov_disable() however. In particular when commit `18f9e9d150` ("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device removal into sriov_del_vfs() there was still no locking around the pci_iov_remove_virtfn() calls. On s390 the lack of serialization in sriov_disable() may cause double remove and list corruption with the below (amended) trace being observed: PSW: 0704c00180000000 0000000c914e4b38 (klist_put+56) GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001 00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480 0000000000000001 0000000000000000 0000000000000000 0000000180692828 00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8 #0 [3800313fb20] device_del at c9158ad5c #1 [3800313fb88] pci_remove_bus_device at c915105ba #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198 #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0 #4 [3800313fc60] zpci_bus_remove_device at c90fb6104 #5 [3800313fca0] __zpci_event_availability at c90fb3dca #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2 #7 [3800313fd60] crw_collect_info at c91905822 #8 [3800313fe10] kthread at c90feb390 #9 [3800313fe68] __ret_from_fork at c90f6aa64 #10 [3800313fe98] ret_from_fork at c9194f3f2. This is because in addition to sriov_disable() removing the VFs, the platform also generates hot-unplug events for the VFs. This being the reverse operation to the hotplug events generated by sriov_enable() and handled via pdev->no_vf_scan. And while the event processing takes pci_rescan_remove_lock and checks whether the struct pci_dev still exists, the lack of synchronization makes this checking racy. Other races may also be possible of course though given that this lack of locking persisted so long observable races seem very rare. Even on s390 the list corruption was only observed with certain devices since the platform events are only triggered by config accesses after the removal, so as long as the removal finished synchronously they would not race. Either way the locking is missing so fix this by adding it to the sriov_del_vfs() helper. Just like PCI rescan-remove, locking is also missing in sriov_add_vfs() including for the error case where pci_stop_and_remove_bus_device() is called without the PCI rescan-remove lock being held. Even in the non-error case, adding new PCI devices and buses should be serialized via the PCI rescan-remove lock. Add the necessary locking. Fixes: `18f9e9d150` ("PCI/IOV: Factor out sriov_add_vfs()") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Julian Ruess <julianr@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Brian Norris	bd27ddb68a	PCI/sysfs: Ensure devices are powered for config reads commit 48991e4935078b05f80616c75d1ee2ea3ae18e58 upstream. The "max_link_width", "current_link_speed", "current_link_width", "secondary_bus_number", and "subordinate_bus_number" sysfs files all access config registers, but they don't check the runtime PM state. If the device is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus values, or worse, depending on implementation details. Wrap these access in pci_config_pm_runtime_{get,put}() like most of the rest of the similar sysfs attributes. Notably, "max_link_speed" does not access config registers; it returns a cached value since d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds"). Fixes: `56c1af4606` ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc") Signed-off-by: Brian Norris <briannorris@google.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250924095711.v2.1.Ibb5b6ca1e2c059e04ec53140cd98a44f2684c668@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Marek Vasut	7a9dee3e4c	PCI: tegra: Convert struct tegra_msi mask_lock into raw spinlock commit 26fda92d3b56bf44a02bcb4001c5a5548e0ae8ee upstream. The tegra_msi_irq_unmask() function may be called from a PCI driver request_threaded_irq() function. This triggers kernel/irq/manage.c __setup_irq() which locks raw spinlock &desc->lock descriptor lock and with that descriptor lock held, calls tegra_msi_irq_unmask(). Since the &desc->lock descriptor lock is a raw spinlock, and the tegra_msi .mask_lock is not a raw spinlock, this setup triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y. Use scoped_guard() to simplify the locking. Fixes: `2c99e55f79` ("PCI: tegra: Convert to MSI domains") Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Closes: https://patchwork.kernel.org/project/linux-pci/patch/20250909162707.13927-2-marek.vasut+renesas@mailbox.org/#26574451 Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922150811.88450-1-marek.vasut+renesas@mailbox.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Sean Christopherson	a5f1934fea	rseq/selftests: Use weak symbol reference, not definition, to link with glibc commit a001cd248ab244633c5fabe4f7c707e13fc1d1cc upstream. Add "extern" to the glibc-defined weak rseq symbols to convert the rseq selftest's usage from weak symbol definitions to weak symbol _references_. Effectively re-defining the glibc symbols wreaks havoc when building with -fno-common, e.g. generates segfaults when running multi-threaded programs, as dynamically linked applications end up with multiple versions of the symbols. Building with -fcommon, which until recently has the been the default for GCC and clang, papers over the bug by allowing the linker to resolve the weak/tentative definition to glibc's "real" definition. Note, the symbol itself (or rather its address), not the value of the symbol, is set to 0/NULL for unresolved weak symbol references, as the symbol doesn't exist and thus can't have a value. Check for a NULL rseq size pointer to handle the scenario where the test is statically linked against a libc that doesn't support rseq in any capacity. Fixes: `3bcbc20942` ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") Reported-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/all/87frdoybk4.ffs@tglx Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Esben Haabendal	9f16da9b54	rtc: interface: Fix long-standing race when setting alarm commit 795cda8338eab036013314dbc0b04aae728880ab upstream. As described in the old comment dating back to commit `6610e0893b` ("RTC: Rework RTC code to use timerqueue for events") from 2010, we have been living with a race window when setting alarm with an expiry in the near future (i.e. next second). With 1 second resolution, it can happen that the second ticks after the check for the timer having expired, but before the alarm is actually set. When this happen, no alarm IRQ is generated, at least not with some RTC chips (isl12022 is an example of this). With UIE RTC timer being implemented on top of alarm irq, being re-armed every second, UIE will occasionally fail to work, as an alarm irq lost due to this race will stop the re-arming loop. For now, I have limited the additional expiry check to only be done for alarms set to next seconds. I expect it should be good enough, although I don't know if we can now for sure that systems with loads could end up causing the same problems for alarms set 2 seconds or even longer in the future. I haven't been able to reproduce the problem with this check in place. Cc: stable@vger.kernel.org Signed-off-by: Esben Haabendal <esben@geanix.com> Link: https://lore.kernel.org/r/20250516-rtc-uie-irq-fixes-v2-1-3de8e530a39e@geanix.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Esben Haabendal	31a81d9ad8	rtc: interface: Ensure alarm irq is enabled when UIE is enabled commit 9db26d5855d0374d4652487bfb5aacf40821c469 upstream. When setting a normal alarm, user-space is responsible for using RTC_AIE_ON/RTC_AIE_OFF to control if alarm irq should be enabled. But when RTC_UIE_ON is used, interrupts must be enabled so that the requested irq events are generated. When RTC_UIE_OFF is used, alarm irq is disabled if there are no other alarms queued, so this commit brings symmetry to that. Signed-off-by: Esben Haabendal <esben@geanix.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250516-rtc-uie-irq-fixes-v2-5-3de8e530a39e@geanix.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Zhen Ni	5dd8217443	memory: samsung: exynos-srom: Fix of_iomap leak in exynos_srom_probe commit 6744085079e785dae5f7a2239456135407c58b25 upstream. The of_platform_populate() call at the end of the function has a possible failure path, causing a resource leak. Replace of_iomap() with devm_platform_ioremap_resource() to ensure automatic cleanup of srom->reg_base. This issue was detected by smatch static analysis: drivers/memory/samsung/exynos-srom.c:155 exynos_srom_probe()warn: 'srom->reg_base' from of_iomap() not released on lines: 155. Fixes: `8ac2266d88` ("memory: samsung: exynos-srom: Add support for bank configuration") Cc: stable@vger.kernel.org Signed-off-by: Zhen Ni <zhen.ni@easystack.cn> Link: https://lore.kernel.org/r/20250806025538.306593-1-zhen.ni@easystack.cn Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Rex Chen	587b819fbc	mmc: core: SPI mode remove cmd7 commit fec40f44afdabcbc4a7748e4278f30737b54bb1a upstream. SPI mode doesn't support cmd7, so remove it in mmc_sdio_alive() and confirm if sdio is active by checking CCCR register value is available or not. Signed-off-by: Rex Chen <rex.chen_1@nxp.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250728082230.1037917-2-rex.chen_1@nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:53 +02:00
Linus Walleij	6a5b401c74	mtd: rawnand: fsmc: Default to autodetect buswidth commit b8df622cf7f6808c85764e681847150ed6d85f3d upstream. If you don't specify buswidth 2 (16 bits) in the device tree, FSMC doesn't even probe anymore: fsmc-nand 10100000.flash: FSMC device partno 090, manufacturer 80, revision 00, config 00 nand: device found, Manufacturer ID: 0x20, Chip ID: 0xb1 nand: ST Micro 10100000.flash nand: bus width 8 instead of 16 bits nand: No NAND device found fsmc-nand 10100000.flash: probe with driver fsmc-nand failed with error -22 With this patch to use autodetection unless buswidth is specified, the device is properly detected again: fsmc-nand 10100000.flash: FSMC device partno 090, manufacturer 80, revision 00, config 00 nand: device found, Manufacturer ID: 0x20, Chip ID: 0xb1 nand: ST Micro NAND 128MiB 1,8V 16-bit nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 fsmc-nand 10100000.flash: Using 1-bit HW ECC scheme Scanning device for bad blocks I don't know where or how this happened, I think some change in the nand core. Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Miaoqian Lin	151bd88859	xtensa: simdisk: add input size check in proc_write_simdisk commit 5d5f08fd0cd970184376bee07d59f635c8403f63 upstream. A malicious user could pass an arbitrarily bad value to memdup_user_nul(), potentially causing kernel crash. This follows the same pattern as commit ee76746387f6 ("netdevsim: prevent bad user input in nsim_dev_health_break_write()") Fixes: `b6c7e873da` ("xtensa: ISS: add host file-based simulated disk") Fixes: `16e5c1fc36` ("convert a bunch of open-coded instances of memdup_user_nul()") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Message-Id: <20250829083015.1992751-1-linmq006@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Ma Ke	3572290dfa	sparc: fix error handling in scan_one_device() commit 302c04110f0ce70d25add2496b521132548cd408 upstream. Once of_device_register() failed, we should call put_device() to decrement reference count for cleanup. Or it could cause memory leak. So fix this by calling put_device(), then the name can be freed in kobject_cleanup(). Calling path: of_device_register() -> of_device_add() -> device_add(). As comment of device_add() says, 'if device_add() succeeds, you should call device_del() when you want to get rid of it. If device_add() has not succeeded, use only put_device() to drop the reference count'. Found by code review. Cc: stable@vger.kernel.org Fixes: `cf44bbc26c` ("[SPARC]: Beginnings of generic of_device framework.") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Anthony Yznaga	612d10ce84	sparc64: fix hugetlb for sun4u commit 6fd44a481b3c6111e4801cec964627791d0f3ec5 upstream. An attempt to exercise sparc hugetlb code in a sun4u-based guest running under qemu results in the guest hanging due to being stuck in a trap loop. This is due to invalid hugetlb TTEs being installed that do not have the expected _PAGE_PMD_HUGE and page size bits set. Although the breakage has gone apparently unnoticed for several years, fix it now so there is the option to exercise sparc hugetlb code under qemu. This can be useful because sun4v support in qemu does not support linux guests currently and sun4v-based hardware resources may not be readily available. Fix tested with a 6.15.2 and 6.16-rc6 kernels by running libhugetlbfs tests on a qemu guest running Debian 13. Fixes: `c7d9f77d33` ("sparc64: Multi-page size support") Cc: stable@vger.kernel.org Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20250716012446.10357-1-anthony.yznaga@oracle.com Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Eric Biggers	ed3044b9c8	sctp: Fix MAC comparison to be constant-time commit dd91c79e4f58fbe2898dac84858033700e0e99fb upstream. To prevent timing attacks, MACs need to be compared in constant time. Use the appropriate helper function for this. Fixes: `bbd0d59809` ("[SCTP]: Implement the receive and verification of AUTH chunk") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20250818205426.30222-3-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Thorsten Blum	0418164564	scsi: hpsa: Fix potential memory leak in hpsa_big_passthru_ioctl() commit b81296591c567b12d3873b05a37b975707959b94 upstream. Replace kmalloc() followed by copy_from_user() with memdup_user() to fix a memory leak that occurs when copy_from_user(buff[sg_used],,) fails and the 'cleanup1:' path does not free the memory for 'buff[sg_used]'. Using memdup_user() avoids this by freeing the memory internally. Since memdup_user() already allocates memory, use kzalloc() in the else branch instead of manually zeroing 'buff[sg_used]' using memset(0). Cc: stable@vger.kernel.org Fixes: `edd163687e` ("[SCSI] hpsa: add driver for HP Smart Array controllers.") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Harshit Agarwal	ebf2b91a09	sched/deadline: Fix race in push_dl_task() commit 8fd5485fb4f3d9da3977fd783fcb8e5452463420 upstream. When a CPU chooses to call push_dl_task and picks a task to push to another CPU's runqueue then it will call find_lock_later_rq method which would take a double lock on both CPUs' runqueues. If one of the locks aren't readily available, it may lead to dropping the current runqueue lock and reacquiring both the locks at once. During this window it is possible that the task is already migrated and is running on some other CPU. These cases are already handled. However, if the task is migrated and has already been executed and another CPU is now trying to wake it up (ttwu) such that it is queued again on the runqeue (on_rq is 1) and also if the task was run by the same CPU, then the current checks will pass even though the task was migrated out and is no longer in the pushable tasks list. Please go through the original rt change for more details on the issue. To fix this, after the lock is obtained inside the find_lock_later_rq, it ensures that the task is still at the head of pushable tasks list. Also removed some checks that are no longer needed with the addition of this new check. However, the new check of pushable tasks list only applies when find_lock_later_rq is called by push_dl_task. For the other caller i.e. dl_task_offline_migration, existing checks are used. Signed-off-by: Harshit Agarwal <harshit@nutanix.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250408045021.3283624-1-harshit@nutanix.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Corey Minyard	f4aab940ae	Revert "ipmi: fix msg stack when IPMI is disconnected" commit 5d09ee1bec870263f4ace439402ea840503b503b upstream. This reverts commit `c608966f3f`. This patch has a subtle bug that can cause the IPMI driver to go into an infinite loop if the BMC misbehaves in a certain way. Apparently certain BMCs do misbehave this way because several reports have come in recently about this. Signed-off-by: Corey Minyard <corey@minyard.net> Tested-by: Eric Hagberg <ehagberg@janestreet.com> Cc: <stable@vger.kernel.org> # 6.2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Jisheng Zhang	dc3a1c6237	pwm: berlin: Fix wrong register in suspend/resume commit 3a4b9d027e4061766f618292df91760ea64a1fcc upstream. The 'enable' register should be BERLIN_PWM_EN rather than BERLIN_PWM_ENABLE, otherwise, the driver accesses wrong address, there will be cpu exception then kernel panic during suspend/resume. Fixes: `bbf0722c1c` ("pwm: berlin: Add suspend/resume support") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20250819114224.31825-1-jszhang@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:52 +02:00
Nam Cao	9f88a6fd97	powerpc/pseries/msi: Fix potential underflow and leak issue commit 3443ff3be6e59b80d74036bb39f5b6409eb23cc9 upstream. pseries_irq_domain_alloc() allocates interrupts at parent's interrupt domain. If it fails in the progress, all allocated interrupts are freed. The number of successfully allocated interrupts so far is stored "i". However, "i - 1" interrupts are freed. This is broken: - One interrupt is not be freed - If "i" is zero, "i - 1" wraps around Correct the number of freed interrupts to 'i'. Fixes: `a5f3d2c17b` ("powerpc/pseries/pci: Add MSI domains") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/a980067f2b256bf716b4cd713bc1095966eed8cd.1754300646.git.namcao@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-19 16:30:51 +02:00

1 2 3 4 5 ...

1237698 Commits