linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-06 02:50:49 +09:00

Author	SHA1	Message	Date
Long Li	794fa8792d	f2fs: fix race in concurrent f2fs_stop_gc_thread [ Upstream commit 7b0033dbc48340a1c1c3f12448ba17d6587ca092 ] In my test case, concurrent calls to f2fs shutdown report the following stack trace: Oops: general protection fault, probably for non-canonical address 0xc6cfff63bb5513fc: 0000 [#1] PREEMPT SMP PTI CPU: 0 UID: 0 PID: 678 Comm: f2fs_rep_shutdo Not tainted 6.12.0-rc5-next-20241029-g6fb2fa9805c5-dirty #85 Call Trace: <TASK> ? show_regs+0x8b/0xa0 ? __die_body+0x26/0xa0 ? die_addr+0x54/0x90 ? exc_general_protection+0x24b/0x5c0 ? asm_exc_general_protection+0x26/0x30 ? kthread_stop+0x46/0x390 f2fs_stop_gc_thread+0x6c/0x110 f2fs_do_shutdown+0x309/0x3a0 f2fs_ioc_shutdown+0x150/0x1c0 __f2fs_ioctl+0xffd/0x2ac0 f2fs_ioctl+0x76/0xe0 vfs_ioctl+0x23/0x60 __x64_sys_ioctl+0xce/0xf0 x64_sys_call+0x2b1b/0x4540 do_syscall_64+0xa7/0x240 entry_SYSCALL_64_after_hwframe+0x76/0x7e The root cause is a race condition in f2fs_stop_gc_thread() called from different f2fs shutdown paths: [CPU0] [CPU1] ---------------------- ----------------------- f2fs_stop_gc_thread f2fs_stop_gc_thread gc_th = sbi->gc_thread gc_th = sbi->gc_thread kfree(gc_th) sbi->gc_thread = NULL < gc_th != NULL > kthread_stop(gc_th->f2fs_gc_task) //UAF The commit c7f114d864ac ("f2fs: fix to avoid use-after-free in f2fs_stop_gc_thread()") attempted to fix this issue by using a read semaphore to prevent races between shutdown and remount threads, but it fails to prevent all race conditions. Fix it by converting to write lock of s_umount in f2fs_do_shutdown(). Fixes: `7950e9ac63` ("f2fs: stop gc/discard thread after fs shutdown") Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:27 +01:00
Siddharth Vadapalli	e466b89987	PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds [ Upstream commit 22a9120479a40a56c13c5e473a0100fad2e017c0 ] According to Section 2.2 of the PCI Express Card Electromechanical Specification (Revision 5.1), in order to ensure that the power and the reference clock are stable, PERST# has to be deasserted after a delay of 100 milliseconds (TPVPERL). Currently, it is being assumed that the power is already stable, which is not necessarily true. Hence, change the delay to PCIE_T_PVPERL_MS to guarantee that power and reference clock are stable. Fixes: `f3e25911a4` ("PCI: j721e: Add TI J721E PCIe driver") Fixes: f96b69713733 ("PCI: j721e: Use T_PERST_CLK_US macro") Link: https://lore.kernel.org/r/20241104074420.1862932-1-s-vadapalli@ti.com Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:27 +01:00
Théo Lebrun	9621a3d5a4	PCI: j721e: Add suspend and resume support [ Upstream commit c538d40f365b5b6d7433d371710f58e8b266fb19 ] Add suspend and resume support. Only the Root Complex mode is supported. During the suspend stage PERST# is asserted, then deasserted during the resume stage. Link: https://lore.kernel.org/linux-pci/20240102-j7200-pcie-s2r-v7-7-a2f9156da6c3@bootlin.com Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> Signed-off-by: Thomas Richard <thomas.richard@bootlin.com> [kwilczynski: commit log, update references to the PCI SIG specification] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:27 +01:00
Thomas Richard	bea0c0e401	PCI: j721e: Use T_PERST_CLK_US macro [ Upstream commit f96b6971373382855bc964f1c067bd6dc41cf0ab ] Use the T_PERST_CLK_US macro, and the fsleep() function instead of usleep_range(). Link: https://lore.kernel.org/linux-pci/20240102-j7200-pcie-s2r-v7-6-a2f9156da6c3@bootlin.com Signed-off-by: Thomas Richard <thomas.richard@bootlin.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:27 +01:00
Théo Lebrun	0a289ca902	PCI: j721e: Add reset GPIO to struct j721e_pcie [ Upstream commit b8600b8791cb2b7c8be894846b1ecddba7291680 ] Add reset GPIO to struct j721e_pcie, so it can be used at suspend and resume stages. Link: https://lore.kernel.org/linux-pci/20240102-j7200-pcie-s2r-v7-4-a2f9156da6c3@bootlin.com Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> Signed-off-by: Thomas Richard <thomas.richard@bootlin.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:27 +01:00
Thomas Richard	762de2993b	PCI: cadence: Set cdns_pcie_host_init() global [ Upstream commit 063c938928dc80c2bfd66f34df48344db22e009b ] During the resume sequence of the host, cdns_pcie_host_init() needs to be called, so set it global. The dev function parameter is removed, as it isn't used. Link: https://lore.kernel.org/linux-pci/20240102-j7200-pcie-s2r-v7-2-a2f9156da6c3@bootlin.com Signed-off-by: Thomas Richard <thomas.richard@bootlin.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:27 +01:00
Thomas Richard	4231df7670	PCI: cadence: Extract link setup sequence from cdns_pcie_host_setup() [ Upstream commit d1b6f2e2ce4d8b17d9f3558c98a1517b864bfd03 ] The function cdns_pcie_host_setup() mixes probe structure and link setup. The link setup must be done during the resume sequence. So extract it from cdns_pcie_host_setup() and create a dedicated function. Link: https://lore.kernel.org/linux-pci/20240102-j7200-pcie-s2r-v7-1-a2f9156da6c3@bootlin.com Signed-off-by: Thomas Richard <thomas.richard@bootlin.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:27 +01:00
Matt Ranostay	5261d258e3	PCI: j721e: Add PCIe 4x lane selection support [ Upstream commit 4490f559f75514d5a6f0e729e85235a7be6216bf ] Add support for setting of two-bit field that allows selection of 4x lane PCIe which was previously limited to only 2x lanes. Link: https://lore.kernel.org/linux-pci/20231128054402.2155183-5-s-vadapalli@ti.com Signed-off-by: Matt Ranostay <mranostay@ti.com> Signed-off-by: Achal Verma <a-verma1@ti.com> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:26 +01:00
Matt Ranostay	7c3bf69109	PCI: j721e: Add per platform maximum lane settings [ Upstream commit 3ac7f14084f54bff9c31573d1ed59d047a34fe03 ] Various platforms have different maximum amount of lanes that can be selected. Add max_lanes to struct j721e_pcie to allow for detection of this which is needed to calculate the needed bitmask size for the possible lane count. Link: https://lore.kernel.org/linux-pci/20231128054402.2155183-4-s-vadapalli@ti.com Signed-off-by: Matt Ranostay <mranostay@ti.com> Signed-off-by: Achal Verma <a-verma1@ti.com> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:26 +01:00
Yoshihiro Shimoda	47203d68f5	PCI: Add T_PVPERL macro [ Upstream commit 164f66be0c2523e65df41b755c41b7c9ff58035a ] According to the PCIe CEM r5.0, sec 2.9.2, Power stable to PERST# inactive interval is 100 ms as minimum. Add a macro so that the PCIe controller drivers can make use of it. Link: https://lore.kernel.org/linux-pci/20231018085631.1121289-2-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Stable-dep-of: 22a9120479a4 ("PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:26 +01:00
Zhiguo Niu	ed16873faf	f2fs: fix to avoid use GC_AT when setting gc_mode as GC_URGENT_LOW or GC_URGENT_MID [ Upstream commit 296b8cb34e65fa93382cf919be5a056f719c9a26 ] If gc_mode is set to GC_URGENT_LOW or GC_URGENT_MID, cost benefit GC approach should be used, but if ATGC is enabled at the same time, Age-threshold approach will be selected, which can only do amount of GC and it is much less than the numbers of CB approach. some traces: f2fs_gc-254:48-396 [007] ..... 2311600.684028: f2fs_gc_begin: dev = (254,48), gc_type = Background GC, no_background_GC = 0, nr_free_secs = 0, nodes = 1053, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0 f2fs_gc-254:48-396 [007] ..... 2311600.684527: f2fs_get_victim: dev = (254,48), type = No TYPE, policy = (Background GC, LFS-mode, Age-threshold), victim = 10, cost = 4294364975, ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 44898 f2fs_gc-254:48-396 [007] ..... 2311600.714835: f2fs_gc_end: dev = (254,48), ret = 0, seg_freed = 0, sec_freed = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0 f2fs_gc-254:48-396 [007] ..... 2311600.714843: f2fs_background_gc: dev = (254,48), wait_ms = 50, prefree = 0, free = 44898 f2fs_gc-254:48-396 [007] ..... 2311600.771785: f2fs_gc_begin: dev = (254,48), gc_type = Background GC, no_background_GC = 0, nr_free_secs = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg: f2fs_gc-254:48-396 [007] ..... 
2311600.772275: f2fs_gc_end: dev = (254,48), ret = -61, seg_freed = 0, sec_freed = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0 Fixes: `0e5e81114d` ("f2fs: add GC_URGENT_LOW mode in gc_urgent") Fixes: `d98af5f455` ("f2fs: introduce gc_urgent_mid mode") Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:26 +01:00
Chao Yu	ecf4e6782b	f2fs: fix to avoid potential deadlock in f2fs_record_stop_reason() [ Upstream commit f10a890308a7cd8794e21f646f09827c6cb4bf5d ] syzbot reports deadlock issue of f2fs as below: ====================================================== WARNING: possible circular locking dependency detected 6.12.0-rc3-syzkaller-00087-gc964ced77262 #0 Not tainted ------------------------------------------------------ kswapd0/79 is trying to acquire lock: ffff888011824088 (&sbi->sb_lock){++++}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2199 [inline] ffff888011824088 (&sbi->sb_lock){++++}-{3:3}, at: f2fs_record_stop_reason+0x52/0x1d0 fs/f2fs/super.c:4068 but task is already holding lock: ffff88804bd92610 (sb_internal#2){.+.+}-{0:0}, at: f2fs_evict_inode+0x662/0x15c0 fs/f2fs/inode.c:842 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (sb_internal#2){.+.+}-{0:0}: lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825 percpu_down_read include/linux/percpu-rwsem.h:51 [inline] __sb_start_write include/linux/fs.h:1716 [inline] sb_start_intwrite+0x4d/0x1c0 include/linux/fs.h:1899 f2fs_evict_inode+0x662/0x15c0 fs/f2fs/inode.c:842 evict+0x4e8/0x9b0 fs/inode.c:725 f2fs_evict_inode+0x1a4/0x15c0 fs/f2fs/inode.c:807 evict+0x4e8/0x9b0 fs/inode.c:725 dispose_list fs/inode.c:774 [inline] prune_icache_sb+0x239/0x2f0 fs/inode.c:963 super_cache_scan+0x38c/0x4b0 fs/super.c:223 do_shrink_slab+0x701/0x1160 mm/shrinker.c:435 shrink_slab+0x1093/0x14d0 mm/shrinker.c:662 shrink_one+0x43b/0x850 mm/vmscan.c:4818 shrink_many mm/vmscan.c:4879 [inline] lru_gen_shrink_node mm/vmscan.c:4957 [inline] shrink_node+0x3799/0x3de0 mm/vmscan.c:5937 kswapd_shrink_node mm/vmscan.c:6765 [inline] balance_pgdat mm/vmscan.c:6957 [inline] kswapd+0x1ca3/0x3700 mm/vmscan.c:7226 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 -> #1 (fs_reclaim){+.+.}-{0:0}: 
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825 __fs_reclaim_acquire mm/page_alloc.c:3834 [inline] fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3848 might_alloc include/linux/sched/mm.h:318 [inline] prepare_alloc_pages+0x147/0x5b0 mm/page_alloc.c:4493 __alloc_pages_noprof+0x16f/0x710 mm/page_alloc.c:4722 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 alloc_pages_noprof mm/mempolicy.c:2345 [inline] folio_alloc_noprof+0x128/0x180 mm/mempolicy.c:2352 filemap_alloc_folio_noprof+0xdf/0x500 mm/filemap.c:1010 do_read_cache_folio+0x2eb/0x850 mm/filemap.c:3787 read_mapping_folio include/linux/pagemap.h:1011 [inline] f2fs_commit_super+0x3c0/0x7d0 fs/f2fs/super.c:4032 f2fs_record_stop_reason+0x13b/0x1d0 fs/f2fs/super.c:4079 f2fs_handle_critical_error+0x2ac/0x5c0 fs/f2fs/super.c:4174 f2fs_write_inode+0x35f/0x4d0 fs/f2fs/inode.c:785 write_inode fs/fs-writeback.c:1503 [inline] __writeback_single_inode+0x711/0x10d0 fs/fs-writeback.c:1723 writeback_single_inode+0x1f3/0x660 fs/fs-writeback.c:1779 sync_inode_metadata+0xc4/0x120 fs/fs-writeback.c:2849 f2fs_release_file+0xa8/0x100 fs/f2fs/file.c:1941 __fput+0x23f/0x880 fs/file_table.c:431 task_work_run+0x24f/0x310 kernel/task_work.c:228 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop kernel/entry/common.c:114 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x168/0x370 kernel/entry/common.c:218 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&sbi->sb_lock){++++}-{3:3}: check_prev_add kernel/locking/lockdep.c:3161 [inline] check_prevs_add kernel/locking/lockdep.c:3280 [inline] validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904 __lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825 down_write+0x99/0x220 kernel/locking/rwsem.c:1577 
f2fs_down_write fs/f2fs/f2fs.h:2199 [inline] f2fs_record_stop_reason+0x52/0x1d0 fs/f2fs/super.c:4068 f2fs_handle_critical_error+0x2ac/0x5c0 fs/f2fs/super.c:4174 f2fs_evict_inode+0xa61/0x15c0 fs/f2fs/inode.c:883 evict+0x4e8/0x9b0 fs/inode.c:725 f2fs_evict_inode+0x1a4/0x15c0 fs/f2fs/inode.c:807 evict+0x4e8/0x9b0 fs/inode.c:725 dispose_list fs/inode.c:774 [inline] prune_icache_sb+0x239/0x2f0 fs/inode.c:963 super_cache_scan+0x38c/0x4b0 fs/super.c:223 do_shrink_slab+0x701/0x1160 mm/shrinker.c:435 shrink_slab+0x1093/0x14d0 mm/shrinker.c:662 shrink_one+0x43b/0x850 mm/vmscan.c:4818 shrink_many mm/vmscan.c:4879 [inline] lru_gen_shrink_node mm/vmscan.c:4957 [inline] shrink_node+0x3799/0x3de0 mm/vmscan.c:5937 kswapd_shrink_node mm/vmscan.c:6765 [inline] balance_pgdat mm/vmscan.c:6957 [inline] kswapd+0x1ca3/0x3700 mm/vmscan.c:7226 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 other info that might help us debug this: Chain exists of: &sbi->sb_lock --> fs_reclaim --> sb_internal#2 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(sb_internal#2); lock(fs_reclaim); lock(sb_internal#2); lock(&sbi->sb_lock); Root cause is there will be potential deadlock in between below tasks: Thread A Kswapd - f2fs_ioc_commit_atomic_write - mnt_want_write_file -- down_read lock A - balance_pgdat - __fs_reclaim_acquire -- lock B - shrink_node - prune_icache_sb - dispose_list - f2fs_evict_inode - sb_start_intwrite -- down_read lock A - f2fs_do_sync_file - f2fs_write_inode - f2fs_handle_critical_error - f2fs_record_stop_reason - f2fs_commit_super - read_mapping_folio - filemap_alloc_folio_noprof - fs_reclaim_acquire -- lock B Both threads try to acquire read lock of lock A, then its upcoming write lock grabber will trigger deadlock. 
Let's always create an asynchronous task in f2fs_handle_critical_error() rather than calling f2fs_record_stop_reason() synchronously to avoid this potential deadlock issue. Fixes: `b62e71be21` ("f2fs: support errors=remount-ro|continue|panic mountoption") Reported-by: syzbot+be4a9983e95a5e25c8d3@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6704d667.050a0220.1e4d62.0081.GAE@google.com Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:26 +01:00
Yongpeng Yang	67f4c66460	f2fs: check curseg->inited before write_sum_page in change_curseg [ Upstream commit 43563069e1c1df417d2eed6eca8a22fc6b04691d ] In the __f2fs_init_atgc_curseg->get_atssr_segment calling, curseg->segno is NULL_SEGNO, indicating that there is no summary block that needs to be written. Fixes: `093749e296` ("f2fs: support age threshold based garbage collection") Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:26 +01:00
LongPing Wei	f3d586b7ab	f2fs: fix the wrong f2fs_bug_on condition in f2fs_do_replace_block [ Upstream commit c3af1f13476ec23fd99c98d060a89be28c1e8871 ] This f2fs_bug_on was introduced by commit `2c1905042c` ("f2fs: check segment type in __f2fs_replace_block") when there were only 6 curseg types. After commit `d0b9e42ab6` ("f2fs: introduce inmem curseg") was introduced, the condition should be changed to checking curseg->seg_type. Fixes: `d0b9e42ab6` ("f2fs: introduce inmem curseg") Signed-off-by: LongPing Wei <weilongping@oppo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:26 +01:00
Arnaldo Carvalho de Melo	aac3361f6d	perf ftrace latency: Fix unit on histogram first entry when using --use-nsec [ Upstream commit 064d569e20e82c065b1dec9d20c29c7087bb1a00 ] The use_nsec arg wasn't being taken into account when printing the first histogram entry, fix it: root@number:~# perf ftrace latency --use-nsec -T switch_mm_irqs_off -a sleep 2 # DURATION | COUNT | GRAPH | 0 - 1 us | 0 | | 1 - 2 ns | 0 | | 2 - 4 ns | 0 | | 4 - 8 ns | 0 | | 8 - 16 ns | 0 | | 16 - 32 ns | 0 | | 32 - 64 ns | 125 | | 64 - 128 ns | 335 | | 128 - 256 ns | 2155 | #### | 256 - 512 ns | 9996 | ################### | 512 - 1024 ns | 4958 | ######### | 1 - 2 us | 4636 | ######### | 2 - 4 us | 1053 | ## | 4 - 8 us | 15 | | 8 - 16 us | 1 | | 16 - 32 us | 0 | | 32 - 64 us | 0 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - ... ms | 0 | | root@number:~# After: root@number:~# perf ftrace latency --use-nsec -T switch_mm_irqs_off -a sleep 2 # DURATION | COUNT | GRAPH | 0 - 1 ns | 0 | | 1 - 2 ns | 0 | | 2 - 4 ns | 0 | | 4 - 8 ns | 0 | | 8 - 16 ns | 0 | | 16 - 32 ns | 0 | | 32 - 64 ns | 19 | | 64 - 128 ns | 94 | | 128 - 256 ns | 2191 | #### | 256 - 512 ns | 9719 | #################### | 512 - 1024 ns | 5330 | ########### | 1 - 2 us | 4104 | ######## | 2 - 4 us | 807 | # | 4 - 8 us | 9 | | 8 - 16 us | 0 | | 16 - 32 us | 0 | | 32 - 64 us | 0 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - ... ms | 0 | | root@number:~# Fixes: `84005bb614` ("perf ftrace latency: Add -n/--use-nsec option") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/ZyE3frB-hMXHCnMO@x1 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:25 +01:00
Ilpo Järvinen	b6b896c2fd	PCI: cpqphp: Fix PCIBIOS_* return value confusion [ Upstream commit e2226dbc4a4919d9c8bd9293299b532090bdf020 ] Code in and related to PCI_RefinedAccessConfig() has three types of return type confusion: - PCI_RefinedAccessConfig() tests pci_bus_read_config_dword() return value against -1. - PCI_RefinedAccessConfig() returns both -1 and PCIBIOS_* return codes. - Callers of PCI_RefinedAccessConfig() only test for -1. Make PCI_RefinedAccessConfig() return PCIBIOS_* codes consistently and adapt callers accordingly. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/20241022091140.3504-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:25 +01:00
weiyufeng	f974480cf3	PCI: cpqphp: Use PCI_POSSIBLE_ERROR() to check config reads [ Upstream commit 87d5403378cccc557af9e02a8a2c8587ad8b7e9a ] Use PCI_POSSIBLE_ERROR() to check the response we get when we read data from hardware. This unifies PCI error response checking and makes error checks consistent and easier to find. Link: https://lore.kernel.org/r/20240806065050.28725-1-412574090@163.com Signed-off-by: weiyufeng <weiyufeng@kylinos.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Stable-dep-of: e2226dbc4a49 ("PCI: cpqphp: Fix PCIBIOS_* return value confusion") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:25 +01:00
Paolo Bonzini	b7c4121a43	rust: macros: fix documentation of the paste! macro [ Upstream commit 15541c9263ce34ff95a06bc68f45d9bc5c990bcd ] One of the example in this section uses a curious mix of the constant and function declaration syntaxes; fix it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Fixes: `823d4737d4` ("rust: macros: add `paste!` proc macro") Link: https://lore.kernel.org/r/20241019072208.1016707-1-pbonzini@redhat.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:25 +01:00
Leo Yan	cbc853c490	perf probe: Correct demangled symbols in C++ program [ Upstream commit 314909f13cc12d47c468602c37dace512d225eeb ] An issue can be observed when probe C++ demangled symbol with steps: # nm test_cpp_mangle | grep print_data 0000000000000c94 t _GLOBAL__sub_I__Z10print_datai 0000000000000afc T _Z10print_datai 0000000000000b38 T _Z10print_dataR5Point # perf probe -x /home/niayan01/test_cpp_mangle -F --demangle ... print_data(Point&) print_data(int) ... # perf --debug verbose=3 probe -x test_cpp_mangle --add "test=print_data(int)" probe-definition(0): test=print_data(int) symbol:print_data(int) file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Open Debuginfo file: /home/niayan01/test_cpp_mangle Try to find probe point from debuginfo. Symbol print_data(int) address found : afc Matched function: print_data [2ccf] Probe point found: print_data+0 Found 1 probe_trace_events. Opening /sys/kernel/tracing//uprobe_events write=1 Opening /sys/kernel/tracing//README write=0 Writing event: p:probe_test_cpp_mangle/test /home/niayan01/test_cpp_mangle:0xb38 ... When tried to probe symbol "print_data(int)", the log shows: Symbol print_data(int) address found : afc The found address is 0xafc - which is right with verifying the output result from nm. Afterwards when write event, the command uses offset 0xb38 in the last log, which is a wrong address. The dwarf_diename() gets a common function name, in above case, it returns string "print_data". As a result, the tool parses the offset based on the common name. This leads to probe at the wrong symbol "print_data(Point&)". To fix the issue, use the die_get_linkage_name() function to retrieve the distinct linkage name - this is the mangled name for the C++ case. Based on this unique name, the tool can get a correct offset for probing. Based on DWARF doc, it is possible the linkage name is missed in the DIE, it rolls back to use dwarf_diename(). 
After: # perf --debug verbose=3 probe -x test_cpp_mangle --add "test=print_data(int)" probe-definition(0): test=print_data(int) symbol:print_data(int) file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Open Debuginfo file: /home/niayan01/test_cpp_mangle Try to find probe point from debuginfo. Symbol print_data(int) address found : afc Matched function: print_data [2d06] Probe point found: print_data+0 Found 1 probe_trace_events. Opening /sys/kernel/tracing//uprobe_events write=1 Opening /sys/kernel/tracing//README write=0 Writing event: p:probe_test_cpp_mangle/test /home/niayan01/test_cpp_mangle:0xafc Added new event: probe_test_cpp_mangle:test (on print_data(int) in /home/niayan01/test_cpp_mangle) You can now use it in all perf tools, such as: perf record -e probe_test_cpp_mangle:test -aR sleep 1 # perf --debug verbose=3 probe -x test_cpp_mangle --add "test2=print_data(Point&)" probe-definition(0): test2=print_data(Point&) symbol:print_data(Point&) file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Open Debuginfo file: /home/niayan01/test_cpp_mangle Try to find probe point from debuginfo. Symbol print_data(Point&) address found : b38 Matched function: print_data [2ccf] Probe point found: print_data+0 Found 1 probe_trace_events. 
Opening /sys/kernel/tracing//uprobe_events write=1 Parsing probe_events: p:probe_test_cpp_mangle/test /home/niayan01/test_cpp_mangle:0x0000000000000afc Group:probe_test_cpp_mangle Event:test probe:p Opening /sys/kernel/tracing//README write=0 Writing event: p:probe_test_cpp_mangle/test2 /home/niayan01/test_cpp_mangle:0xb38 Added new event: probe_test_cpp_mangle:test2 (on print_data(Point&) in /home/niayan01/test_cpp_mangle) You can now use it in all perf tools, such as: perf record -e probe_test_cpp_mangle:test2 -aR sleep 1 Fixes: `fb1587d869` ("perf probe: List probes with line number and file name") Signed-off-by: Leo Yan <leo.yan@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20241012141432.877894-1-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:25 +01:00
Ian Rogers	2c6f6c3843	perf probe: Fix libdw memory leak [ Upstream commit 4585038b8e186252141ef86e9f0d8e97f11dce8d ] Add missing dwarf_cfi_end to free memory associated with probe_finder cfi_eh which is allocated and owned via a call to dwarf_getcfi_elf. Confusingly cfi_dbg shouldn't be freed as its memory is owned by the passed in debuginfo struct. Add comments to highlight this. This addresses leak sanitizer issues seen in: tools/perf/tests/shell/test_uprobe_from_different_cu.sh Fixes: `270bde1e76` ("perf probe: Search both .eh_frame and .debug_frame sections for probe location") Signed-off-by: Ian Rogers <irogers@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20241016235622.52166-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:25 +01:00
Chao Yu	f1b8bfe8d2	f2fs: fix to account dirty data in __get_secs_required() [ Upstream commit 1acd73edbbfef2c3c5b43cba4006a7797eca7050 ] It will trigger system panic w/ testcase in [1]: ------------[ cut here ]------------ kernel BUG at fs/f2fs/segment.c:2752! RIP: 0010:new_curseg+0xc81/0x2110 Call Trace: f2fs_allocate_data_block+0x1c91/0x4540 do_write_page+0x163/0xdf0 f2fs_outplace_write_data+0x1aa/0x340 f2fs_do_write_data_page+0x797/0x2280 f2fs_write_single_data_page+0x16cd/0x2190 f2fs_write_cache_pages+0x994/0x1c80 f2fs_write_data_pages+0x9cc/0xea0 do_writepages+0x194/0x7a0 filemap_fdatawrite_wbc+0x12b/0x1a0 __filemap_fdatawrite_range+0xbb/0xf0 file_write_and_wait_range+0xa1/0x110 f2fs_do_sync_file+0x26f/0x1c50 f2fs_sync_file+0x12b/0x1d0 vfs_fsync_range+0xfa/0x230 do_fsync+0x3d/0x80 __x64_sys_fsync+0x37/0x50 x64_sys_call+0x1e88/0x20d0 do_syscall_64+0x4b/0x110 entry_SYSCALL_64_after_hwframe+0x76/0x7e The root cause is if checkpoint_disabling and lfs_mode are both on, it will trigger OPU for all overwritten data, it may cost more free segment than expected, so f2fs must account those data correctly to calculate cosumed free segments later, and return ENOSPC earlier to avoid run out of free segment during block allocation. [1] https://lore.kernel.org/fstests/20241015025106.3203676-1-chao@kernel.org/ Fixes: `4354994f09` ("f2fs: checkpoint disabling") Cc: Daniel Rosenberg <drosen@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:25 +01:00
Qi Han	6b0ed65c94	f2fs: compress: fix inconsistent update of i_blocks in release_compress_blocks and reserve_compress_blocks [ Upstream commit 26413ce18e85de3dda2cd3d72c3c3e8ab8f4f996 ] After release a file and subsequently reserve it, the FSCK flag is set when the file is deleted, as shown in the following backtrace: F2FS-fs (dm-48): Inconsistent i_blocks, ino:401231, iblocks:1448, sectors:1472 fs_rec_info_write_type+0x58/0x274 f2fs_rec_info_write+0x1c/0x2c set_sbi_flag+0x74/0x98 dec_valid_block_count+0x150/0x190 f2fs_truncate_data_blocks_range+0x2d4/0x3cc f2fs_do_truncate_blocks+0x2fc/0x5f0 f2fs_truncate_blocks+0x68/0x100 f2fs_truncate+0x80/0x128 f2fs_evict_inode+0x1a4/0x794 evict+0xd4/0x280 iput+0x238/0x284 do_unlinkat+0x1ac/0x298 __arm64_sys_unlinkat+0x48/0x68 invoke_syscall+0x58/0x11c For clusters of the following type, i_blocks are decremented by 1 and i_compr_blocks are incremented by 7 in release_compress_blocks, while updates to i_blocks and i_compr_blocks are skipped in reserve_compress_blocks. raw node: D D D D D D D D after compress: C D D D D D D D after reserve: C D D D D D D D Let's update i_blocks and i_compr_blocks properly in reserve_compress_blocks. Fixes: eb8fbaa53374 ("f2fs: compress: fix to check unreleased compressed cluster") Signed-off-by: Qi Han <hanqi@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:24 +01:00
Veronika Molnarova	9ac8d66362	perf test attr: Add back missing topdown events [ Upstream commit 6bff76af9635411214ca44ea38fc2781e78064b6 ] With the patch `0b6c5371c0` "Add missing topdown metrics events" eight topdown metric events with numbers ranging from 0x8000 to 0x8700 were added to the test since they were added as 'perf stat' default events. Later the patch `951efb9976` "Update no event/metric expectations" kept only 4 of those events(0x8000-0x8300). Currently, the topdown events with numbers 0x8400 to 0x8700 are missing from the list of expected events resulting in a failure. Add back the missing topdown events. Fixes: `951efb9976` ("perf test attr: Update no event/metric expectations") Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Tested-by: Ian Rogers <irogers@google.com> Cc: mpetlan@redhat.com Link: https://lore.kernel.org/r/20240311081611.7835-1-vmolnaro@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:24 +01:00
Michael Petlan	0c47534539	perf trace: Keep exited threads for summary [ Upstream commit d29d92df410e2fb523f640478b18f70c1823e55e ] Since 9ffa6c7512ca ("perf machine thread: Remove exited threads by default") perf cleans exited threads up, but as said, sometimes they are necessary to be kept. The mentioned commit does not cover all the cases, we also need the information to construct the summary table in perf-trace. Before: # perf trace -s true Summary of events: After: # perf trace -s -- true Summary of events: true (383382), 64 events, 91.4% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ mmap 8 0 0.150 0.013 0.019 0.031 11.90% mprotect 3 0 0.045 0.014 0.015 0.017 6.47% openat 2 0 0.014 0.006 0.007 0.007 9.73% munmap 1 0 0.009 0.009 0.009 0.009 0.00% access 1 1 0.009 0.009 0.009 0.009 0.00% pread64 4 0 0.006 0.001 0.001 0.002 4.53% fstat 2 0 0.005 0.001 0.002 0.003 37.59% arch_prctl 2 1 0.003 0.001 0.002 0.002 25.91% read 1 0 0.003 0.003 0.003 0.003 0.00% close 2 0 0.003 0.001 0.001 0.001 3.86% brk 1 0 0.002 0.002 0.002 0.002 0.00% rseq 1 0 0.001 0.001 0.001 0.001 0.00% prlimit64 1 0 0.001 0.001 0.001 0.001 0.00% set_robust_list 1 0 0.001 0.001 0.001 0.001 0.00% set_tid_address 1 0 0.001 0.001 0.001 0.001 0.00% execve 1 0 0.000 0.000 0.000 0.000 0.00% [namhyung: simplified the condition] Fixes: 9ffa6c7512ca ("perf machine thread: Remove exited threads by default") Reported-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Michael Petlan <mpetlan@redhat.com> Link: https://lore.kernel.org/r/20240927151926.399474-1-mpetlan@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:24 +01:00
Ian Rogers	380bc5a698	perf stat: Fix affinity memory leaks on error path [ Upstream commit 7f6ccb70e465bd8c9cf8973aee1c01224e4bdb3c ] Missed cleanup when an error occurs. Fixes: `49de179577` ("perf stat: No need to setup affinities when starting a workload") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241001052327.7052-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:24 +01:00
Levi Yun	035c6b7a13	perf stat: Close cork_fd when create_perf_stat_counter() failed [ Upstream commit e880a70f8046df0dd9089fa60dcb866a2cc69194 ] When create_perf_stat_counter() failed, it doesn't close workload.cork_fd open in evlist__prepare_workload(). This could make too many open file error while __run_perf_stat() repeats. Introduce evlist__cancel_workload to close workload.cork_fd and wait workload.child_pid until exit to clear child process when create_perf_stat_counter() is failed. Signed-off-by: Levi Yun <yeoreum.yun@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: nd@arm.com Cc: howardchu95@gmail.com Link: https://lore.kernel.org/r/20240925132022.2650180-2-yeoreum.yun@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Stable-dep-of: 7f6ccb70e465 ("perf stat: Fix affinity memory leaks on error path") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:24 +01:00
Todd Kjos	8e098baf6b	PCI: Fix reset_method_store() memory leak [ Upstream commit 2985b1844f3f3447f2d938eff1ef6762592065a5 ] In reset_method_store(), a string is allocated via kstrndup() and assigned to the local "options". options is then used in with strsep() to find spaces: while ((name = strsep(&options, " ")) != NULL) { If there are no remaining spaces, then options is set to NULL by strsep(), so the subsequent kfree(options) doesn't free the memory allocated via kstrndup(). Fix by using a separate tmp_options to iterate with strsep() so options is preserved. Link: https://lore.kernel.org/r/20241001231147.3583649-1-tkjos@google.com Fixes: `d88f521da3` ("PCI: Allow userspace to query and set device reset mechanism") Signed-off-by: Todd Kjos <tkjos@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:24 +01:00
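The strsep() pitfall described in the reset_method_store() entry above can be reproduced in userspace. The following is only a minimal sketch, not the kernel code: strdup() stands in for kstrndup(), free() for kfree(), and the helper names count_tokens() and cursor_is_null_after_loop() are hypothetical. It shows that strsep() leaves its cursor NULL once the last token has been consumed, which is why a separate cursor is needed to keep the owning pointer valid for the final free.

```c
#define _DEFAULT_SOURCE /* exposes strsep() and strdup() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: count the space-separated tokens in buf.
 * "options" keeps ownership of the allocation; "tmp_options" is the
 * cursor that strsep() advances and finally sets to NULL.
 * Returns the token count, or -1 if the allocation fails.
 */
static int count_tokens(const char *buf)
{
	char *options, *tmp_options, *name;
	int n = 0;

	assert(buf != NULL);
	options = strdup(buf);
	if (!options)
		return -1;

	tmp_options = options;
	while ((name = strsep(&tmp_options, " ")) != NULL) {
		if (*name == '\0')
			continue;	/* skip empty tokens from repeated spaces */
		n++;
	}

	free(options);	/* still points at the allocation, so nothing leaks */
	return n;
}

/*
 * Hypothetical helper: returns 1 if the strsep() cursor is NULL after
 * the loop. Passing that cursor to free() would be a no-op and the
 * buffer would leak, which is the bug the commit describes.
 */
static int cursor_is_null_after_loop(const char *buf)
{
	char *options, *cursor;
	int is_null;

	assert(buf != NULL);
	options = strdup(buf);
	if (!options)
		return -1;

	cursor = options;
	while (strsep(&cursor, " ") != NULL)
		;
	is_null = (cursor == NULL);

	free(options);
	return is_null;
}
```

Calling count_tokens("flr bus pm") walks three tokens and frees the buffer through the untouched owning pointer; cursor_is_null_after_loop() confirms that the cursor itself can no longer be used for the free.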
Andreas Gruenbacher	e30cab288c	gfs2: Fix unlinked inode cleanup [ Upstream commit 7c6f714d88475ceae5342264858a641eafa19632 ] Before commit `f0e56edc2e` ("gfs2: Split the two kinds of glock "delete" work"), function delete_work_func() was used to trigger the eviction of in-memory inodes from remote as well as deleting unlinked inodes at a later point. These two kinds of work were then split into two kinds of work, and the two places in the code were deferred deletion of inodes is required accidentally ended up queuing the wrong kind of work. This caused unlinked inodes to be left behind, which could in the worst case fill up filesystems and require a filesystem check to recover. Fix that by queuing the right kind of work in try_rgrp_unlink() and gfs2_drop_inode(). Fixes: `f0e56edc2e` ("gfs2: Split the two kinds of glock "delete" work") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:24 +01:00
Andreas Gruenbacher	8264963475	gfs2: Allow immediate GLF_VERIFY_DELETE work [ Upstream commit 160bc9555d8654464cbbd7bb1f6687048471d2f6 ] Add an argument to gfs2_queue_verify_delete() that allows it to queue GLF_VERIFY_DELETE work for immediate execution. This is used in the next patch. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:23 +01:00
Andreas Gruenbacher	4389447f1b	gfs2: Rename GLF_VERIFY_EVICT to GLF_VERIFY_DELETE [ Upstream commit 820ce8ed53ce2111aa5171f7349f289d7e9d0693 ] Rename the GLF_VERIFY_EVICT flag to GLF_VERIFY_DELETE: that flag indicates that we want to delete an inode / verify that it has been deleted. To match, rename gfs2_queue_verify_evict() to gfs2_queue_verify_delete(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:23 +01:00
Andreas Gruenbacher	39822f7f49	gfs2: Replace gfs2_glock_queue_put with gfs2_glock_put_async [ Upstream commit ee2be7d7c7f32783f60ee5fe59b91548a4571f10 ] Function gfs2_glock_queue_put() puts a glock reference by enqueuing glock work instead of putting the reference directly. This ensures that the operation won't sleep, but it is costly and really only necessary when putting the final glock reference. Replace it with a new gfs2_glock_put_async() function that only queues glock work when putting the last glock reference. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:23 +01:00
Andreas Gruenbacher	67696fef78	gfs2: Get rid of gfs2_glock_queue_put in signal_our_withdraw [ Upstream commit f80d882edcf242d0256d9e51b09d5fb7a3a0d3b4 ] In function signal_our_withdraw(), we are calling gfs2_glock_queue_put() in a context in which we are actually allowed to sleep, so replace that with a simple call to gfs2_glock_put(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:23 +01:00
James Clark	300b218862	perf cs-etm: Don't flush when packet_queue fills up [ Upstream commit 5afd032961e8465808c4bc385c06e7676fbe1951 ] cs_etm__flush(), like cs_etm__sample() is an operation that generates a sample and then swaps the current with the previous packet. Calling flush after processing the queues results in two swaps which corrupts the next sample. Therefore it wasn't appropriate to call flush here so remove it. Flushing is still done on a discontinuity to explicitly clear the last branch buffer, but when the packet_queue fills up before reaching a timestamp, that's not a discontinuity and the call to cs_etm__process_traceid_queue() already generated samples and drained the buffers correctly. This is visible by looking for a branch that has the same target as the previous branch and the following source is before the address of the last target, which is impossible as execution would have had to have gone backwards: ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94 (packet_queue fills here before a timestamp, resulting in a flush and branch target ffff80008011cadc is duplicated.) ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff80008011cadc update_sg_lb_stats+0x94 ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34 After removing the flush the correct branch target is used for the second sample, and ffff8000801117c4 is no longer before the previous address: ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94 ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff8000801117a0 cpu_util+0x0 ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34 Make sure that a final branch stack is output at the end of the trace by calling cs_etm__end_block(). This is already done for both the timeless decode paths. 
Fixes: `21fe8dc119` ("perf cs-etm: Add support for CPU-wide trace scenarios") Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/ Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ruidong Tian <tianruidong@linux.alibaba.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Cc: John Garry <john.g.garry@oracle.com> Cc: scclevenger@os.amperecomputing.com Link: https://lore.kernel.org/r/20240916135743.1490403-2-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:23 +01:00
Dan Carpenter	39e5f390c2	mailbox: arm_mhuv2: clean up loop in get_irq_chan_comb() [ Upstream commit 192a16a3430ca459c4e986f3d10758c4d6b1aa29 ] Both the inner and outer loops in this code use the "i" iterator. The inner loop should really use a different iterator. It doesn't affect things in practice because the data comes from the device tree. The "protocol" and "windows" variables are going to be zero. That means we're always going to hit the "return &chans[channel];" statement and we're not going to want to iterate through the outer loop again. Still it's worth fixing this for future use cases. Fixes: `5a6338cce9` ("mailbox: arm_mhuv2: Add driver") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:23 +01:00
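The iterator clean-up described in the arm_mhuv2 entry above follows a general pattern: an inner loop gets its own index so that it cannot disturb the outer loop's position. The sketch below is an illustration under assumptions, not the driver code; first_set_bit() and its flat-index return value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: scan "rows" 32-bit status words and return the
 * flat index (row * 32 + bit) of the first set bit, or -1 if none is set.
 * The outer loop uses "i" and the inner loop uses "bit"; reusing "i" for
 * both would overwrite the row index whenever the inner loop ran.
 */
static int first_set_bit(const uint32_t *status, size_t rows)
{
	assert(status != NULL || rows == 0);

	for (size_t i = 0; i < rows; i++) {
		if (!status[i])
			continue;

		for (unsigned int bit = 0; bit < 32; bit++) {
			if (status[i] & (UINT32_C(1) << bit))
				return (int)(i * 32 + bit);
		}
	}

	return -1;
}
```

With separate indices the function stays correct even if a later change lets the outer loop continue after the inner one finishes, which is the "future use cases" concern the commit message raises.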
Paul Aurich	ebe0f8dc24	smb: cached directories can be more than root file handle [ Upstream commit 128630e1dbec8074c7707aad107299169047e68f ] Update this log message since cached fids may represent things other than the root of a mount. Fixes: `e4029e0726` ("cifs: find and use the dentry for cached non-root directories also") Signed-off-by: Paul Aurich <paul@darkrain42.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:23 +01:00
zhang jiao	f65f4ad877	pinctrl: k210: Undef K210_PC_DEFAULT [ Upstream commit 7e86490c5dee5c41a55f32d0dc34269e200e6909 ] When the temporary macro K210_PC_DEFAULT is not needed anymore, use its name in the #undef statement instead of the incorrect "DEFAULT" name. Fixes: `d4c34d09ab` ("pinctrl: Add RISC-V Canaan Kendryte K210 FPIOA driver") Signed-off-by: zhang jiao <zhangjiao2@cmss.chinamobile.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/20241113071201.5440-1-zhangjiao2@cmss.chinamobile.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Konrad Dybcio	0dffdb2e70	arm64: dts: qcom: sc8180x: Add a SoC-specific compatible to cpufreq-hw [ Upstream commit 5df30684415d5a902f23862ab5bbed2a2df7fbf1 ] Comply with bindings guidelines and get rid of errors such as: cpufreq@18323000: compatible: 'oneOf' conditional failed, one must be fixed: ['qcom,cpufreq-hw'] is too short Fixes: `8575f197b0` ("arm64: dts: qcom: Introduce the SC8180x platform") Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Nuno Sa	118aa7caca	clk: clk-axi-clkgen: make sure to enable the AXI bus clock [ Upstream commit c64ef7e4851d1a9abbb7f7833e4936973ac5ba79 ] In order to access the registers of the HW, we need to make sure that the AXI bus clock is enabled. Hence let's increase the number of clocks by one. In order to keep backward compatibility and make sure old DTs still work we check if clock-names is available or not. If it is, then we can disambiguate between really having the AXI clock or a parent clock and so we can enable the bus clock. If not, we fallback to what was done before and don't explicitly enable the AXI bus clock. Note that if clock-names is given, the axi clock must be the last one in the phandle array (also enforced in the DT bindings) so that we can reuse as much code as possible. Fixes: `0e646c52cf` ("clk: Add axi-clkgen driver") Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20241029-axi-clkgen-fix-axiclk-v2-2-bc5e0733ad76@analog.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Nuno Sa	abdf848ce5	dt-bindings: clock: axi-clkgen: include AXI clk [ Upstream commit 47f3f5a82a31527e027929c5cec3dd1ef5ef30f5 ] In order to access the registers of the HW, we need to make sure that the AXI bus clock is enabled. Hence let's increase the number of clocks by one and add clock-names to differentiate between parent clocks and the bus clock. Fixes: `0e646c52cf` ("clk: Add axi-clkgen driver") Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20241029-axi-clkgen-fix-axiclk-v2-1-bc5e0733ad76@analog.com Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Sergio Paracuellos	fbb13732c6	clk: ralink: mtmips: fix clocks probe order in oldest ralink SoCs [ Upstream commit d34db686a3d74bd564bfce2ada15011c556269fc ] Base clocks are the first in being probed and are real dependencies of the rest of fixed, factor and peripheral clocks. For old ralink SoCs RT2880, RT305x and RT3883 'xtal' must be defined first since in any other case, when fixed clocks are probed they are delayed until 'xtal' is probed so the following warning appears: WARNING: CPU: 0 PID: 0 at drivers/clk/ralink/clk-mtmips.c:499 rt3883_bus_recalc_rate+0x98/0x138 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.43 #0 Stack : 805e58d0 00000000 00000004 8004f950 00000000 00000004 00000000 00000000 80669c54 80830000 80700000 805ae570 80670068 00000001 80669bf8 00000000 00000000 00000000 805ae570 80669b38 00000020 804db7dc 00000000 00000000 203a6d6d 80669b78 80669e48 70617773 00000000 805ae570 00000000 00000009 00000000 00000001 00000004 00000001 00000000 00000000 83fe43b0 00000000 ... Call Trace: [<800065d0>] show_stack+0x64/0xf4 [<804bca14>] dump_stack_lvl+0x38/0x60 [<800218ac>] __warn+0x94/0xe4 [<8002195c>] warn_slowpath_fmt+0x60/0x94 [<80259ff8>] rt3883_bus_recalc_rate+0x98/0x138 [<80254530>] __clk_register+0x568/0x688 [<80254838>] of_clk_hw_register+0x18/0x2c [<8070b910>] rt2880_clk_of_clk_init_driver+0x18c/0x594 [<8070b628>] of_clk_init+0x1c0/0x23c [<806fc448>] plat_time_init+0x58/0x18c [<806fdaf0>] time_init+0x10/0x6c [<806f9bc4>] start_kernel+0x458/0x67c ---[ end trace 0000000000000000 ]--- When this driver was mainlined we could not find any active users of old ralink SoCs so we cannot perform any real tests for them. 
Now, one user of a Belkin f9k1109 version 1 device which uses RT3883 SoC appeared and reported some issues in openWRT: - https://github.com/openwrt/openwrt/issues/16054 Thus, define a 'rt2880_xtal_recalc_rate()' just returning the expected frequency 40 MHz and use it along the old ralink SoCs to have a correct boot trace with no warnings and a working clock plan from the beginning. Fixes: `6f3b15586e` ("clk: ralink: add clock and reset driver for MTMIPS SoCs") Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Link: https://lore.kernel.org/r/20240910044024.120009-3-sergio.paracuellos@gmail.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Sergio Paracuellos	f85a1d06af	clk: ralink: mtmips: fix clock plan for Ralink SoC RT3883 [ Upstream commit 33239152305567b3e9bf052f71fd4baecd626341 ] Clock plan for Ralink SoC RT3883 needs an extra 'periph' clock to properly set some peripherals that have this clock as their parent. When this driver was mainlined we could not find any active users of this SoC so we cannot perform any real tests for it. Now, one user of a Belkin f9k1109 version 1 device which uses this SoC appeared and reported some issues in openWRT: - https://github.com/openwrt/openwrt/issues/16054 The peripherals that are wrong are 'uart', 'i2c', 'i2s' and 'uartlite', which have an undefined 'periph' clock as parent. Hence, introduce it to have a properly working clock plan for this SoC. Fixes: `6f3b15586e` ("clk: ralink: add clock and reset driver for MTMIPS SoCs") Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Link: https://lore.kernel.org/r/20240910044024.120009-2-sergio.paracuellos@gmail.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Charles Han	72ea9a7e9e	clk: clk-apple-nco: Add NULL check in applnco_probe [ Upstream commit 969c765e2b508cca9099d246c010a1e48dcfd089 ] Add NULL check in applnco_probe, to handle kernel NULL pointer dereference error. Fixes: `6641057d5d` ("clk: clk-apple-nco: Add driver for Apple NCO") Signed-off-by: Charles Han <hanchunchao@inspur.com> Link: https://lore.kernel.org/r/20241114072820.3071-1-hanchunchao@inspur.com Reviewed-by: Martin Povišer <povik+lin@cutebit.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Patrisious Haddad	921fcf2971	RDMA/mlx5: Move events notifier registration to be after device registration [ Upstream commit ede132a5cf559f3ab35a4c28bac4f4a6c20334d8 ] Move pkey change work initialization and cleanup from device resources stage to notifier stage, since this is the stage which handles this work events. Fix a race between the device deregistration and pkey change work by moving MLX5_IB_STAGE_DEVICE_NOTIFIER to be after MLX5_IB_STAGE_IB_REG in order to ensure that the notifier is deregistered before the device during cleanup. Which ensures there are no works that are being executed after the device has already unregistered which can cause the panic below. BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 630071 Comm: kworker/1:2 Kdump: loaded Tainted: G W OE --------- --- 5.14.0-162.6.1.el9_1.x86_64 #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 02/27/2023 Workqueue: events pkey_change_handler [mlx5_ib] RIP: 0010:setup_qp+0x38/0x1f0 [mlx5_ib] Code: ee 41 54 45 31 e4 55 89 f5 53 48 89 fb 48 83 ec 20 8b 77 08 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 48 8b 07 48 8d 4c 24 16 <4c> 8b 38 49 8b 87 80 0b 00 00 4c 89 ff 48 8b 80 08 05 00 00 8b 40 RSP: 0018:ffffbcc54068be20 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff954054494128 RCX: ffffbcc54068be36 RDX: ffff954004934000 RSI: 0000000000000001 RDI: ffff954054494128 RBP: 0000000000000023 R08: ffff954001be2c20 R09: 0000000000000001 R10: ffff954001be2c20 R11: ffff9540260133c0 R12: 0000000000000000 R13: 0000000000000023 R14: 0000000000000000 R15: ffff9540ffcb0905 FS: 0000000000000000(0000) GS:ffff9540ffc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010625c001 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: 
mlx5_ib_gsi_pkey_change+0x20/0x40 [mlx5_ib] process_one_work+0x1e8/0x3c0 worker_thread+0x50/0x3b0 ? rescuer_thread+0x380/0x380 kthread+0x149/0x170 ? set_kthread_struct+0x50/0x50 ret_from_fork+0x22/0x30 Modules linked in: rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) mlx5_fwctl(OE) fwctl(OE) ib_uverbs(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlx_compat(OE) psample mlxfw(OE) tls knem(OE) netconsole nfsv3 nfs_acl nfs lockd grace fscache netfs qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common rapl hv_balloon hv_utils i2c_piix4 pcspkr joydev fuse ext4 mbcache jbd2 sr_mod sd_mod cdrom t10_pi sg ata_generic pci_hyperv pci_hyperv_intf hyperv_drm drm_shmem_helper drm_kms_helper hv_storvsc syscopyarea hv_netvsc sysfillrect sysimgblt hid_hyperv fb_sys_fops scsi_transport_fc hyperv_keyboard drm ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel hv_vmbus serio_raw [last unloaded: ib_core] CR2: 0000000000000000 ---[ end trace f6f8be4eae12f7bc ]--- Fixes: `7722f47e71` ("IB/mlx5: Create GSI transmission QPs when P_Key table is changed") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/d271ceeff0c08431b3cbbbb3e2d416f09b6d1621.1731496944.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:22 +01:00
Jianbo Liu	b6334d2356	IB/mlx5: Allocate resources just before first QP/SRQ is created [ Upstream commit 5895e70f2e6e8dc67b551ca554d6fcde0a7f0467 ] Previously, all IB dev resources are initialized on driver load. As they are not always used, move the initialization to the time when they are needed. To be more specific, move PD (p0) and CQ (c0) initialization to the time when the first SRQ is created, and move SRQs (s0 and s1) initialization to the time when the first QP is created. To avoid concurrent creations, two new mutexes are also added. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Link: https://lore.kernel.org/r/98c3e53a8cc0bdfeb6dec6e5bb8b037d78ab00d8.1717409369.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Stable-dep-of: ede132a5cf55 ("RDMA/mlx5: Move events notifier registration to be after device registration") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:21 +01:00
Zhen Lei	3dd9df8e5f	fbdev: sh7760fb: Fix a possible memory leak in sh7760fb_alloc_mem() [ Upstream commit f89d17ae2ac42931be2a0153fecbf8533280c927 ] When information such as info->screen_base is not ready, calling sh7760fb_free_mem() does not release memory correctly. Call dma_free_coherent() instead. Fixes: `4a25e41831` ("video: sh7760fb: SH7760/SH7763 LCDC framebuffer driver") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:21 +01:00
Zhang Zekun	1dd2d5630f	powerpc/kexec: Fix return of uninitialized variable [ Upstream commit 83b5a407fbb73e6965adfb4bd0a803724bf87f96 ] of_property_read_u64() can fail and leave the variable uninitialized, which will then be used. Return error if reading the property failed. Fixes: `2e6bd221d9` ("powerpc/kexec_file: Enable early kernel OPAL calls") Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20240930075628.125138-1-zhangzekun11@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:21 +01:00
Michal Suchanek	277ecc3d97	powerpc/sstep: make emulate_vsx_load and emulate_vsx_store static [ Upstream commit a26c4dbb3d9c1821cb0fc11cb2dbc32d5bf3463b ] These functions are not used outside of sstep.c Fixes: `350779a29f` ("powerpc: Handle most loads and stores in instruction emulation code") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20241001130356.14664-1-msuchanek@suse.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:21 +01:00
Gautam Menghani	d2f3414036	KVM: PPC: Book3S HV: Avoid returning to nested hypervisor on pending doorbells [ Upstream commit 26686db69917399fa30e3b3135360771e90f83ec ] Commit `6398326b9b` ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes") dropped the use of vcore->dpdes for msgsndp / SMT emulation. Prior to that commit, the below code at L1 level (see [1] for terminology) was responsible for setting vc->dpdes for the respective L2 vCPU: if (!nested) { kvmppc_core_prepare_to_enter(vcpu); if (vcpu->arch.doorbell_request) { vc->dpdes = 1; smp_wmb(); vcpu->arch.doorbell_request = 0; } L1 then sent vc->dpdes to L0 via kvmhv_save_hv_regs(), and while servicing H_ENTER_NESTED at L0, the below condition at L0 level made sure to abort and go back to L1 if vcpu->arch.doorbell_request = 1 so that L1 sets vc->dpdes as per above if condition: } else if (vcpu->arch.pending_exceptions || vcpu->arch.doorbell_request || xive_interrupt_pending(vcpu)) { vcpu->arch.ret = RESUME_HOST; goto out; } This worked fine since vcpu->arch.doorbell_request was used more like a flag and vc->dpdes was used to pass around the doorbell state. But after Commit `6398326b9b` ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes"), vcpu->arch.doorbell_request is the only variable used to pass around doorbell state. With the plumbing for handling doorbells for nested guests updated to use vcpu->arch.doorbell_request over vc->dpdes, the above "else if" stops doorbells from working correctly as L0 aborts execution of L2 and instead goes back to L1. Remove vcpu->arch.doorbell_request from the above "else if" condition as it is no longer needed for L0 to correctly handle the doorbell status while running L2. [1] Terminology 1. L0 : PowerNV linux running with HV privileges 2. L1 : Pseries KVM guest running on top of L0 3. 
L2 : Nested KVM guest running on top of L1 Fixes: `6398326b9b` ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes") Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20241109063301.105289-4-gautam@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:21 +01:00
Gautam Menghani	e7d134bd28	KVM: PPC: Book3S HV: Stop using vc->dpdes for nested KVM guests [ Upstream commit 0d3c6b28896f9889c8864dab469e0343a0ad1c0c ] commit `6398326b9b` ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes") introduced an optimization to use only vcpu->doorbell_request for SMT emulation for Power9 and above guests, but the code for nested guests still relies on the old way of handling doorbells, due to which an L2 guest (see [1]) cannot be booted with XICS with SMT>1. The command to repro this issue is: // To be run in L1 qemu-system-ppc64 \ -drive file=rhel.qcow2,format=qcow2 \ -m 20G \ -smp 8,cores=1,threads=8 \ -cpu host \ -nographic \ -machine pseries,ic-mode=xics -accel kvm Fix the plumbing to utilize vcpu->doorbell_request instead of vcore->dpdes for nested KVM guests on P9 and above. [1] Terminology 1. L0 : PowerNV linux running with HV privileges 2. L1 : Pseries KVM guest running on top of L0 3. L2 : Nested KVM guest running on top of L1 Fixes: `6398326b9b` ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes") Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20241109063301.105289-3-gautam@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:21 +01:00
Harshit Mogalapalli	a6faea503b	dax: delete a stale directory pmem [ Upstream commit b8e6d7ce50673c39514921ac61f7af00bbb58b87 ] After commit: `83762cb5c7` ("dax: Kill DEV_DAX_PMEM_COMPAT") the pmem/ directory is not needed anymore and Makefile changes were made accordingly in this commit, but there is a Makefile and pmem.c in pmem/ which are now stale and pmem.c is empty, remove them. Fixes: `83762cb5c7` ("dax: Kill DEV_DAX_PMEM_COMPAT") Suggested-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20241017101144.1654085-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-09 10:32:21 +01:00

1230126 Commits