commit bf202e654bfa57fb8cf9d93d4c6855890b70b9c4 upstream.
To address the performance drop issue, an optimization has been
implemented. The incorrect highest performance value previously set by the
low-level power firmware for AMD CPUs with Family ID 0x19 and Model ID
ranging from 0x70 to 0x7F series has been identified as the cause.
To resolve this, a check has been implemented to accurately determine the
CPU family and model ID. The correct highest performance value is now set
and the performance drop caused by the incorrect highest performance value
are eliminated.
Before the fix, the highest frequency was set to 4200MHz, now it is set
to 4971MHz which is correct.
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ MHZ
0 0 0 0 0:0:0:0 yes 4971.0000 400.0000 400.0000
1 0 0 0 0:0:0:0 yes 4971.0000 400.0000 400.0000
2 0 0 1 1:1:1:0 yes 4971.0000 400.0000 4865.8140
3 0 0 1 1:1:1:0 yes 4971.0000 400.0000 400.0000
Fixes: f3a052391822 ("cpufreq: amd-pstate: Enable amd-pstate preferred core support")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218759
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Gaha Bana <gahabana@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3a052391822b772b4e27f2594526cf1eb103cab upstream.
amd-pstate driver utilizes the functions and data structures
provided by the ITMT architecture to enable the scheduler to
favor scheduling on cores which can be get a higher frequency
with lower voltage. We call it amd-pstate preferrred core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.
The initial core rankings are set up by amd-pstate when the
system boots.
Add a variable hw_prefcore in cpudata structure. It will check
if the processor and power firmware support preferred core
feature.
Add one new early parameter `disable` to allow user to disable
the preferred core.
Only when hardware supports preferred core and user set `enabled`
in early parameter, amd pstate driver supports preferred core featue.
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eba2591d99d1f14a04c8a8a845ab0795b93f5646 upstream.
Instead of directly dereferencing page tables entries, which can cause
issues (see commit 20a004e7b0 ("arm64: mm: Use READ_ONCE/WRITE_ONCE when
accessing page tables"), let's introduce new functions to get the
pud/p4d/pgd entries (the pte and pmd versions already exist).
Note that arm pgd_t is actually an array so pgdp_get() is defined as a
macro to avoid a build error.
Those new functions will be used in subsequent commits by the riscv
architecture.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20231213203001.179237-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4e8771a3666c8f216eefd6bd2fd50121c6c437db ]
null-ptr-deref will occur when (req_op_level == SMB2_OPLOCK_LEVEL_LEASE)
and parse_lease_state() return NULL.
Fix this by check if 'lease_ctx_info' is NULL.
Additionally, remove the redundant parentheses in
parse_durable_handle_context().
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4bc0a264fb482b019c84fbc7202dd3cab059087 ]
The overflow/underflow conditions in pata_macio_qc_prep() should never
happen. But if they do there's no need to kill the system entirely, a
WARN and failing the IO request should be sufficient and might allow the
system to keep running.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 783bf5d09f86b9736605f3e01a3472e55ef98ff8 ]
Referring to the errata ERR051608 of I.MX93, LPSPI TCR[PRESCALE]
can only be configured to be 0 or 1, other values are not valid
and will cause LPSPI to not work.
Add the prescale limitation for LPSPI in I.MX93. Other platforms
are not affected.
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Link: https://patch.msgid.link/20240820070658.672127-1-carlos.song@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2f11c6f3e1fc60742673b8675c95b78447f3dae ]
If we need to increase the tree depth, allocate a new node, and then
race with another thread that increased the tree depth before us, we'll
still have a preallocated node that might be used later.
If we then use that node for a new non-root node, it'll still have a
pointer to the old root instead of being zeroed - fix this by zeroing it
in the cmpxchg failure path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b739dffa5d570b411d4bdf4bb9b8dfd6b7d72305 ]
When of_irq_parse_raw() is invoked with a device address smaller than
the interrupt parent node (from #address-cells property), KASAN detects
the following out-of-bounds read when populating the initial match table
(dyndbg="func of_irq_parse_* +p"):
OF: of_irq_parse_one: dev=/soc@0/picasso/watchdog, index=0
OF: parent=/soc@0/pci@878000000000/gpio0@17,0, intsize=2
OF: intspec=4
OF: of_irq_parse_raw: ipar=/soc@0/pci@878000000000/gpio0@17,0, size=2
OF: -> addrsize=3
==================================================================
BUG: KASAN: slab-out-of-bounds in of_irq_parse_raw+0x2b8/0x8d0
Read of size 4 at addr ffffff81beca5608 by task bash/764
CPU: 1 PID: 764 Comm: bash Tainted: G O 6.1.67-484c613561-nokia_sm_arm64 #1
Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2023.01-12.24.03-dirty 01/01/2023
Call trace:
dump_backtrace+0xdc/0x130
show_stack+0x1c/0x30
dump_stack_lvl+0x6c/0x84
print_report+0x150/0x448
kasan_report+0x98/0x140
__asan_load4+0x78/0xa0
of_irq_parse_raw+0x2b8/0x8d0
of_irq_parse_one+0x24c/0x270
parse_interrupts+0xc0/0x120
of_fwnode_add_links+0x100/0x2d0
fw_devlink_parse_fwtree+0x64/0xc0
device_add+0xb38/0xc30
of_device_add+0x64/0x90
of_platform_device_create_pdata+0xd0/0x170
of_platform_bus_create+0x244/0x600
of_platform_notify+0x1b0/0x254
blocking_notifier_call_chain+0x9c/0xd0
__of_changeset_entry_notify+0x1b8/0x230
__of_changeset_apply_notify+0x54/0xe4
of_overlay_fdt_apply+0xc04/0xd94
...
The buggy address belongs to the object at ffffff81beca5600
which belongs to the cache kmalloc-128 of size 128
The buggy address is located 8 bytes inside of
128-byte region [ffffff81beca5600, ffffff81beca5680)
The buggy address belongs to the physical page:
page:00000000230d3d03 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1beca4
head:00000000230d3d03 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x8000000000010200(slab|head|zone=2)
raw: 8000000000010200 0000000000000000 dead000000000122 ffffff810000c300
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffffff81beca5500: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffffff81beca5580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffffff81beca5600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffffff81beca5680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffffff81beca5700: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
==================================================================
OF: -> got it !
Prevent the out-of-bounds read by copying the device address into a
buffer of sufficient size.
Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>
Link: https://lore.kernel.org/r/20240812100652.3800963-1-stefan.wiehler@nokia.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 810ee43d9cd245d138a2733d87a24858a23f577d ]
Syzkiller reports a "KMSAN: uninit-value in pick_link" bug.
This is caused by an uninitialised page, which is ultimately caused
by a corrupted symbolic link size read from disk.
The reason why the corrupted symlink size causes an uninitialised
page is due to the following sequence of events:
1. squashfs_read_inode() is called to read the symbolic
link from disk. This assigns the corrupted value
3875536935 to inode->i_size.
2. Later squashfs_symlink_read_folio() is called, which assigns
this corrupted value to the length variable, which being a
signed int, overflows producing a negative number.
3. The following loop that fills in the page contents checks that
the copied bytes is less than length, which being negative means
the loop is skipped, producing an uninitialised page.
This patch adds a sanity check which checks that the symbolic
link size is not larger than expected.
--
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Link: https://lore.kernel.org/r/20240811232821.13903-1-phillip@squashfs.org.uk
Reported-by: Lizhi Xu <lizhi.xu@windriver.com>
Reported-by: syzbot+24ac24ff58dc5b0d26b9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000a90e8c061e86a76b@google.com/
V2: fix spelling mistake.
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5876b088ba03a62124266fa20d00e65533c7269 ]
ipheth_sndbulk_callback() can submit carrier_work
as a part of its error handling. That means that
the driver must make sure that the work is cancelled
after it has made sure that no more URB can terminate
with an error condition.
Hence the order of actions in ipheth_close() needs
to be inverted.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Foster Snowhill <forst@pen.gy>
Tested-by: Georgi Valkov <gvalkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8e947e9f64cac9df85a07672b658df5b2bcff07 ]
Some arch + compiler combinations report a potentially unused variable
location in btrfs_lookup_dentry(). This is a false alert as the variable
is passed by value and always valid or there's an error. The compilers
cannot probably reason about that although btrfs_inode_by_name() is in
the same file.
> + /kisskb/src/fs/btrfs/inode.c: error: 'location.objectid' may be used
+uninitialized in this function [-Werror=maybe-uninitialized]: => 5603:9
> + /kisskb/src/fs/btrfs/inode.c: error: 'location.type' may be used
+uninitialized in this function [-Werror=maybe-uninitialized]: => 5674:5
m68k-gcc8/m68k-allmodconfig
mips-gcc8/mips-allmodconfig
powerpc-gcc5/powerpc-all{mod,yes}config
powerpc-gcc5/ppc64_defconfig
Initialize it to zero, this should fix the warnings and won't change the
behaviour as btrfs_inode_by_name() accepts only a root or inode item
types, otherwise returns an error.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/linux-btrfs/bd4e9928-17b3-9257-8ba7-6b7f9bbb639a@linux-m68k.org/
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5127c42c77de18651aa9e8e0a3ced190103b449c ]
If the value of max_speed_hz is 0, it may cause a division by zero
error in hisi_calc_effective_speed().
The value of max_speed_hz is provided by firmware.
Firmware is generally considered as a trusted domain. However, as
division by zero errors can cause system failure, for defense measure,
the value of max_speed is validated here. So 0 is regarded as invalid
and an error code is returned.
Signed-off-by: Devyn Liu <liudingyuan@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Link: https://patch.msgid.link/20240730032040.3156393-3-liudingyuan@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 291e4baf70019f17a81b7b47aeb186b27d222159 ]
Even if a vgem device is configured in, we will skip the import_vgem_fd()
test almost every time.
TAP version 13
1..11
# Testing heap: system
# =======================================
# Testing allocation and importing:
ok 1 # SKIP Could not open vgem -1
The problem is that we use the DRM_IOCTL_VERSION ioctl to query the driver
version information but leave the name field a non-null-terminated string.
Terminate it properly to actually test against the vgem device.
While at it, let's check the length of the driver name is exactly 4 bytes
and return early otherwise (in case there is a name like "vgemfoo" that
gets converted to "vgem\0" unexpectedly).
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20240729024604.2046-1-yuzenghui@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9bc7501b0b90f4d0c34b97c14ff1f708ce7ad8f3 ]
According to I3C Spec 1.1.1, 11-Jun-2021, section: 5.1.2.2.3:
If the Controller chooses to start an I3C Message with an I3C Dynamic
Address, then special provisions shall be made because that same I3C Target
may be initiating an IBI or a Controller Role Request. So, one of three
things may happen: (skip 1, 2)
3. The Addresses match and the RnW bits also match, and so neither
Controller nor Target will ACK since both are expecting the other side to
provide ACK. As a result, each side might think it had "won" arbitration,
but neither side would continue, as each would subsequently see that the
other did not provide ACK.
...
For either value of RnW: Due to the NACK, the Controller shall defer the
Private Write or Private Read, and should typically transmit the Target
^^^^^^^^^^^^^^^^^^^
Address again after a Repeated START (i.e., the next one or any one prior
^^^^^^^^^^^^^
to a STOP in the Frame). Since the Address Header following a Repeated
START is not arbitrated, the Controller will always win (see Section
5.1.2.2.4).
Resend target address again if address is not 7E and controller get NACK.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3a5e3e872f3688ae0dc57bb78ca633921d96a91 ]
When using cachefiles, lockdep may emit something similar to the circular
locking dependency notice below. The problem appears to stem from the
following:
(1) Cachefiles manipulates xattrs on the files in its cache when called
from ->writepages().
(2) The setxattr() and removexattr() system call handlers get the name
(and value) from userspace after taking the sb_writers lock, putting
accesses of the vma->vm_lock and mm->mmap_lock inside of that.
(3) The afs filesystem uses a per-inode lock to prevent multiple
revalidation RPCs and in writeback vs truncate to prevent parallel
operations from deadlocking against the server on one side and local
page locks on the other.
Fix this by moving the getting of the name and value in {get,remove}xattr()
outside of the sb_writers lock. This also has the minor benefits that we
don't need to reget these in the event of a retry and we never try to take
the sb_writers lock in the event we can't pull the name and value into the
kernel.
Alternative approaches that might fix this include moving the dispatch of a
write to the cache off to a workqueue or trying to do without the
validation lock in afs. Note that this might also affect other filesystems
that use netfslib and/or cachefiles.
======================================================
WARNING: possible circular locking dependency detected
6.10.0-build2+ #956 Not tainted
------------------------------------------------------
fsstress/6050 is trying to acquire lock:
ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0
but task is already holding lock:
ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&vma->vm_lock->lock){++++}-{3:3}:
__lock_acquire+0xaf0/0xd80
lock_acquire.part.0+0x103/0x280
down_write+0x3b/0x50
vma_start_write+0x6b/0xa0
vma_link+0xcc/0x140
insert_vm_struct+0xb7/0xf0
alloc_bprm+0x2c1/0x390
kernel_execve+0x65/0x1a0
call_usermodehelper_exec_async+0x14d/0x190
ret_from_fork+0x24/0x40
ret_from_fork_asm+0x1a/0x30
-> #3 (&mm->mmap_lock){++++}-{3:3}:
__lock_acquire+0xaf0/0xd80
lock_acquire.part.0+0x103/0x280
__might_fault+0x7c/0xb0
strncpy_from_user+0x25/0x160
removexattr+0x7f/0x100
__do_sys_fremovexattr+0x7e/0xb0
do_syscall_64+0x9f/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #2 (sb_writers#14){.+.+}-{0:0}:
__lock_acquire+0xaf0/0xd80
lock_acquire.part.0+0x103/0x280
percpu_down_read+0x3c/0x90
vfs_iocb_iter_write+0xe9/0x1d0
__cachefiles_write+0x367/0x430
cachefiles_issue_write+0x299/0x2f0
netfs_advance_write+0x117/0x140
netfs_write_folio.isra.0+0x5ca/0x6e0
netfs_writepages+0x230/0x2f0
afs_writepages+0x4d/0x70
do_writepages+0x1e8/0x3e0
filemap_fdatawrite_wbc+0x84/0xa0
__filemap_fdatawrite_range+0xa8/0xf0
file_write_and_wait_range+0x59/0x90
afs_release+0x10f/0x270
__fput+0x25f/0x3d0
__do_sys_close+0x43/0x70
do_syscall_64+0x9f/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (&vnode->validate_lock){++++}-{3:3}:
__lock_acquire+0xaf0/0xd80
lock_acquire.part.0+0x103/0x280
down_read+0x95/0x200
afs_writepages+0x37/0x70
do_writepages+0x1e8/0x3e0
filemap_fdatawrite_wbc+0x84/0xa0
filemap_invalidate_inode+0x167/0x1e0
netfs_unbuffered_write_iter+0x1bd/0x2d0
vfs_write+0x22e/0x320
ksys_write+0xbc/0x130
do_syscall_64+0x9f/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #0 (mapping.invalidate_lock#3){++++}-{3:3}:
check_noncircular+0x119/0x160
check_prev_add+0x195/0x430
__lock_acquire+0xaf0/0xd80
lock_acquire.part.0+0x103/0x280
down_read+0x95/0x200
filemap_fault+0x26e/0x8b0
__do_fault+0x57/0xd0
do_pte_missing+0x23b/0x320
__handle_mm_fault+0x2d4/0x320
handle_mm_fault+0x14f/0x260
do_user_addr_fault+0x2a2/0x500
exc_page_fault+0x71/0x90
asm_exc_page_fault+0x22/0x30
other info that might help us debug this:
Chain exists of:
mapping.invalidate_lock#3 --> &mm->mmap_lock --> &vma->vm_lock->lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
rlock(&vma->vm_lock->lock);
lock(&mm->mmap_lock);
lock(&vma->vm_lock->lock);
rlock(mapping.invalidate_lock#3);
*** DEADLOCK ***
1 lock held by fsstress/6050:
#0: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250
stack backtrace:
CPU: 0 PID: 6050 Comm: fsstress Not tainted 6.10.0-build2+ #956
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x80
check_noncircular+0x119/0x160
? queued_spin_lock_slowpath+0x4be/0x510
? __pfx_check_noncircular+0x10/0x10
? __pfx_queued_spin_lock_slowpath+0x10/0x10
? mark_lock+0x47/0x160
? init_chain_block+0x9c/0xc0
? add_chain_block+0x84/0xf0
check_prev_add+0x195/0x430
__lock_acquire+0xaf0/0xd80
? __pfx___lock_acquire+0x10/0x10
? __lock_release.isra.0+0x13b/0x230
lock_acquire.part.0+0x103/0x280
? filemap_fault+0x26e/0x8b0
? __pfx_lock_acquire.part.0+0x10/0x10
? rcu_is_watching+0x34/0x60
? lock_acquire+0xd7/0x120
down_read+0x95/0x200
? filemap_fault+0x26e/0x8b0
? __pfx_down_read+0x10/0x10
? __filemap_get_folio+0x25/0x1a0
filemap_fault+0x26e/0x8b0
? __pfx_filemap_fault+0x10/0x10
? find_held_lock+0x7c/0x90
? __pfx___lock_release.isra.0+0x10/0x10
? __pte_offset_map+0x99/0x110
__do_fault+0x57/0xd0
do_pte_missing+0x23b/0x320
__handle_mm_fault+0x2d4/0x320
? __pfx___handle_mm_fault+0x10/0x10
handle_mm_fault+0x14f/0x260
do_user_addr_fault+0x2a2/0x500
exc_page_fault+0x71/0x90
asm_exc_page_fault+0x22/0x30
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/2136178.1721725194@warthog.procyon.org.uk
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Christian Brauner <brauner@kernel.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: Gao Xiang <xiang@kernel.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-erofs@lists.ozlabs.org
cc: linux-fsdevel@vger.kernel.org
[brauner: fix minor issues]
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 542440fd7b30983cae23e32bd22f69a076ec7ef4 ]
With gcc-14.1, there is a false-postive -Wuninitialized warning in
regcache_maple_drop:
drivers/base/regmap/regcache-maple.c: In function 'regcache_maple_drop':
drivers/base/regmap/regcache-maple.c:113:23: error: 'lower_index' is used uninitialized [-Werror=uninitialized]
113 | unsigned long lower_index, lower_last;
| ^~~~~~~~~~~
drivers/base/regmap/regcache-maple.c:113:36: error: 'lower_last' is used uninitialized [-Werror=uninitialized]
113 | unsigned long lower_index, lower_last;
| ^~~~~~~~~~
I've created a reduced test case to see if this needs to be reported
as a gcc, but it appears that the gcc-14.x branch already has a change
that turns this into a more sensible -Wmaybe-uninitialized warning, so
I ended up not reporting it so far.
The reduced test case also produces a warning for gcc-13 and gcc-12
but I don't see that with the version in the kernel.
Link: https://godbolt.org/z/oKbohKqd3
Link: https://lore.kernel.org/all/CAMuHMdWj=FLmkazPbYKPevDrcym2_HDb_U7Mb9YE9ovrP0jJfA@mail.gmail.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20240719104030.1382465-1-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0124fbb4c6dba23dbdf80c829be68adbccde2722 ]
fw_arg1 is in memory space rather than I/O space, so we should use
early_memremap_ro() instead of early_ioremap() to map the cmdline.
Moreover, we should unmap it after using.
Suggested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 555a05d84ca2c587e2d4777006e2c2fb3dfbd91d ]
The dpaa-eth driver is written for PowerPC and Arm SoCs which have 1-24
CPUs. It depends on CONFIG_NR_CPUS having a reasonably small value in
Kconfig. Otherwise, there are 2 functions which allocate on-stack arrays
of NR_CPUS elements, and these can quickly explode in size, leading to
warnings such as:
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:3280:12: warning:
stack frame size (16664) exceeds limit (2048) in 'dpaa_eth_probe' [-Wframe-larger-than]
The problem is twofold:
- Reducing the array size to the boot-time num_possible_cpus() (rather
than the compile-time NR_CPUS) creates a variable-length array,
which should be avoided in the Linux kernel.
- Using NR_CPUS as an array size makes the driver blow up in stack
consumption with generic, as opposed to hand-crafted, .config files.
A simple solution is to use dynamic allocation for num_possible_cpus()
elements (aka a small number determined at runtime).
Link: https://lore.kernel.org/all/202406261920.l5pzM1rj-lkp@intel.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20240713225336.1746343-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23e89e8ee7be73e21200947885a6d3a109a2c58d ]
RFC 9293 states that in the case of simultaneous connect(), the connection
gets established when SYN+ACK is received. [0]
TCP Peer A TCP Peer B
1. CLOSED CLOSED
2. SYN-SENT --> <SEQ=100><CTL=SYN> ...
3. SYN-RECEIVED <-- <SEQ=300><CTL=SYN> <-- SYN-SENT
4. ... <SEQ=100><CTL=SYN> --> SYN-RECEIVED
5. SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...
6. ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED
7. ... <SEQ=100><ACK=301><CTL=SYN,ACK> --> ESTABLISHED
However, since commit 0c24604b68 ("tcp: implement RFC 5961 4.2"), such a
SYN+ACK is dropped in tcp_validate_incoming() and responded with Challenge
ACK.
For example, the write() syscall in the following packetdrill script fails
with -EAGAIN, and wrong SNMP stats get incremented.
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <mss 1460,sackOK,TS val 1000 ecr 0,nop,wscale 8>
+0 < S 0:0(0) win 1000 <mss 1000>
+0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 3308134035 ecr 0,nop,wscale 8>
+0 < S. 0:0(0) ack 1 win 1000
+0 write(3, ..., 100) = 100
+0 > P. 1:101(100) ack 1
--
# packetdrill cross-synack.pkt
cross-synack.pkt:13: runtime error in write call: Expected result 100 but got -1 with errno 11 (Resource temporarily unavailable)
# nstat
...
TcpExtTCPChallengeACK 1 0.0
TcpExtTCPSYNChallenge 1 0.0
The problem is that bpf_skops_established() is triggered by the Challenge
ACK instead of SYN+ACK. This causes the bpf prog to miss the chance to
check if the peer supports a TCP option that is expected to be exchanged
in SYN and SYN+ACK.
Let's accept a bare SYN+ACK for active-open TCP_SYN_RECV sockets to avoid
such a situation.
Note that tcp_ack_snd_check() in tcp_rcv_state_process() is skipped not to
send an unnecessary ACK, but this could be a bit risky for net.git, so this
targets for net-next.
Link: https://www.rfc-editor.org/rfc/rfc9293.html#section-3.5-7 [0]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240710171246.87533-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a4e772898f8bf2e7e1cf661a12c60a5612c4afab ]
One of the true positives that the cfg_access_lock lockdep effort
identified is this sequence:
WARNING: CPU: 14 PID: 1 at drivers/pci/pci.c:4886 pci_bridge_secondary_bus_reset+0x5d/0x70
RIP: 0010:pci_bridge_secondary_bus_reset+0x5d/0x70
Call Trace:
<TASK>
? __warn+0x8c/0x190
? pci_bridge_secondary_bus_reset+0x5d/0x70
? report_bug+0x1f8/0x200
? handle_bug+0x3c/0x70
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? pci_bridge_secondary_bus_reset+0x5d/0x70
pci_reset_bus+0x1d8/0x270
vmd_probe+0x778/0xa10
pci_device_probe+0x95/0x120
Where pci_reset_bus() users are triggering unlocked secondary bus resets.
Ironically pci_bus_reset(), several calls down from pci_reset_bus(), uses
pci_bus_lock() before issuing the reset which locks everything *but* the
bridge itself.
For the same motivation as adding:
bridge = pci_upstream_bridge(dev);
if (bridge)
pci_dev_lock(bridge);
to pci_reset_function() for the "bus" and "cxl_bus" reset cases, add
pci_dev_lock() for @bus->self to pci_bus_lock().
Link: https://lore.kernel.org/r/171711747501.1628941.15217746952476635316.stgit@dwillia2-xfh.jf.intel.com
Reported-by: Imre Deak <imre.deak@intel.com>
Closes: http://lore.kernel.org/r/6657833b3b5ae_14984b29437@dwillia2-xfh.jf.intel.com.notmuch
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: squash in recursive locking deadlock fix from Keith Busch:
https://lore.kernel.org/r/20240711193650.701834-1-kbusch@meta.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ad8735994b854b23c824dd6b1dd2126e893a3b4 ]
The exception vector of the booting hart is not set before enabling
the mmu and then still points to the value of the previous firmware,
typically _start. That makes it hard to debug setup_vm() when bad
things happen. So fix that by setting the exception vector earlier.
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: yang.zhang <yang.zhang@hexintek.com>
Link: https://lore.kernel.org/r/20240508022445.6131-1-gaoshanliukou@163.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 82a3e3a235633aa0575fac9507d648dd80f3437f ]
When a root decoder is configured the interleave target list is read
from the BIOS populated CFMWS structure. Per the CXL spec 3.1 Table
9-22 the target list is in interleave order. The CXL driver populates
its decoder target list in the same order and stores it in 'struct
cxl_switch_decoder' field "@target: active ordered target list in
current decoder configuration"
Given the promise of an ordered list, the driver can stop duplicating
the work of BIOS and simply check target positions against the ordered
list during region configuration.
The simplified check against the ordered list is presented here.
A follow-on patch will remove the unused code.
For Modulo arithmetic this is not a fix, only a simplification.
For XOR arithmetic this is a fix for HB IW of 3,6,12.
Fixes: f9db85bfec ("cxl/acpi: Support CXL XOR Interleave Math (CXIMS)")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/35d08d3aba08fee0f9b86ab1cef0c25116ca8a55.1719980933.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b56329a782314fde5b61058e2a25097af7ccb675 ]
Instead of a BUG_ON() just return an error, log an error message and
abort the transaction in case we find an extent buffer belonging to the
relocation tree that doesn't have the full backref flag set. This is
unexpected and should never happen (save for bugs or a potential bad
memory).
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8ccef048354074a548f108e51d0557d6adfd3a3 ]
In reada we BUG_ON(refs == 0), which could be unkind since we aren't
holding a lock on the extent leaf and thus could get a transient
incorrect answer. In walk_down_proc we also BUG_ON(refs == 0), which
could happen if we have extent tree corruption. Change that to return
-EUCLEAN. In do_walk_down() we catch this case and handle it correctly,
however we return -EIO, which -EUCLEAN is a more appropriate error code.
Finally in walk_up_proc we have the same BUG_ON(refs == 0), so convert
that to proper error handling. Also adjust the error message so we can
actually do something with the information.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1f9d44c0a12730a24f8bb75c5e1102207413cc9b ]
We have a couple of areas where we check to make sure the tree block is
locked before looking up or messing with references. This is old code
so it has this as BUG_ON(). Convert this to ASSERT() for developers.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 744375343662058cbfda96d871786e5a5cbe1947 ]
Mark ntfs dirty in this case.
Rename ntfs_filldir to ntfs_dir_emit.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 77aeb1b685f9db73d276bad4bb30d48505a6fd23 ]
For CONFIG_DEBUG_OBJECTS_WORK=y kernels sscs.work defined by
INIT_WORK_ONSTACK() is initialized by debug_object_init_on_stack() for
the debug check in __init_work() to work correctly.
But this lacks the counterpart to remove the tracked object from debug
objects again, which will cause a debug object warning once the stack is
freed.
Add the missing destroy_work_on_stack() invocation to cure that.
[ tglx: Massaged changelog ]
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20240704065213.13559-1-qiang.zhang1211@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 320debca1ba3a81c87247eac84eff976ead09ee0 ]
A gang submit won't work if the VMID is reserved and we can't flush out
VM changes from multiple engines at the same time.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 54624acf8843375a6de3717ac18df3b5104c39c5 ]
The test thread will start N benchmark kthreads and then schedule out
until the test time finished and notify the benchmark kthreads to stop.
The benchmark kthreads will keep running until notified to stop.
There's a problem with current implementation when the benchmark
kthreads number is equal to the CPUs on a non-preemptible kernel:
since the scheduler will balance the kthreads across the CPUs and
when the test time's out the test thread won't get a chance to be
scheduled on any CPU then cannot notify the benchmark kthreads to stop.
This can be easily reproduced on a VM (simulated with 16 CPUs) with
PREEMPT_VOLUNTARY:
estuary:/mnt$ ./dma_map_benchmark -t 16 -s 1
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 10-...!: (5221 ticks this GP) idle=ed24/1/0x4000000000000000 softirq=142/142 fqs=0
rcu: (t=5254 jiffies g=-559 q=45 ncpus=16)
rcu: rcu_sched kthread starved for 5255 jiffies! g-559 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=12
rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_sched state:R running task stack:0 pid:16 tgid:16 ppid:2 flags:0x00000008
Call trace
__switch_to+0xec/0x138
__schedule+0x2f8/0x1080
schedule+0x30/0x130
schedule_timeout+0xa0/0x188
rcu_gp_fqs_loop+0x128/0x528
rcu_gp_kthread+0x1c8/0x208
kthread+0xec/0xf8
ret_from_fork+0x10/0x20
Sending NMI from CPU 10 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 332 Comm: dma-map-benchma Not tainted 6.10.0-rc1-vanilla-LSE #8
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : arm_smmu_cmdq_issue_cmdlist+0x218/0x730
lr : arm_smmu_cmdq_issue_cmdlist+0x488/0x730
sp : ffff80008748b630
x29: ffff80008748b630 x28: 0000000000000000 x27: ffff80008748b780
x26: 0000000000000000 x25: 000000000000bc70 x24: 000000000001bc70
x23: ffff0000c12af080 x22: 0000000000010000 x21: 000000000000ffff
x20: ffff80008748b700 x19: ffff0000c12af0c0 x18: 0000000000010000
x17: 0000000000000001 x16: 0000000000000040 x15: ffffffffffffffff
x14: 0001ffffffffffff x13: 000000000000ffff x12: 00000000000002f1
x11: 000000000001ffff x10: 0000000000000031 x9 : ffff800080b6b0b8
x8 : ffff0000c2a48000 x7 : 000000000001bc71 x6 : 0001800000000000
x5 : 00000000000002f1 x4 : 01ffffffffffffff x3 : 000000000009aaf1
x2 : 0000000000000018 x1 : 000000000000000f x0 : ffff0000c12af18c
Call trace:
arm_smmu_cmdq_issue_cmdlist+0x218/0x730
__arm_smmu_tlb_inv_range+0xe0/0x1a8
arm_smmu_iotlb_sync+0xc0/0x128
__iommu_dma_unmap+0x248/0x320
iommu_dma_unmap_page+0x5c/0xe8
dma_unmap_page_attrs+0x38/0x1d0
map_benchmark_thread+0x118/0x2c0
kthread+0xec/0xf8
ret_from_fork+0x10/0x20
Solve this by adding scheduling point in the kthread loop,
so if there're other threads in the system they may have
a chance to run, especially the thread to notify the test
end. However this may degrade the test concurrency so it's
recommended to run this on an idle system.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Barry Song <baohua@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0bab8db4152c4a2185a1367db09cc402bdc62d5e ]
We encountered a problem that the file system could not be mounted in
the power-off scenario. The analysis of the file system mirror shows that
only part of the data is written to the last commit block.
The valid data of the commit block is concentrated in the first sector.
However, the data of the entire block is involved in the checksum calculation.
For different hardware, the minimum atomic unit may be different.
If the checksum of a committed block is incorrect, clear the data except the
'commit_header' and then calculate the checksum. If the checkusm is correct,
it is considered that the block is partially committed, Then continue to replay
journal.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240620072405.3533701-1-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 33f23fc3155b13c4a96d94a0a22dc26db767440b ]
[Why]
If VF request full GPU access and the request failed,
the VF driver can get stuck accessing registers for an extended period during
the unload of KMS.
[How]
Set no_hw_access flag when VF request for full GPU access fails
This prevents further hardware access attempts, avoiding the prolonged
stuck state.
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cedc12c5b57f7efa6dbebfb2b140e8675f5a2616 ]
In the current state, an erroneous call to
bpf_object__find_map_by_name(NULL, ...) leads to a segmentation
fault through the following call chain:
bpf_object__find_map_by_name(obj = NULL, ...)
-> bpf_object__for_each_map(pos, obj = NULL)
-> bpf_object__next_map((obj = NULL), NULL)
-> return (obj = NULL)->maps
While calling bpf_object__find_map_by_name with obj = NULL is
obviously incorrect, this should not lead to a segmentation
fault but rather be handled gracefully.
As __bpf_map__iter already handles this situation correctly, we
can delegate the check for the regular case there and only add
a check in case the prev or next parameter is NULL.
Signed-off-by: Andreas Ziegler <ziegler.andreas@siemens.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c1de37969b7bc0abcb20b86e91e70caebbd4f89 ]
DIV_ROUND_CLOSEST() after kstrtol() results in an underflow if a large
negative number such as -9223372036854775808 is provided by the user.
Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0403e10bf0824bf0ec2bb135d4cf1c0cc3bf4bf0 ]
DIV_ROUND_CLOSEST() after kstrtol() results in an underflow if a large
negative number such as -9223372036854775808 is provided by the user.
Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit af64e3e1537896337405f880c1e9ac1f8c0c6198 ]
DIV_ROUND_CLOSEST() after kstrtol() results in an underflow if a large
negative number such as -9223372036854775808 is provided by the user.
Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>