This reverts commit 487a32ec24.
should_skip_kasan_poison() reads the PG_skip_kasan_poison flag from
page->flags. However, this line of code in free_pages_prepare():
page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
clears most of page->flags, including PG_skip_kasan_poison, before calling
should_skip_kasan_poison(), which meant that it would never return true as
a result of the page flag being set. Therefore, fix the code to call
should_skip_kasan_poison() before clearing the flags, as we were doing
before the reverted patch.
This fixes a measurable performance regression introduced in the reverted
commit, where munmap() takes longer than intended if HW tags KASAN is
supported and enabled at runtime. Without this patch, we see a
single-digit percentage performance regression in a particular
mmap()-heavy benchmark when enabling HW tags KASAN, and with the patch,
there is no statistically significant performance impact when enabling HW
tags KASAN.
Link: https://lkml.kernel.org/r/20230310042914.3805818-2-pcc@google.com
Fixes: 487a32ec24 ("kasan: drop skip_kasan_poison variable in free_pages_prepare")
Link: https://linux-review.googlesource.com/id/Ic4f13affeebd20548758438bb9ed9ca40e312b79
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org> [6.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The ioctl helper function nilfs_ioctl_wrap_copy(), which exchanges a
metadata array to/from user space, may copy uninitialized buffer regions
to user space memory for read-only ioctl commands NILFS_IOCTL_GET_SUINFO
and NILFS_IOCTL_GET_CPINFO.
This can occur when the element size of the user space metadata given by
the v_size member of the argument nilfs_argv structure is larger than the
size of the metadata element (nilfs_suinfo structure or nilfs_cpinfo
structure) on the file system side.
KMSAN-enabled kernels detect this issue as follows:
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user
include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_user+0xc0/0x100 lib/usercopy.c:33
instrument_copy_to_user include/linux/instrumented.h:121 [inline]
_copy_to_user+0xc0/0x100 lib/usercopy.c:33
copy_to_user include/linux/uaccess.h:169 [inline]
nilfs_ioctl_wrap_copy+0x6fa/0xc10 fs/nilfs2/ioctl.c:99
nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline]
nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290
nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343
__do_compat_sys_ioctl fs/ioctl.c:968 [inline]
__se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910
__ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
entry_SYSENTER_compat_after_hwframe+0x70/0x82
Uninit was created at:
__alloc_pages+0x9f6/0xe90 mm/page_alloc.c:5572
alloc_pages+0xab0/0xd80 mm/mempolicy.c:2287
__get_free_pages+0x34/0xc0 mm/page_alloc.c:5599
nilfs_ioctl_wrap_copy+0x223/0xc10 fs/nilfs2/ioctl.c:74
nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline]
nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290
nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343
__do_compat_sys_ioctl fs/ioctl.c:968 [inline]
__se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910
__ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
entry_SYSENTER_compat_after_hwframe+0x70/0x82
Bytes 16-127 of 3968 are uninitialized
...
This eliminates the leak issue by initializing the page allocated as
buffer using get_zeroed_page().
Link: https://lkml.kernel.org/r/20230307085548.6290-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+132fdd2f1e1805fdc591@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/000000000000a5bd2d05f63f04ae@google.com
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Fix mas_skip_node() for mas_empty_area()", v2.
mas_empty_area() was incorrectly returning an error when there was room.
The issue was tracked down to mas_skip_node() using the incorrect
end-of-slot count. Instead of using the nodes hard limit, the limit of
data should be used.
mas_skip_node() was also setting the min and max to that of the child
node, which was unnecessary. Within these limits being set, there was
also a bug that corrupted the maple state's max if the offset was set to
the maximum node pivot. The bug was without consequence unless there was
a sufficient gap in the next child node which would cause an error to be
returned.
This patch set fixes these errors by removing the limit setting from
mas_skip_node() and uses the mas_data_end() for slot limits, and adds
tests for all failures discovered.
This patch (of 2):
mas_skip_node() is used to move the maple state to the node with a higher
limit. It does this by walking up the tree and increasing the slot count.
Since slot count may not be able to be increased, it may need to walk up
multiple times to find room to walk right to a higher limit node. The
limit of slots that was being used was the node limit and not the last
location of data in the node. This would cause the maple state to be
shifted outside actual data and enter an error state, thus returning
-EBUSY.
The result of the incorrect error state means that mas_awalk() would
return an error instead of finding the allocation space.
The fix is to use mas_data_end() in mas_skip_node() to detect the nodes
data end point and continue walking the tree up until it is safe to move
to a node with a higher limit.
The walk up the tree also sets the maple state limits so remove the buggy
code from mas_skip_node(). Setting the limits had the unfortunate side
effect of triggering another bug if the parent node was full and the there
was no suitable gap in the second last child, but room in the next child.
mas_skip_node() may also be passed a maple state in an error state from
mas_anode_descend() when no allocations are available. Return on such an
error state immediately.
Link: https://lkml.kernel.org/r/20230307180247.2220303-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20230307180247.2220303-2-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Snild Dolkow <snild@sony.com>
Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
Tested-by: Snild Dolkow <snild@sony.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Gao Xiang has reported that the page allocator complains about high order
__GFP_NOFAIL request coming from the vmalloc core:
__alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286
vm_area_alloc_pages mm/vmalloc.c:2989 [inline]
__vmalloc_area_node mm/vmalloc.c:3057 [inline]
__vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227
kvmalloc_node+0x156/0x1a0 mm/util.c:606
kvmalloc include/linux/slab.h:737 [inline]
kvmalloc_array include/linux/slab.h:755 [inline]
kvcalloc include/linux/slab.h:760 [inline]
it seems that I have completely missed high order allocation backing
vmalloc areas case when implementing __GFP_NOFAIL support. This means
that [k]vmalloc at al. can allocate higher order allocations with
__GFP_NOFAIL which can trigger OOM killer for non-costly orders easily or
cause a lot of reclaim/compaction activity if those requests cannot be
satisfied.
Fix the issue by falling back to zero order allocations for __GFP_NOFAIL
requests if the high order request fails.
Link: https://lkml.kernel.org/r/ZAXynvdNqcI0f6Us@dhcp22.suse.cz
Fixes: 9376130c39 ("mm/vmalloc: add support for __GFP_NOFAIL")
Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lkml.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Daniel Borkmann says:
====================
pull-request: bpf 2023-03-23
We've added 8 non-merge commits during the last 13 day(s) which contain
a total of 21 files changed, 238 insertions(+), 161 deletions(-).
The main changes are:
1) Fix verification issues in some BPF programs due to their stack usage
patterns, from Eduard Zingerman.
2) Fix to add missing overflow checks in xdp_umem_reg and return an error
in such case, from Kal Conley.
3) Fix and undo poisoning of strlcpy in libbpf given it broke builds for
libcs which provided the former like uClibc-ng, from Jesus Sanchez-Palencia.
4) Fix insufficient bpf_jit_limit default to avoid users running into hard
to debug seccomp BPF errors, from Daniel Borkmann.
5) Fix driver return code when they don't support a bpf_xdp_metadata kfunc
to make it unambiguous from other errors, from Jesper Dangaard Brouer.
6) Two BPF selftest fixes to address compilation errors from recent changes
in kernel structures, from Alexei Starovoitov.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
xdp: bpf_xdp_metadata use EOPNOTSUPP for no driver support
bpf: Adjust insufficient default bpf_jit_limit
xsk: Add missing overflow check in xdp_umem_reg
selftests/bpf: Fix progs/test_deny_namespace.c issues.
selftests/bpf: Fix progs/find_vma_fail1.c build error.
libbpf: Revert poisoning of strlcpy
selftests/bpf: Tests for uninitialized stack reads
bpf: Allow reads from uninit stack
====================
Link: https://lore.kernel.org/r/20230323225221.6082-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix MGMT add advmon with RSSI command
- L2CAP: Fix responding with wrong PDU type
- Fix race condition in hci_cmd_sync_clear
- ISO: Fix timestamped HCI ISO data packet parsing
- HCI: Fix global-out-of-bounds
- hci_sync: Resume adv with no RPA when active scan
* tag 'for-net-2023-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: HCI: Fix global-out-of-bounds
Bluetooth: mgmt: Fix MGMT add advmon with RSSI command
Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work
Bluetooth: L2CAP: Fix responding with wrong PDU type
Bluetooth: btqcomsmd: Fix command timeout after setting BD address
Bluetooth: btinel: Check ACPI handle for NULL before accessing
Bluetooth: Remove "Power-on" check from Mesh feature
Bluetooth: Fix race condition in hci_cmd_sync_clear
Bluetooth: btintel: Iterate only bluetooth device ACPI entries
Bluetooth: ISO: fix timestamped HCI ISO data packet parsing
Bluetooth: btusb: Remove detection of ISO packets over bulk
Bluetooth: hci_core: Detect if an ACL packet is in fact an ISO packet
Bluetooth: hci_sync: Resume adv with no RPA when active scan
====================
Link: https://lore.kernel.org/r/20230323202335.3380841-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kalle Valo says:
====================
wireless fixes for v6.3
Third set of fixes for v6.3. mt76 has two kernel crash fixes and
adding back 160 MHz channel support for mt7915. mac80211 has fixes for
a race in transmit path and two mesh related fixes. iwlwifi also has
fixes for races.
* tag 'wireless-2023-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: mac80211: fix mesh path discovery based on unicast packets
wifi: mac80211: fix qos on mesh interfaces
wifi: iwlwifi: mvm: protect TXQ list manipulation
wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
wifi: mac80211: Serialize ieee80211_handle_wake_tx_queue()
wifi: mwifiex: mark OF related data as maybe unused
wifi: mt76: connac: do not check WED status for non-mmio devices
wifi: mt76: mt7915: add back 160MHz channel width support for MT7915
wifi: mt76: do not run mt76_unregister_device() on unregistered hw
====================
Link: https://lore.kernel.org/r/20230323110332.C4FE4C433D2@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull gfs2 fix from Andreas Gruenbacher:
- Reinstate commit 970343cd49 ("GFS2: free disk inode which is
deleted by remote node -V2") as reverting that commit could cause
gfs2_put_super() to hang.
* tag 'gfs2-v6.3-rc3-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
Reinstate "GFS2: free disk inode which is deleted by remote node -V2"
The MGMT command: MGMT_OP_ADD_ADV_PATTERNS_MONITOR_RSSI uses variable
length argument. This causes host not able to register advmon with rssi.
This patch has been locally tested by adding monitor with rssi via
btmgmt on a kernel 6.1 machine.
Reviewed-by: Archie Pusaka <apusaka@chromium.org>
Fixes: b338d91703 ("Bluetooth: Implement support for Mesh")
Signed-off-by: Howard Chung <howardchung@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
In btsdio_probe, &data->work was bound with btsdio_work.In
btsdio_send_frame, it was started by schedule_work.
If we call btsdio_remove with an unfinished job, there may
be a race condition and cause UAF bug on hdev.
Fixes: ddbaf13e36 ("[Bluetooth] Add generic driver for Bluetooth SDIO devices")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
L2CAP_ECRED_CONN_REQ shall be responded with L2CAP_ECRED_CONN_RSP not
L2CAP_LE_CONN_RSP:
L2CAP LE EATT Server - Reject - run
Listening for connections
New client connection with handle 0x002a
Sending L2CAP Request from client
Client received response code 0x15
Unexpected L2CAP response code (expected 0x18)
L2CAP LE EATT Server - Reject - test failed
> ACL Data RX: Handle 42 flags 0x02 dlen 26
LE L2CAP: Enhanced Credit Connection Request (0x17) ident 1 len 18
PSM: 39 (0x0027)
MTU: 64
MPS: 64
Credits: 5
Source CID: 65
Source CID: 66
Source CID: 67
Source CID: 68
Source CID: 69
< ACL Data TX: Handle 42 flags 0x00 dlen 16
LE L2CAP: LE Connection Response (0x15) ident 1 len 8
invalid size
00 00 00 00 00 00 06 00
L2CAP LE EATT Server - Reject - run
Listening for connections
New client connection with handle 0x002a
Sending L2CAP Request from client
Client received response code 0x18
L2CAP LE EATT Server - Reject - test passed
Fixes: 15f02b9105 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
On most devices using the btqcomsmd driver (e.g. the DragonBoard 410c
and other devices based on the Qualcomm MSM8916/MSM8909/... SoCs)
the Bluetooth firmware seems to become unresponsive for a while after
setting the BD address. On recent kernel versions (at least 5.17+)
this often causes timeouts for subsequent commands, e.g. the HCI reset
sent by the Bluetooth core during initialization:
Bluetooth: hci0: Opcode 0x c03 failed: -110
Unfortunately this behavior does not seem to be documented anywhere.
Experimentation suggests that the minimum necessary delay to avoid
the problem is ~150us. However, to be sure add a sleep for > 1ms
in case it is a bit longer on other firmware versions.
Older kernel versions are likely also affected, although perhaps with
slightly different errors or less probability. Side effects can easily
hide the issue in most cases, e.g. unrelated incoming interrupts that
cause the necessary delay.
Fixes: 1511cc750c ("Bluetooth: Introduce Qualcomm WCNSS SMD based HCI driver")
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
There are two related issues that appear in certain combinations with
clang and GNU binutils.
The first occurs when a version of clang that supports zicsr or zifencei
via '-march=' [1] (i.e, >= 17.x) is used in combination with a version
of GNU binutils that do not recognize zicsr and zifencei in the
'-march=' value (i.e., < 2.36):
riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicsr2p0_zifencei2p0: Invalid or unknown z ISA extension: 'zifencei'
riscv64-linux-gnu-ld: failed to merge target specific data of file fs/efivarfs/file.o
riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicsr2p0_zifencei2p0: Invalid or unknown z ISA extension: 'zifencei'
riscv64-linux-gnu-ld: failed to merge target specific data of file fs/efivarfs/super.o
The second occurs when a version of clang that does not support zicsr or
zifencei via '-march=' (i.e., <= 16.x) is used in combination with a
version of GNU as that defaults to a newer ISA base spec, which requires
specifying zicsr and zifencei in the '-march=' value explicitly (i.e, >=
2.38):
../arch/riscv/kernel/kexec_relocate.S: Assembler messages:
../arch/riscv/kernel/kexec_relocate.S:147: Error: unrecognized opcode `fence.i', extension `zifencei' required
clang-12: error: assembler command failed with exit code 1 (use -v to see invocation)
This is the same issue addressed by commit 6df2a016c0 ("riscv: fix
build with binutils 2.38") (see [2] for additional information) but
older versions of clang miss out on it because the cc-option check
fails:
clang-12: error: invalid arch name 'rv64imac_zicsr_zifencei', unsupported standard user-level extension 'zicsr'
clang-12: error: invalid arch name 'rv64imac_zicsr_zifencei', unsupported standard user-level extension 'zicsr'
To resolve the first issue, only attempt to add zicsr and zifencei to
the march string when using the GNU assembler 2.38 or newer, which is
when the default ISA spec was updated, requiring these extensions to be
specified explicitly. LLVM implements an older version of the base
specification for all currently released versions, so these instructions
are available as part of the 'i' extension. If LLVM's implementation is
updated in the future, a CONFIG_AS_IS_LLVM condition can be added to
CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI.
To resolve the second issue, use version 2.2 of the base ISA spec when
using an older version of clang that does not support zicsr or zifencei
via '-march=', as that is the spec version most compatible with the one
clang/LLVM implements and avoids the need to specify zicsr and zifencei
explicitly due to still being a part of 'i'.
[1]: 22e199e6af
[2]: https://lore.kernel.org/ZAxT7T9Xy1Fo3d5W@aurel32.net/
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1808
Co-developed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230313-riscv-zicsr-zifencei-fiasco-v1-1-dd1b7840a551@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
reconnecting using the same local TCP port. Unique NFS operations are
identified by the per-TCP connection set of XIDs. This prevents file
corruption when non-idempotent NFS operations are retried.
Currently, NFS client TCP connections are using different local TCP ports
when reconnecting to NFS servers.
After an NFS server initiates shutdown of the TCP connection, the NFS
client's TCP socket is set to NULL after the socket state has reached
TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
the same local port but fails with EADDRNOTAVAIL (99). This forces the
socket to use a different local TCP port to reconnect to the remote NFS
server.
State Transition and Events:
TCP_CLOSE_WAIT(8)
TCP_LAST_ACK(9)
connect(fail EADDRNOTAVAIL(99))
TCP_CLOSE(7)
bind on new port
connect success
dmesg excerpts showing reconnect switching from TCP local port of 926 to
763 after commit 7c81e6a9d7:
[13354.947854] NFS call mkdir testW
...
[13405.654781] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.654813] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
[13405.654826] RPC: xs_data_ready...
[13405.654892] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.654895] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[13405.654899] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.654900] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[13405.654950] RPC: xs_connect scheduled xprt 00000000037d0f03
[13405.654975] RPC: xs_bind 0.0.0.0:926: ok (0)
[13405.654980] RPC: worker connecting xprt 00000000037d0f03 via tcp
to 10.101.6.228 (port 2049)
[13405.654991] RPC: 00000000037d0f03 connect status 99 connected 0
sock state 7
[13405.655001] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.655002] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[13405.655024] RPC: xs_connect scheduled xprt 00000000037d0f03
[13405.655038] RPC: xs_bind 0.0.0.0:763: ok (0)
[13405.655041] RPC: worker connecting xprt 00000000037d0f03 via tcp
to 10.101.6.228 (port 2049)
[13405.655065] RPC: 00000000037d0f03 connect status 115 connected 0
sock state 2
State Transition and Events with patch applied:
TCP_CLOSE_WAIT(8)
TCP_LAST_ACK(9)
TCP_CLOSE(7)
connect(reuse of port succeeds)
dmesg excerpts showing reconnect on same TCP local port of 936 with patch
applied:
[ 257.139935] NFS: mkdir(0:59/560857152), testQ
[ 257.139937] NFS call mkdir testQ
...
[ 307.822702] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
[ 307.822714] RPC: xs_data_ready...
[ 307.822817] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.822821] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.822825] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.822826] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.823606] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.823609] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.823629] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.823632] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.823676] RPC: xs_connect scheduled xprt 00000000ce702f14
[ 307.823704] RPC: xs_bind 0.0.0.0:936: ok (0)
[ 307.823709] RPC: worker connecting xprt 00000000ce702f14 via tcp
to 10.101.1.30 (port 2049)
[ 307.823748] RPC: 00000000ce702f14 connect status 115 connected 0
sock state 2
...
[ 314.916193] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 314.916251] RPC: xs_connect scheduled xprt 00000000ce702f14
[ 314.916282] RPC: xs_bind 0.0.0.0:936: ok (0)
[ 314.916292] RPC: worker connecting xprt 00000000ce702f14 via tcp
to 10.101.1.30 (port 2049)
[ 314.916342] RPC: 00000000ce702f14 connect status 115 connected 0
sock state 2
Fixes: 7c81e6a9d7 ("SUNRPC: Tweak TCP socket shutdown in the RPC client")
Signed-off-by: Siddharth Rajendra Kawar <sikawar@microsoft.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.3
- send Identify with CNS 06h only to I/O controllers (Martin George)
- fix nvme_tcp_term_pdu to match spec (Caleb Sander)"
* tag 'nvme-6.3-2023-03-23' of git://git.infradead.org/nvme:
nvme-tcp: fix nvme_tcp_term_pdu to match spec
nvme: send Identify with CNS 06h only to I/O controllers
It turns out that reverting commit 970343cd49 ("GFS2: free disk inode
which is deleted by remote node -V2") causes a regression related to
evicting inodes that were unlinked on a different cluster node.
We could also have simply added a call to d_mark_dontcache() to function
gfs2_try_evict(), but the original pre-revert code is better tested and
proven.
This reverts commit 445cb1277e.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
When in dual role mode (dr_mode == USB_DR_MODE_OTG), platform probe
successively basically calls:
- dwc2_gadget_init()
- dwc2_hcd_init()
- dwc2_lowlevel_hw_disable() since recent change [1]
- usb_add_gadget_udc()
The PHYs (and so the clocks it may provide) shouldn't be disabled for all
SoCs, in OTG mode, as the HCD part has been initialized.
On STM32 this creates some weird race condition upon boot, when:
- initially attached as a device, to a HOST
- and there is a gadget script invoked to setup the device part.
Below issue becomes systematic, as long as the gadget script isn't
started by userland: the hardware PHYs (and so the clocks provided by the
PHYs) remains disabled.
It ends up in having an endless interrupt storm, before the watchdog
resets the platform.
[ 16.924163] dwc2 49000000.usb-otg: EPs: 9, dedicated fifos, 952 entries in SPRAM
[ 16.962704] dwc2 49000000.usb-otg: DWC OTG Controller
[ 16.966488] dwc2 49000000.usb-otg: new USB bus registered, assigned bus number 2
[ 16.974051] dwc2 49000000.usb-otg: irq 77, io mem 0x49000000
[ 17.032170] hub 2-0:1.0: USB hub found
[ 17.042299] hub 2-0:1.0: 1 port detected
[ 17.175408] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.181741] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.189303] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
...
The host part is also not functional, until the gadget part is configured.
The HW may only be disabled for peripheral mode (original init), e.g.
dr_mode == USB_DR_MODE_PERIPHERAL, until the gadget driver initializes.
But when in USB_DR_MODE_OTG, the HW should remain enabled, as the HCD part
is able to run, while the gadget part isn't necessarily configured.
I don't fully get the of purpose the original change, that claims disabling
the hardware is missing. It creates conditions on SOCs using the PHY
initialization to be completely non working in OTG mode. Original
change [1] should be reworked to be platform specific.
[1] https://lore.kernel.org/r/20221206-dwc2-gadget-dual-role-v1-2-36515e1092cd@theobroma-systems.com
Fixes: ade23d7b7e ("usb: dwc2: power on/off phy for peripheral mode in dual-role mode")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Tested-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Link: https://lore.kernel.org/r/20230315144433.3095859-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Each time the platform goes to low power, PM suspend / resume routines
call: __dwc2_lowlevel_hw_enable -> devm_add_action_or_reset().
This adds a new devres each time.
This may also happen at runtime, as dwc2_lowlevel_hw_enable() can be
called from udc_start().
This can be seen with tracing:
- echo 1 > /sys/kernel/debug/tracing/events/dev/devres_log/enable
- go to low power
- cat /sys/kernel/debug/tracing/trace
A new "ADD" entry is found upon each low power cycle:
... devres_log: 49000000.usb-otg ADD 82a13bba devm_action_release (8 bytes)
... devres_log: 49000000.usb-otg ADD 49889daf devm_action_release (8 bytes)
...
A second issue is addressed here:
- regulator_bulk_enable() is called upon each PM cycle (suspend/resume).
- regulator_bulk_disable() never gets called.
So the reference count for these regulators constantly increase, by one
upon each low power cycle, due to missing regulator_bulk_disable() call
in __dwc2_lowlevel_hw_disable().
The original fix that introduced the devm_add_action_or_reset() call,
fixed an issue during probe, that happens due to other errors in
dwc2_driver_probe() -> dwc2_core_reset(). Then the probe fails without
disabling regulators, when dr_mode == USB_DR_MODE_PERIPHERAL.
Rather fix the error path: disable all the low level hardware in the
error path, by using the "hsotg->ll_hw_enabled" flag. Checking dr_mode
has been introduced to avoid a dual call to dwc2_lowlevel_hw_disable().
"ll_hw_enabled" should achieve the same (and is used currently in the
remove() routine).
Fixes: 54c1960605 ("usb: dwc2: Always disable regulators on driver teardown")
Fixes: 33a06f1300 ("usb: dwc2: Fix error path in gadget registration")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20230316084127.126084-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull zonefs fixes from Damien Le Moal:
- Silence a false positive smatch warning about an uninitialized
variable
- Fix an error message to provide more useful information about invalid
zone append write results
* tag 'zonefs-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: Fix error message in zonefs_file_dio_append()
zonefs: Prevent uninitialized symbol 'size' warning
In the output of /proc/fs/cifs/open_files, we only print
the tree id for the tcon of each open file. It becomes
difficult to know which tcon these files belong to with
just the tree id.
This change dumps ses id in addition to all other data today.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Currently, we only dump the pending mid information only
on the primary channel in /proc/fs/cifs/DebugData.
If multichannel is active, we do not print the pending MID
list on secondary channels.
This change will dump the pending mids for all the channels
based on server->conn_id.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
When querying server interfaces returns -EOPNOTSUPP,
clear the list of interfaces. Assumption is that multichannel
would be disabled too.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
We have the server interface list hanging off the tcon
structure today for reasons unknown. So each tcon which is
connected to a file server can query them separately,
which is really unnecessary. To avoid this, in the query
function, we will check the time of last update of the
interface list, and avoid querying the server if it is
within a certain range.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
The logic in efi_random_alloc() will iterate over the memory map twice,
once to count the number of candidate slots, and another time to locate
the chosen slot after randomization.
If there is insufficient memory to do the allocation, the second loop
will run to completion without actually having located a slot, but we
currently return EFI_SUCCESS in this case, as we fail to initialize
status to the appropriate error value of EFI_OUT_OF_RESOURCES.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
[Why]
The commit c76e483cd9 ("drm/amd/display: Don't restrict bpc to 8 bpc")
removes the historical 8bpc dependency and sets max_bpc to 16.
[How]
The comment that states "8bpc for non-edp" needs to be removed as well.
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sriov needs to enter/exit safe mode in update umd p state
add the cg flag to let it enter or exit while needed
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. The fences in this case
are never signaled and next time when we try to free the sa_bo, kernel
will hang.
[How]
During pre_asic_reset, driver will clear job fences and afterwards the
fences' refcount will be reduced to 1. For drm_sched_jobs it will be
released in job_free_cb, and for non-sched jobs like ib_test, it's meant
to be released in sa_bo_free but only when the fences are signaled. So
we have to force signal the non_sched bad job's fence during
pre_asic_reset or the clear is not complete.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
when gfx do soft reset, mes will also do reset, if mes is not
resumed when do recover from soft reset, mes is unable to respond
in later sequence
[how]
resume mes when do gfx post soft reset
Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For GC IP v11.0.4/11, PSP TMR need to be reserved
for ASIC mode2 reset. But for S4, when psp suspend,
it will destroy the TMR that fails the ASIC reset.
[ 96.006101] amdgpu 0000:62:00.0: amdgpu: MODE2 reset
[ 100.409717] amdgpu 0000:62:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000011 SMN_C2PMSG_82:0x00000002
[ 100.411593] amdgpu 0000:62:00.0: amdgpu: Mode2 reset failed!
[ 100.412470] amdgpu 0000:62:00.0: PM: pci_pm_freeze(): amdgpu_pmops_freeze+0x0/0x50 [amdgpu] returns -62
[ 100.414020] amdgpu 0000:62:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -62
[ 100.415311] amdgpu 0000:62:00.0: PM: pci_pm_freeze+0x0/0xd0 returned -62 after 4623202 usecs
[ 100.416608] amdgpu 0000:62:00.0: PM: failed to freeze async: error -62
We can skip the reset on APUs, assuming we can resume them
properly. Verified on some GFX11, GFX10 and old GFX9 APUs.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x