linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-09 12:17:12 +09:00

Author	SHA1	Message	Date
Ryder Lee	ab0eec4bf2	wifi: mt76: mt7996: enable BSS_CHANGED_MCAST_RATE support Similar to BSS_CHANGED_BASIC_RATES, this enables mcast rate configuration through fixed rate tables. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Change-Id: Ifc305e8c7de9a7df4ad5f856e2097d721a886aaa Signed-off-by: Felix Fietkau <nbd@nbd.name>	2023-04-19 10:09:43 +02:00
Ryder Lee	15ee62e737	wifi: mt76: mt7996: enable BSS_CHANGED_BASIC_RATES support The connac3 removes fixed rate fields to reduce txd size and introduces global rate tables (64 entries) for rate setting. Driver needs to fill the corresponding idx in MT_TXD6_TX_RATE while tx, and push mt76_rate into predifined table at bootup stage so that mvif->basic_rates_idx can immediately switch out once setting changes. spe_idx is also needed for fixed rate frames, and will be updated by future patches. Note that all table entries are shared across driver and firmware (i.e.TxBF), hence adding MT7996_BASIC_RATES_TBL to reflect mapping status. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2023-04-19 10:09:43 +02:00
David S. Miller	ed7f9c01e2	Merge branch 'mptcp-fixes' Matthieu Baerts says: ==================== mptcp: fixes around listening sockets and the MPTCP worker Christoph Paasch reported a couple of issues found by syzkaller and linked to operations done by the MPTCP worker on (un)accepted sockets. Fixing these issues was not obvious and rather complex but Paolo Abeni nicely managed to propose these excellent patches that seem to satisfy syzkaller. Patch 1 partially reverts a recent fix but while still providing a solution for the previous issue, it also prevents the MPTCP worker from running concurrently with inet_csk_listen_stop(). A warning is then avoided. The partially reverted patch has been introduced in v6.3-rc3, backported up to v6.1 and fixing an issue visible from v5.18. Patch 2 prevents the MPTCP worker to race with mptcp_accept() causing a UaF when a fallback to TCP is done while in parallel, the socket is being accepted by the userspace. This is also a fix of a previous fix introduced in v6.3-rc3, backported up to v6.1 but here fixing an issue that is in theory there from v5.7. There is no need to backport it up to here as it looks like it is only visible later, around v5.18, see the previous cover-letter linked to this original fix. ==================== Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>	2023-04-19 09:08:37 +01:00
Paolo Abeni	63740448a3	mptcp: fix accept vs worker race The mptcp worker and mptcp_accept() can race, as reported by Christoph: refcount_t: addition on 0; use-after-free. WARNING: CPU: 1 PID: 14351 at lib/refcount.c:25 refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25 Modules linked in: CPU: 1 PID: 14351 Comm: syz-executor.2 Not tainted 6.3.0-rc1-gde5e8fd0123c #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25 Code: 02 31 ff 89 de e8 1b f0 a7 ff 84 db 0f 85 6e ff ff ff e8 3e f5 a7 ff 48 c7 c7 d8 c7 34 83 c6 05 6d 2d 0f 02 01 e8 cb 3d 90 ff <0f> 0b e9 4f ff ff ff e8 1f f5 a7 ff 0f b6 1d 54 2d 0f 02 31 ff 89 RSP: 0018:ffffc90000a47bf8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88802eae98c0 RSI: ffffffff81097d4f RDI: 0000000000000001 RBP: ffff88802e712180 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: ffff88802eaea148 R12: ffff88802e712100 R13: ffff88802e712a88 R14: ffff888005cb93a8 R15: ffff88802e712a88 FS: 0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f277fd89120 CR3: 0000000035486002 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __refcount_add include/linux/refcount.h:199 [inline] __refcount_inc include/linux/refcount.h:250 [inline] refcount_inc include/linux/refcount.h:267 [inline] sock_hold include/net/sock.h:775 [inline] __mptcp_close+0x4c6/0x4d0 net/mptcp/protocol.c:3051 mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072 inet_release+0x56/0xa0 net/ipv4/af_inet.c:429 __sock_release+0x51/0xf0 net/socket.c:653 sock_close+0x18/0x20 net/socket.c:1395 __fput+0x113/0x430 fs/file_table.c:321 task_work_run+0x96/0x100 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0x4fc/0x10c0 kernel/exit.c:869 do_group_exit+0x51/0xf0 kernel/exit.c:1019 get_signal+0x12b0/0x1390 kernel/signal.c:2859 arch_do_signal_or_restart+0x25/0x260 arch/x86/kernel/signal.c:306 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x131/0x1a0 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x19/0x40 kernel/entry/common.c:296 do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7fec4b4926a9 Code: Unable to access opcode bytes at 0x7fec4b49267f. RSP: 002b:00007fec49f9dd78 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffe00 RBX: 00000000006bc058 RCX: 00007fec4b4926a9 RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006bc058 RBP: 00000000006bc050 R08: 00000000007df998 R09: 00000000007df998 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 000000000000000b R15: 000000000001fe40 </TASK> The root cause is that the worker can force fallback to TCP the first mptcp subflow, actually deleting the unaccepted msk socket. We can explicitly prevent the race delaying the unaccepted msk deletion at listener shutdown time. In case the closed subflow is later accepted, just drop the mptcp context and let the user-space deal with the paired mptcp socket. Fixes: `b6985b9b82` ("mptcp: use the workqueue to destroy unaccepted sockets") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/375 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 09:08:37 +01:00
Paolo Abeni	2a6a870e44	mptcp: stops worker on unaccepted sockets at listener close This is a partial revert of the blamed commit, with a relevant change: mptcp_subflow_queue_clean() now just change the msk socket status and stop the worker, so that the UaF issue addressed by the blamed commit is not re-introduced. The above prevents the mptcp worker from running concurrently with inet_csk_listen_stop(), as such race would trigger a warning, as reported by Christoph: RSP: 002b:00007f784fe09cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e WARNING: CPU: 0 PID: 25807 at net/ipv4/inet_connection_sock.c:1387 inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387 RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f7850afd6a9 RDX: 0000000000000000 RSI: 0000000020000340 RDI: 0000000000000004 Modules linked in: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40 </TASK> CPU: 0 PID: 25807 Comm: syz-executor.7 Not tainted 6.2.0-g778e54711659 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387 RAX: 0000000000000000 RBX: ffff888100dfbd40 RCX: 0000000000000000 RDX: ffff8881363aab80 RSI: ffffffff81c494f4 RDI: 0000000000000005 RBP: ffff888126dad080 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff888100dfe040 R13: 0000000000000001 R14: 0000000000000000 R15: ffff888100dfbdd8 FS: 00007f7850a2c800(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32d26000 CR3: 000000012fdd8006 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> __tcp_close+0x5b2/0x620 net/ipv4/tcp.c:2875 __mptcp_close_ssk+0x145/0x3d0 net/mptcp/protocol.c:2427 mptcp_destroy_common+0x8a/0x1c0 net/mptcp/protocol.c:3277 mptcp_destroy+0x41/0x60 net/mptcp/protocol.c:3304 __mptcp_destroy_sock+0x56/0x140 net/mptcp/protocol.c:2965 __mptcp_close+0x38f/0x4a0 net/mptcp/protocol.c:3057 mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072 inet_release+0x53/0xa0 net/ipv4/af_inet.c:429 __sock_release+0x4e/0xf0 net/socket.c:651 sock_close+0x15/0x20 net/socket.c:1393 __fput+0xff/0x420 fs/file_table.c:321 task_work_run+0x8b/0xe0 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x113/0x120 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x40 kernel/entry/common.c:296 do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f7850af70dc RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f7850af70dc RDX: 00007f7850a2c800 RSI: 0000000000000002 RDI: 0000000000000003 RBP: 00000000006bd980 R08: 0000000000000000 R09: 00000000000018a0 R10: 00000000316338a4 R11: 0000000000000293 R12: 0000000000211e31 R13: 00000000006bc05c R14: 00007f785062c000 R15: 0000000000211af0 Fixes: `0a3f4f1f9c` ("mptcp: fix UaF in listener shutdown") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/371 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 09:08:36 +01:00
Pablo Neira Ayuso	2cdaa3eefe	netfilter: conntrack: restore IPS_CONFIRMED out of nf_conntrack_hash_check_insert() `e6d57e9ff0` ("netfilter: conntrack: fix rmmod double-free race") consolidates IPS_CONFIRMED bit set in nf_conntrack_hash_check_insert(). However, this breaks ctnetlink: # conntrack -I -p tcp --timeout 123 --src 1.2.3.4 --dst 5.6.7.8 --state ESTABLISHED --sport 1 --dport 4 -u SEEN_REPLY conntrack v1.4.6 (conntrack-tools): Operation failed: Device or resource busy This is a partial revert of the aforementioned commit to restore IPS_CONFIRMED. Fixes: `e6d57e9ff0` ("netfilter: conntrack: fix rmmod double-free race") Reported-by: Stéphane Graber <stgraber@stgraber.org> Tested-by: Stéphane Graber <stgraber@stgraber.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-04-19 10:07:59 +02:00
Alexander Aring	4e006c7a6d	net: rpl: fix rpl header size calculation This patch fixes a missing 8 byte for the header size calculation. The ipv6_rpl_srh_size() is used to check a skb_pull() on skb->data which points to skb_transport_header(). Currently we only check on the calculated addresses fields using CmprI and CmprE fields, see: https://www.rfc-editor.org/rfc/rfc6554#section-3 there is however a missing 8 byte inside the calculation which stands for the fields before the addresses field. Those 8 bytes are represented by sizeof(struct ipv6_rpl_sr_hdr) expression. Fixes: `8610c7c6e3` ("net: ipv6: add support for rpl sr exthdr") Signed-off-by: Alexander Aring <aahringo@redhat.com> Reported-by: maxpl0it <maxpl0it@protonmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 09:04:16 +01:00
Seiji Nishikawa	6f4833383e	net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete() When vmxnet3_rq_create() fails to allocate rq->data_ring.base due to page allocation failure, subsequent call to vmxnet3_rq_rx_complete() can result in NULL pointer dereference. To fix this bug, check not only that rxDataRingUsed is true but also that adapter->rxdataring_enabled is true before calling memcpy() in vmxnet3_rq_rx_complete(). [1728352.477993] ethtool: page allocation failure: order:9, mode:0x6000c0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0 ... [1728352.478009] Call Trace: [1728352.478028] dump_stack+0x41/0x60 [1728352.478035] warn_alloc.cold.120+0x7b/0x11b [1728352.478038] ? _cond_resched+0x15/0x30 [1728352.478042] ? __alloc_pages_direct_compact+0x15f/0x170 [1728352.478043] __alloc_pages_slowpath+0xcd3/0xd10 [1728352.478047] __alloc_pages_nodemask+0x2e2/0x320 [1728352.478049] __dma_direct_alloc_pages.constprop.25+0x8a/0x120 [1728352.478053] dma_direct_alloc+0x5a/0x2a0 [1728352.478056] vmxnet3_rq_create.part.57+0x17c/0x1f0 [vmxnet3] ... [1728352.478188] vmxnet3 0000:0b:00.0 ens192: rx data ring will be disabled ... [1728352.515347] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034 ... [1728352.515440] RIP: 0010:memcpy_orig+0x54/0x130 ... [1728352.515655] Call Trace: [1728352.515665] <IRQ> [1728352.515672] vmxnet3_rq_rx_complete+0x419/0xef0 [vmxnet3] [1728352.515690] vmxnet3_poll_rx_only+0x31/0xa0 [vmxnet3] ... Signed-off-by: Seiji Nishikawa <snishika@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 09:03:05 +01:00
David S. Miller	cd02a1a248	Merge branch 'mlx5e-xdp-extend' Tariq Toukan says: ==================== net/mlx5e: Extend XDP multi-buffer capabilities This series extends the XDP multi-buffer support in the mlx5e driver. Patchset breakdown: - Infrastructural changes and preparations. - Add XDP multi-buffer support for XDP redirect-in. - Use TX MPWQE (multi-packet WQE) HW feature for non-linear single-segmented XDP frames. - Add XDP multi-buffer support for striding RQ. In Striding RQ, we overcome the lack of headroom and tailroom between the RQ strides by allocating a side page per packet and using it for the xdp_buff descriptor. We structure the xdp_buff so that it contains nothing in the linear part, and the whole packet resides in the fragments. Performance highlight: Packet rate test, 64 bytes, 32 channels, MTU 9000 bytes. CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz. NIC: ConnectX-6 Dx, at 100 Gbps. +----------+-------------+-------------+---------+ \| Test \| Legacy RQ \| Striding RQ \| Speedup \| +----------+-------------+-------------+---------+ \| XDP_DROP \| 101,615,544 \| 117,191,020 \| +15% \| +----------+-------------+-------------+---------+ \| XDP_TX \| 95,608,169 \| 117,043,422 \| +22% \| +----------+-------------+-------------+---------+ Series generated against net commit: `e61caf04b9` Merge branch 'page_pool-allow-caching-from-safely-localized-napi' I'm submitting this directly as Saeed is traveling. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:27 +01:00
Tariq Toukan	f52ac7028b	net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ Here we add support for multi-buffer XDP handling in Striding RQ, which is our default out-of-the-box RQ type. Before this series, loading such an XDP program would fail, until you switch to the legacy RQ (by unsetting the rx_striding_rq priv-flag). To overcome the lack of headroom and tailroom between the strides, we allocate a side page to be used for the descriptor (xdp_buff / skb) and the linear part. When an XDP program is attached, we structure the xdp_buff so that it contains no data in the linear part, and the whole packet resides in the fragments. In case of XDP_PASS, where an SKB still needs to be created, we copy up to 256 bytes to its linear part, to match the current behavior, and satisfy functions that assume finding the packet headers in the SKB linear part (like eth_type_trans). Performance testing: Packet rate test, 64 bytes, 32 channels, MTU 9000 bytes. CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz. NIC: ConnectX-6 Dx, at 100 Gbps. +----------+-------------+-------------+---------+ \| Test \| Legacy RQ \| Striding RQ \| Speedup \| +----------+-------------+-------------+---------+ \| XDP_DROP \| 101,615,544 \| 117,191,020 \| +15% \| +----------+-------------+-------------+---------+ \| XDP_TX \| 95,608,169 \| 117,043,422 \| +22% \| +----------+-------------+-------------+---------+ Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:27 +01:00
Tariq Toukan	2cb0e27d43	net/mlx5e: RX, Prepare non-linear striding RQ for XDP multi-buffer support In preparation for supporting XDP multi-buffer in striding RQ, use xdp_buff struct to describe the packet. Make its skb_shared_info collide the one of the allocated SKB, then add the fragments using the xdp_buff API. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:27 +01:00
Tariq Toukan	221c8c7ad7	net/mlx5e: RX, Generalize mlx5e_fill_mxbuf() Make the function more generic. Let it get an additional frame_sz parameter instead of deriving it from the RQ struct. No functional change here, just a preparation for a downstream patch. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:27 +01:00
Tariq Toukan	27602319e3	net/mlx5e: RX, Take shared info fragment addition into a function Introduce mlx5e_add_skb_shared_info_frag(), a function dedicated for adding a fragment into a struct skb_shared_info object. Use it in the Legacy RQ flow. Similar usage will be added in a downstream patch by the corresponding Striding RQ flow. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:27 +01:00
Tariq Toukan	63abf14e13	net/mlx5e: XDP, Allow non-linear single-segment frames in XDP TX MPWQE Under a few restrictions, TX MPWQE feature can serve multiple TX packets in a single TX descriptor. It requires each of the packets to have a single scatter entry / segment. Today we allow only linear frames to use this feature, although there's no real problem with non-linear ones where the whole packet reside in the first fragment. Expand the XDP TX MPWQE feature support to include such frames. This is in preparation for the downstream patch, in which we will generate such non-linear frames. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:26 +01:00
Tariq Toukan	124d0d8daf	net/mlx5e: XDP, Remove un-established assumptions on XDP buffer Remove the assumption of non-zero linear length in the XDP xmit function, used to serve both internal XDP_TX operations as well as redirected-in requests. Do not apply the MLX5E_XDP_MIN_INLINE check unless necessary. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:26 +01:00
Tariq Toukan	20409abe52	net/mlx5e: XDP, Consider large muti-buffer packets in Striding RQ params calculations Function mlx5e_rx_get_linear_stride_sz() returns PAGE_SIZE immediately in case an XDP program is attached. The more accurate formula is ALIGN(sz, PAGE_SIZE), to prevent two packets from residing on the same page. The assumption behind the current code is that sz <= PAGE_SIZE holds for all cases with XDP program set. This is true because it is being called from: - 3 times from Striding RQ flows, in which XDP is not supported for such large packets. - 1 time from Legacy RQ flow, under the condition mlx5e_rx_is_linear_skb(). No functional change here, just removing the implied assumption in preparation for supporting XDP multi-buffer in Striding RQ. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:26 +01:00
Tariq Toukan	abd3f84eca	net/mlx5e: XDP, Let XDP checker function get the params as input Change mlx5e_xdp_allowed() so it gets the params structure with the xdp_prog applied, rather than creating a local copy based on the current params in priv. This reduces the amount of memory on the stack, and acts on the exact params instance that's about to be applied. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:26 +01:00
Tariq Toukan	7fc06dd2ae	net/mlx5e: XDP, Improve Striding RQ check with XDP Non-linear mem scheme of Striding RQ does not yet support XDP at this point. Take the check where it belongs, inside the params validation function mlx5e_params_validate_xdp(). Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:26 +01:00
Tariq Toukan	c1783e74fc	net/mlx5e: XDP, Add support for multi-buffer XDP redirect-in Handle multi-buffer XDP redirect-in requests coming through mlx5e_xdp_xmit. Extend struct mlx5e_xmit_data_frags with an additional dma_arr field, to point to the fragments dma mapping, as they cannot be retrieved via the page_pool_get_dma_addr() function. Push a dma_addr xdpi instance per each fragment, and use them in the completion flow to dma_unmap the frags. Finally, remove the restriction in mlx5e_open_xdpsq, and set the flag in xdp_features. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:26 +01:00
Tariq Toukan	3f734b8c59	net/mlx5e: XDP, Use multiple single-entry objects in xdpi_fifo Here we fix the current wi->num_pkts abuse, as it was used to indicate multiple xdpi entries in the xdpi_fifo. Instead, reduce mlx5e_xdp_info to the size of a single field, making it a union of unions. Per packet, use as many instances as needed to provide the information needed at the time of completion. The sequence of xdpi instances pushed is well defined, derived by the xmit_mode. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:25 +01:00
Tariq Toukan	3a48ba12b4	net/mlx5e: XDP, Remove doubtful unlikely calls It is not likely nor unlikely that the xdp buff has fragments, it depends on the program loaded and size of the packet received. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:25 +01:00
Tariq Toukan	eb9b9fdcaf	net/mlx5e: Introduce extended version for mlx5e_xmit_data Introduce struct mlx5e_xmit_data_frags to be used for non-linear xmit buffers. Let it include sinfo pointer. Take one bit from the len field to indicate if the descriptor has fragments and can be casted-up into the extended version. Zero-init to make sure has_frags, and potentially future fields, are zero when not explicitly assigned. Another field will be added in a downstream patch to indicate and point to dma addresses of the different frags, for redirect-in requests. This simplifies the mlx5e_xmit_xdp_frame/mlx5e_xmit_xdp_frame_mpwqe functions params. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:25 +01:00
Tariq Toukan	e32654f198	net/mlx5e: Move struct mlx5e_xmit_data to datapath header Move TX datapath struct from the generic en.h to the datapath txrx.h header, where it belongs. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:25 +01:00
Tariq Toukan	aebc62d336	net/mlx5e: Move XDP struct and enum to XDP header Move struct mlx5e_xdp_info and enum mlx5e_xdp_xmit_mode from the generic en.h to the XDP header, where they belong. Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:59:25 +01:00
Ido Schimmel	c484fcc058	bonding: Fix memory leak when changing bond type to Ethernet When a net device is put administratively up, its 'IFF_UP' flag is set (if not set already) and a 'NETDEV_UP' notification is emitted, which causes the 8021q driver to add VLAN ID 0 on the device. The reverse happens when a net device is put administratively down. When changing the type of a bond to Ethernet, its 'IFF_UP' flag is incorrectly cleared, resulting in the kernel skipping the above process and VLAN ID 0 being leaked [1]. Fix by restoring the flag when changing the type to Ethernet, in a similar fashion to the restoration of the 'IFF_SLAVE' flag. The issue can be reproduced using the script in [2], with example out before and after the fix in [3]. [1] unreferenced object 0xffff888103479900 (size 256): comm "ip", pid 329, jiffies 4294775225 (age 28.561s) hex dump (first 32 bytes): 00 a0 0c 15 81 88 ff ff 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0 [<ffffffff8406426c>] vlan_vid_add+0x30c/0x790 [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0 [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0 [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150 [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0 [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180 [<ffffffff8379af36>] do_setlink+0xb96/0x4060 [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0 [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0 [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00 [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839a738f>] netlink_unicast+0x53f/0x810 [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90 [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70 [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0 unreferenced object 0xffff88810f6a83e0 (size 32): comm "ip", pid 329, jiffies 4294775225 (age 28.561s) hex dump (first 32 bytes): a0 99 47 03 81 88 ff ff a0 99 47 03 81 88 ff ff ..G.......G..... 81 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc ................ backtrace: [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0 [<ffffffff84064369>] vlan_vid_add+0x409/0x790 [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0 [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0 [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150 [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0 [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180 [<ffffffff8379af36>] do_setlink+0xb96/0x4060 [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0 [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0 [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00 [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839a738f>] netlink_unicast+0x53f/0x810 [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90 [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70 [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0 [2] ip link add name t-nlmon type nlmon ip link add name t-dummy type dummy ip link add name t-bond type bond mode active-backup ip link set dev t-bond up ip link set dev t-nlmon master t-bond ip link set dev t-nlmon nomaster ip link show dev t-bond ip link set dev t-dummy master t-bond ip link show dev t-bond ip link del dev t-bond ip link del dev t-dummy ip link del dev t-nlmon [3] Before: 12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/netlink 12: t-bond: <BROADCAST,MULTICAST,MASTER,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 46:57:39:a4:46:a2 brd ff:ff:ff:ff:ff:ff After: 12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/netlink 12: t-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 66:48:7b:74:b6:8a brd ff:ff:ff:ff:ff:ff Fixes: `e36b9d16c6` ("bonding: clean muticast addresses when device changes type") Fixes: `75c78500dd` ("bonding: remap muticast addresses without using dev_close() and dev_open()") Fixes: `9ec7eb60dc` ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Link: https://lore.kernel.org/netdev/78a8a03b-6070-3e6b-5042-f848dab16fb8@alu.unizg.hr/ Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 08:55:27 +01:00
Hans de Goede	ef16799640	wifi: iwlwifi: dvm: Fix memcpy: detected field-spanning write backtrace A received TKIP key may be up to 32 bytes because it may contain MIC rx/tx keys too. These are not used by iwl and copying these over overflows the iwl_keyinfo.key field. Add a check to not copy more data to iwl_keyinfo.key then will fit. This fixes backtraces like this one: memcpy: detected field-spanning write (size 32) of single field "sta_cmd.key.key" at drivers/net/wireless/intel/iwlwifi/dvm/sta.c:1103 (size 16) WARNING: CPU: 1 PID: 946 at drivers/net/wireless/intel/iwlwifi/dvm/sta.c:1103 iwlagn_send_sta_key+0x375/0x390 [iwldvm] <snip> Hardware name: Dell Inc. Latitude E6430/0H3MT5, BIOS A21 05/08/2017 RIP: 0010:iwlagn_send_sta_key+0x375/0x390 [iwldvm] <snip> Call Trace: <TASK> iwl_set_dynamic_key+0x1f0/0x220 [iwldvm] iwlagn_mac_set_key+0x1e4/0x280 [iwldvm] drv_set_key+0xa4/0x1b0 [mac80211] ieee80211_key_enable_hw_accel+0xa8/0x2d0 [mac80211] ieee80211_key_replace+0x22d/0x8e0 [mac80211] <snip> Link: https://www.alionet.org/index.php?topic=1469.0 Link: https://lore.kernel.org/linux-wireless/20230218191056.never.374-kees@kernel.org/ Link: https://lore.kernel.org/linux-wireless/68760035-7f75-1b23-e355-bfb758a87d83@redhat.com/ Cc: Kees Cook <keescook@chromium.org> Suggested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2023-04-19 09:42:28 +02:00
Rob Herring	0d19bd4df7	ALSA: Use of_property_read_bool() for boolean properties It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. Convert reading boolean properties to to of_property_read_bool(). Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230310144734.1546587-1-robh@kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de>	2023-04-19 08:26:08 +02:00
Rob Herring	d42c521ff4	ALSA: ppc/tumbler: Use of_property_present() for testing DT property presence It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. As part of this, convert of_get_property/of_find_property calls to the recently added of_property_present() helper when we just want to test for presence of a property and nothing more. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230310144733.1546500-1-robh@kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de>	2023-04-19 08:25:57 +02:00
Alain Volmat	14cac66223	net: ethernet: stmmac: dwmac-sti: remove stih415/stih416/stid127 Remove no more supported platforms (stih415/stih416 and stid127) Signed-off-by: Alain Volmat <avolmat@me.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20230416195523.61075-1-avolmat@me.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-18 21:29:19 -07:00
Arnd Bergmann	33d74c8ff5	net: mscc: ocelot: remove incompatible prototypes The types for the register argument changed recently, but there are still incompatible prototypes that got left behind, and gcc-13 warns about these: In file included from drivers/net/ethernet/mscc/ocelot.c:13: drivers/net/ethernet/mscc/ocelot.h:97:5: error: conflicting types for 'ocelot_port_readl' due to enum/integer mismatch; have 'u32(struct ocelot_port , u32)' {aka 'unsigned int(struct ocelot_port , unsigned int)'} [-Werror=enum-int-mismatch] 97 \| u32 ocelot_port_readl(struct ocelot_port *port, u32 reg); \| ^~~~~~~~~~~~~~~~~ Just remove the two prototypes, and rely on the copy in the global header. Fixes: `9ecd05794b` ("net: mscc: ocelot: strengthen type of "u32 reg" in I/O accessors") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230417205531.1880657-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-18 21:13:23 -07:00
Corinna Vinschen	6b2c6e4a93	net: stmmac: propagate feature flags to vlan stmmac_dev_probe doesn't propagate feature flags to VLANs. So features like offloading don't correspond with the general features and it's not possible to manipulate features via ethtool -K to affect VLANs. Propagate feature flags to vlan features. Drop TSO feature because it does not work on VLANs yet. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Link: https://lore.kernel.org/r/20230417192845.590034-1-vinschen@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-18 21:13:13 -07:00
Tiezhu Yang	b5533e990d	tools/loongarch: Use __SIZEOF_LONG__ to define __BITS_PER_LONG Although __SIZEOF_POINTER__ is equal to _SIZEOF_LONG__ on LoongArch, it is better to use __SIZEOF_LONG__ to define __BITS_PER_LONG to keep consistent between arch/loongarch/include/uapi/asm/bitsperlong.h and tools/arch/loongarch/include/uapi/asm/bitsperlong.h. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2023-04-19 12:07:34 +08:00
Enze Li	213ef669d1	LoongArch: Replace hard-coded values in comments with VALEN According to LoongArch documentation [1], CSR.PGDL and CSR.PGDH are concerned with the VA's MSB which is VALEN-1 instead of always being 47. Fix comments to avoid misleading others. [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#page-global-directory-base-address-for-lower-half-address-space Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Enze Li <lienze@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2023-04-19 12:07:27 +08:00
Tiezhu Yang	afca6e0649	LoongArch: Clean up plat_swiotlb_setup() related code After commit `c78c43fe7d` ("LoongArch: Use acpi_arch_dma_setup() and remove ARCH_HAS_PHYS_TO_DMA"), plat_swiotlb_setup() has been deleted, so clean up the related code. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2023-04-19 12:07:27 +08:00
Tiezhu Yang	370a3b8f58	LoongArch: Check unwind_error() in arch_stack_walk() We can see the following messages with CONFIG_PROVE_LOCKING=y on LoongArch: BUG: MAX_STACK_TRACE_ENTRIES too low! turning off the locking correctness validator. This is because stack_trace_save() returns a big value after call arch_stack_walk(), here is the call trace: save_trace() stack_trace_save() arch_stack_walk() stack_trace_consume_entry() arch_stack_walk() should return immediately if unwind_next_frame() failed, no need to do the useless loops to increase the value of c->len in stack_trace_consume_entry(), then we can fix the above problem. Cc: stable@vger.kernel.org Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/all/8a44ad71-68d2-4926-892f-72bfc7a67e2a@roeck-us.net/ Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2023-04-19 12:07:27 +08:00
Qing Zhang	e32b3b8222	LoongArch: Adjust user_regset_copyin parameter to the correct offset Ensure that user_watch_state can be set correctly by the user. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2023-04-19 12:07:27 +08:00
Qing Zhang	ff9f3d7aef	LoongArch: Adjust user_watch_state for explicit alignment This is done in order to easily calculate the number of breakpoints in hw_break_get()/hw_break_set(). Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2023-04-19 12:07:27 +08:00
Hangbin Liu	980f0799a1	bonding: add software tx timestamping support Currently, bonding only obtain the timestamp (ts) information of the active slave, which is available only for modes 1, 5, and 6. For other modes, bonding only has software rx timestamping support. However, some users who use modes such as LACP also want tx timestamp support. To address this issue, let's check the ts information of each slave. If all slaves support tx timestamping, we can enable tx timestamping support for the bond. Add a note that the get_ts_info may be called with RCU, or rtnl or reference on the device in ethtool.h> Suggested-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/20230418034841.2566262-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-18 20:48:59 -07:00
Jakub Kicinski	92e8c732d8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Unbreak br_netfilter physdev match support, from Florian Westphal. 2) Use GFP_KERNEL_ACCOUNT for stateful/policy objects, from Chen Aotian. 3) Use IS_ENABLED() in nf_reset_trace(), from Florian Westphal. 4) Fix validation of catch-all set element. 5) Tighten requirements for catch-all set elements. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: tighten netlink attribute requirements for catch-all elements netfilter: nf_tables: validate catch-all set elements netfilter: nf_tables: fix ifdef to also consider nf_tables=m netfilter: nf_tables: Modify nla_memdup's flag to GFP_KERNEL_ACCOUNT netfilter: br_netfilter: fix recent physdev match breakage ==================== Link: https://lore.kernel.org/r/20230418145048.67270-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-18 20:46:31 -07:00
Palmer Dabbelt	2e75ab3189	Merge patch series "riscv: Use PUD/P4D/PGD pages for the linear mapping" Alexandre Ghiti <alexghiti@rivosinc.com> says: This patchset intends to improve tlb utilization by using hugepages for the linear mapping. As reported by Anup in v6, when STRICT_KERNEL_RWX is enabled, we must take care of isolating the kernel text and rodata so that they are not mapped with a PUD mapping which would then assign wrong permissions to the whole region: it is achieved the same way as arm64 by using the memblock nomap API which isolates those regions and re-merge them afterwards thus avoiding any issue with the system resources tree creation. arch/riscv/include/asm/page.h \| 19 ++++++- arch/riscv/mm/init.c \| 102 ++++++++++++++++++++++++++-------- arch/riscv/mm/physaddr.c \| 16 ++++++ drivers/of/fdt.c \| 11 ++-- 4 files changed, 118 insertions(+), 30 deletions(-) * b4-shazam-merge: riscv: Use PUD/P4D/PGD pages for the linear mapping riscv: Move the linear mapping creation in its own function riscv: Get rid of riscv_pfn_base variable Link: https://lore.kernel.org/r/20230324155421.271544-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 20:43:07 -07:00
Alexandre Ghiti	3335068f87	riscv: Use PUD/P4D/PGD pages for the linear mapping During the early page table creation, we used to set the mapping for PAGE_OFFSET to the kernel load address: but the kernel load address is always offseted by PMD_SIZE which makes it impossible to use PUD/P4D/PGD pages as this physical address is not aligned on PUD/P4D/PGD size (whereas PAGE_OFFSET is). But actually we don't have to establish this mapping (ie set va_pa_offset) that early in the boot process because: - first, setup_vm installs a temporary kernel mapping and among other things, discovers the system memory, - then, setup_vm_final creates the final kernel mapping and takes advantage of the discovered system memory to create the linear mapping. During the first phase, we don't know the start of the system memory and then until the second phase is finished, we can't use the linear mapping at all and phys_to_virt/virt_to_phys translations must not be used because it would result in a different translation from the 'real' one once the final mapping is installed. So here we simply delay the initialization of va_pa_offset to after the system memory discovery. But to make sure noone uses the linear mapping before, we add some guard in the DEBUG_VIRTUAL config. Finally we can use PUD/P4D/PGD hugepages when possible, which will result in a better TLB utilization. Note that: - this does not apply to rv32 as the kernel mapping lies in the linear mapping. - we rely on the firmware to protect itself using PMP. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Acked-by: Rob Herring <robh@kernel.org> # DT bits Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20230324155421.271544-4-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 20:43:04 -07:00
Alexandre Ghiti	8589e346bb	riscv: Move the linear mapping creation in its own function No change intended, it just splits the linear mapping creation from setup_vm_final: this prepares for upcoming additions to the linear mapping creation. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20230324155421.271544-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 20:43:03 -07:00
Alexandre Ghiti	a7407a1318	riscv: Get rid of riscv_pfn_base variable Use directly phys_ram_base instead, riscv_pfn_base is just the pfn of the address contained in phys_ram_base. Even if there is no functional change intended in this patch, actually setting phys_ram_base that early changes the behaviour of kernel_mapping_pa_to_va during the early boot: phys_ram_base used to be zero before this patch and now it is set to the physical start address of the kernel. But it does not break the conversion of a kernel physical address into a virtual address since kernel_mapping_pa_to_va should only be used on kernel physical addresses, i.e. addresses greater than the physical start address of the kernel. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20230324155421.271544-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 20:43:02 -07:00
Conor Dooley	5464912cfa	RISC-V: align ISA extension Kconfig help text with each other Other extensions only capitalise the first letter in the text visible in Kconfig menus, and provide a short comment about the extension's meaning. Do the same for Svnapot & Svpbmt. The precedent for capitalisation in the Kconfig text was set by Zicbom & sorta followed for Zicboz. The RVI styling used for multi-letter extensions only capitalises the first letter, so do the same here. If nothing else, my OCD likes it when the extensions follow a consistent pattern. While editing one of the lines, reformat the "spelling" of 64-bit. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230405-pucker-cogwheel-3a999a94a2f2@wendy Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 20:37:21 -07:00
Song Shuai	8bf7b3b667	riscv: Kconfig: enable SCHED_MC kconfig RISC-V now builds the sched domain based on the simple possible map. Enable SCHED_MC to make the building based on cpu_coregroup_mask() which also takes care of the NUMA and cores with LLC. Signed-off-by: Song Shuai <suagrfillet@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230310110336.970985-1-suagrfillet@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 20:35:43 -07:00
Song Shuai	c4b52d8b6c	riscv: export cpu/freq invariant to scheduler RISC-V now manages CPU topology using arch_topology which provides CPU capacity and frequency related interfaces to access the cpu/freq invariant in possible heterogeneous or DVFS-enabled platforms. Here adds topology.h file to export the arch_topology interfaces for replacing the scheduler's constant-based cpu/freq invariant accounting. Signed-off-by: Song Shuai <suagrfillet@gmail.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Ley Foon Tan <lftan@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230323123924.3032174-1-suagrfillet@gmail.com [Palmer: Fix the whitespace issues.] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 20:29:37 -07:00
Brian King	65a15d6560	scsi: ipr: Remove SATA support Linux SATA support in ipr has always been limited to SATA DVDs. The last systems that had the option of including a SATA DVD was Power 8, which have been withdrawn for some time now, so this support can be removed. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20230412174015.114764-1-brking@linux.vnet.ibm.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-18 23:01:23 -04:00
John Garry	0c028b6a11	scsi: scsi_debug: Abort commands from scsi_debug_device_reset() Currently scsi_debug_device_reset() does not do much apart from setting the SDEBUG_UA_POR ("Power on, reset, or bus device reset") flag, which is eventually passed back to the SCSI midlayer later for a "unit attention" command. There is a report that blktest scsi/007 test fails due to commit `1107c7b24e` ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd"). The problem there is that there are dangling scsi_debug queued commands when we attempt to remove the driver. scsi/007 test triggers SCSI EH and attempts to abort a timed-out command. Function scsi_debug_device_reset() is called as part of the EH, but does not deal with outstanding erroneous command. Prior to the named commit, removing the driver caused all dangling queued commands to be stopped - this should have not been necessary. Fix by aborting outstanding commands on a scsi_device basis from scsi_debug_device_reset(). Fixes: `1107c7b24e` ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd") Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/oe-lkp/202304071111.e762fcbd-yujie.liu@intel.com Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230416175654.159163-1-john.g.garry@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-18 22:58:50 -04:00
Palmer Dabbelt	eb04e72b34	Merge patch series "RISC-V Hardware Probing User Interface" Evan Green <evan@rivosinc.com> says: There's been a bunch of off-list discussions about this, including at Plumbers. The original plan was to do something involving providing an ISA string to userspace, but ISA strings just aren't sufficient for a stable ABI any more: in order to parse an ISA string users need the version of the specifications that the string is written to, the version of each extension (sometimes at a finer granularity than the RISC-V releases/versions encode), and the expected use case for the ISA string (ie, is it a U-mode or M-mode string). That's a lot of complexity to try and keep ABI compatible and it's probably going to continue to grow, as even if there's no more complexity in the specifications we'll have to deal with the various ISA string parsing oddities that end up all over userspace. Instead this patch set takes a very different approach and provides a set of key/value pairs that encode various bits about the system. The big advantage here is that we can clearly define what these mean so we can ensure ABI stability, but it also allows us to encode information that's unlikely to ever appear in an ISA string (see the misaligned access performance, for example). The resulting interface looks a lot like what arm64 and x86 do, and will hopefully fit well into something like ACPI in the future. The actual user interface is a syscall, with a vDSO function in front of it. The vDSO function can answer some queries without a syscall at all, and falls back to the syscall for cases it doesn't have answers to. Currently we prepopulate it with an array of answers for all keys and a CPU set of "all CPUs". This can be adjusted as necessary to provide fast answers to the most common queries. An example series in glibc exposing this syscall and using it in an ifunc selector for memcpy can be found at [1]. I was asked about the performance delta between this and something like sysfs. I created a small test program and ran it on a Nezha D1 Allwinner board. Doing each operation 100000 times and dividing, these operations take the following amount of time: - open()+read()+close() of /sys/kernel/cpu_byteorder: 3.8us - access("/sys/kernel/cpu_byteorder", R_OK): 1.3us - riscv_hwprobe() vDSO and syscall: .0094us - riscv_hwprobe() vDSO with no syscall: 0.0091us These numbers get farther apart if we query multiple keys, as sysfs will scale linearly with the number of keys, where the dedicated syscall stays the same. To frame these numbers, I also did a tight fork/exec/wait loop, which I measured as 4.8ms. So doing 4 open/read/close operations is a delta of about 0.3%, versus a single vDSO call is a delta of essentially zero. [1] https://patchwork.ozlabs.org/project/glibc/list/?series=343050 * b4-shazam-merge: RISC-V: Add hwprobe vDSO function and data selftests: Test the new RISC-V hwprobe interface RISC-V: hwprobe: Support probing of misaligned access performance RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA RISC-V: Add a syscall for HW probing RISC-V: Move struct riscv_cpuinfo to new header Link: https://lore.kernel.org/r/20230407231103.2622178-1-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 19:49:51 -07:00
David Howells	023fc150a3	cifs: Reapply lost fix from commit `30b2b2196d` Reapply the fix from: `30b2b2196d` ("cifs: do not include page data when checking signature") that got lost in the iteratorisation of the cifs driver. Fixes: `d08089f649` ("cifs: Change the I/O paths to use an iterator rather than a page list") Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Reported-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Paulo Alcantara <pc@cjr.nz> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Bharath S M <bharathsm@microsoft.com> cc: Enzo Matsumiya <ematsumiya@suse.de> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2023-04-18 21:26:09 -05:00

... 46 47 48 49 50 ...

1185157 Commits