linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-11 05:17:10 +09:00

Author	SHA1	Message	Date
Gaosheng Cui	151e34d1c6	ring-buffer: Fix kernel-doc warnings in ring_buffer.c Fix kernel-doc warnings: kernel/trace/ring_buffer.c:954: warning: Function parameter or member 'cpu' not described in 'ring_buffer_wake_waiters' kernel/trace/ring_buffer.c:3383: warning: Excess function parameter 'event' description in 'ring_buffer_unlock_commit' kernel/trace/ring_buffer.c:5359: warning: Excess function parameter 'cpu' description in 'ring_buffer_reset_online_cpus' Link: https://lkml.kernel.org/r/20230724140827.1023266-2-cuigaosheng1@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2023-07-28 19:59:03 -04:00
Linus Torvalds	c06f9091a2	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "Several smaller driver fixes and a core RDMA CM regression fix: - Fix improperly accepting flags from userspace in mlx4 - Add missing DMA barriers for irdma - Fix two kcsan warnings in irdma - Report the correct CQ op code to userspace in irdma - Report the correct MW bind error code for irdma - Load the destination address in RDMA CM to resolve a recent regression - Fix a QP regression in mthca - Remove a race processing completions in bnxt_re resulting in a crash - Fix driver unloading races with interrupts and tasklets in bnxt_re - Fix missing error unwind in rxe" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/irdma: Report correct WC error RDMA/irdma: Fix op_type reporting in CQEs RDMA/rxe: Fix an error handling path in rxe_bind_mw() RDMA/bnxt_re: Fix hang during driver unload RDMA/bnxt_re: Prevent handling any completions after qp destroy RDMA/mthca: Fix crash when polling CQ for shared QPs RDMA/core: Update CMA destination address on rdma_resolve_addr RDMA/irdma: Fix data race on CQP request done RDMA/irdma: Fix data race on CQP completion stats RDMA/irdma: Add missing read barriers RDMA/mlx4: Make check for invalid flags stricter	2023-07-28 16:55:56 -07:00
Alexei Starovoitov	eb03993a60	Merge branch 'support-defragmenting-ipv-4-6-packets-in-bpf' Daniel Xu says: ==================== Support defragmenting IPv(4\|6) packets in BPF === Context === In the context of a middlebox, fragmented packets are tricky to handle. The full 5-tuple of a packet is often only available in the first fragment which makes enforcing consistent policy difficult. There are really only two stateless options, neither of which are very nice: 1. Enforce policy on first fragment and accept all subsequent fragments. This works but may let in certain attacks or allow data exfiltration. 2. Enforce policy on first fragment and drop all subsequent fragments. This does not really work b/c some protocols may rely on fragmentation. For example, DNS may rely on oversized UDP packets for large responses. So stateful tracking is the only sane option. RFC 8900 [0] calls this out as well in section 6.3: Middleboxes [...] should process IP fragments in a manner that is consistent with [RFC0791] and [RFC8200]. In many cases, middleboxes must maintain state in order to achieve this goal. === BPF related bits === Policy has traditionally been enforced from XDP/TC hooks. Both hooks run before kernel reassembly facilities. However, with the new BPF_PROG_TYPE_NETFILTER, we can rather easily hook into existing netfilter reassembly infra. The basic idea is we bump a refcnt on the netfilter defrag module and then run the bpf prog after the defrag module runs. This allows bpf progs to transparently see full, reassembled packets. The nice thing about this is that progs don't have to carry around logic to detect fragments. === Changelog === Changes from v5: * Fix defrag disable codepaths Changes from v4: * Refactor module handling code to not sleep in rcu_read_lock() * Also unify the v4 and v6 hook structs so they can share codepaths * Fixed some checkpatch.pl formatting warnings Changes from v3: * Correctly initialize `addrlen` stack var for recvmsg() Changes from v2: * module_put() if ->enable() fails * Fix CI build errors Changes from v1: * Drop bpf_program__attach_netfilter() patches * static -> static const where appropriate * Fix callback assignment order during registration * Only request_module() if callbacks are missing * Fix retval when modprobe fails in userspace * Fix v6 defrag module name (nf_defrag_ipv6_hooks -> nf_defrag_ipv6) * Simplify priority checking code * Add warning if module doesn't assign callbacks in the future * Take refcnt on module while defrag link is active [0]: https://datatracker.ietf.org/doc/html/rfc8900 ==================== Link: https://lore.kernel.org/r/cover.1689970773.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-28 16:52:09 -07:00
Daniel Xu	c313eae739	bpf: selftests: Add defrag selftests These selftests tests 2 major scenarios: the BPF based defragmentation can successfully be done and that packet pointers are invalidated after calls to the kfunc. The logic is similar for both ipv4 and ipv6. In the first scenario, we create a UDP client and UDP echo server. The the server side is fairly straightforward: we attach the prog and simply echo back the message. The on the client side, we send fragmented packets to and expect the reassembled message back from the server. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/33e40fdfddf43be93f2cb259303f132f46750953.1689970773.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-28 16:52:08 -07:00
Daniel Xu	e15a220956	bpf: selftests: Support custom type and proto for client sockets Extend connect_to_fd_opts() to take optional type and protocol parameters for the client socket. These parameters are useful when opening a raw socket to send IP fragments. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/9067db539efdfd608aa86a2b143c521337c111fc.1689970773.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-28 16:52:08 -07:00
Daniel Xu	3495e89cdc	bpf: selftests: Support not connecting client socket For connectionless protocols or raw sockets we do not want to actually connect() to the server. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/525c13d66dac2d640a1db922546842c051c6f2e6.1689970773.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-28 16:52:08 -07:00
Daniel Xu	91721c2d02	netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link This commit adds support for enabling IP defrag using pre-existing netfilter defrag support. Basically all the flag does is bump a refcnt while the link the active. Checks are also added to ensure the prog requesting defrag support is run _after_ netfilter defrag hooks. We also take care to avoid any issues w.r.t. module unloading -- while defrag is active on a link, the module is prevented from unloading. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/5cff26f97e55161b7d56b09ddcf5f8888a5add1d.1689970773.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-28 16:52:08 -07:00
Daniel Xu	9abddac583	netfilter: defrag: Add glue hooks for enabling/disabling defrag We want to be able to enable/disable IP packet defrag from core bpf/netfilter code. In other words, execute code from core that could possibly be built as a module. To help avoid symbol resolution errors, use glue hooks that the modules will register callbacks with during module init. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/f6a8824052441b72afe5285acedbd634bd3384c1.1689970773.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-28 16:52:08 -07:00
Yonghong Song	ee932bf940	docs/bpf: Improve documentation for cpu=v4 instructions Improve documentation for cpu=v4 instructions based on David's suggestions. Cc: bpf@ietf.org Suggested-by: David Vernet <void@manifault.com> Acked-by: David Vernet <void@manifault.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728225105.919595-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-28 16:50:06 -07:00
Linus Torvalds	2b17e90d3f	Merge tag 'tpmdd-v6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "I picked up three small scale updates that I think would improve the quality of the release" * tag 'tpmdd-v6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm_tis: Explicitly check for error code tpm: Switch i2c drivers back to use .probe() security: keys: perform capable check only on privileged operations	2023-07-28 16:44:32 -07:00
Zheng Yejian	2d093282b0	ring-buffer: Fix wrong stat of cpu_buffer->read When pages are removed in rb_remove_pages(), 'cpu_buffer->read' is set to 0 in order to make sure any read iterators reset themselves. However, this will mess 'entries' stating, see following steps: # cd /sys/kernel/tracing/ # 1. Enlarge ring buffer prepare for later reducing: # echo 20 > per_cpu/cpu0/buffer_size_kb # 2. Write a log into ring buffer of cpu0: # taskset -c 0 echo "hello1" > trace_marker # 3. Read the log: # cat per_cpu/cpu0/trace_pipe <...>-332 [000] ..... 62.406844: tracing_mark_write: hello1 # 4. Stop reading and see the stats, now 0 entries, and 1 event readed: # cat per_cpu/cpu0/stats entries: 0 [...] read events: 1 # 5. Reduce the ring buffer # echo 7 > per_cpu/cpu0/buffer_size_kb # 6. Now entries became unexpected 1 because actually no entries!!! # cat per_cpu/cpu0/stats entries: 1 [...] read events: 0 To fix it, introduce 'page_removed' field to count total removed pages since last reset, then use it to let read iterators reset themselves instead of changing the 'read' pointer. Link: https://lore.kernel.org/linux-trace-kernel/20230724054040.3489499-1-zhengyejian1@huawei.com Cc: <mhiramat@kernel.org> Cc: <vnagarnaik@google.com> Fixes: `83f40318da` ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2023-07-28 19:16:23 -04:00
Colin Ian King	3bdd85e2e3	net: ethernet: slicoss: remove redundant increment of pointer data The pointer data is being incremented but this change to the pointer is not used afterwards. The increment is redundant and can be removed. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Link: https://lore.kernel.org/r/20230726164522.369206-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 15:37:08 -07:00
Jakub Kicinski	05191d8896	Merge branch 'in-kernel-support-for-the-tls-alert-protocol' Chuck Lever says: ==================== In-kernel support for the TLS Alert protocol IMO the kernel doesn't need user space (ie, tlshd) to handle the TLS Alert protocol. Instead, a set of small helper functions can be used to handle sending and receiving TLS Alerts for in-kernel TLS consumers. ==================== Merged on top of a tag in case it's needed in the NFS tree. Link: https://lore.kernel.org/r/169047923706.5241.1181144206068116926.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:08:02 -07:00
Chuck Lever	b470985c76	net/handshake: Trace events for TLS Alert helpers Add observability for the new TLS Alert infrastructure. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/169047947409.5241.14548832149596892717.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:07:59 -07:00
Chuck Lever	39067dda1d	SUNRPC: Use new helpers to handle TLS Alerts Use the helpers to parse the level and description fields in incoming alerts. "Warning" alerts are discarded, and "fatal" alerts mean the session is no longer valid. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/169047944747.5241.1974889594004407123.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:07:59 -07:00
Chuck Lever	39d0e38dcc	net/handshake: Add helpers for parsing incoming TLS Alerts Kernel TLS consumers can replace common TLS Alert parsing code with these helpers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/169047942074.5241.13791647439480672048.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:07:59 -07:00
Chuck Lever	5dd5ad682c	SUNRPC: Send TLS Closure alerts before closing a TCP socket Before closing a TCP connection, the TLS protocol wants peers to send session close Alert notifications. Add those in both the RPC client and server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/169047939404.5241.14392506226409865832.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:07:59 -07:00
Chuck Lever	35b1b538d4	net/handshake: Add API for sending TLS Closure alerts This helper sends an alert only if a TLS session was established. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/169047936730.5241.618595693821012638.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:07:59 -07:00
Chuck Lever	0257427146	net/tls: Add TLS Alert definitions I'm about to add support for kernel handshake API consumers to send TLS Alerts, so introduce the needed protocol definitions in the new header tls_prot.h. This presages support for Closure alerts. Also, support for alerts is a pre-requite for handling session re-keying, where one peer will signal the need for a re-key by sending a TLS Alert. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/169047934064.5241.8377890858495063518.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:07:59 -07:00
Chuck Lever	6a7eccef47	net/tls: Move TLS protocol elements to a separate header Kernel TLS consumers will need definitions of various parts of the TLS protocol, but often do not need the function declarations and other infrastructure provided in <net/tls.h>. Break out existing standardized protocol elements into a separate header, and make room for a few more elements in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/169047931374.5241.7713175865185969309.stgit@oracle-102.nfsv4bat.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:07:59 -07:00
Suman Ghosh	222a6c42e9	octeontx2-af: Initialize 'cntr_val' to fix uninitialized symbol error drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c:860 otx2_tc_update_mcam_table_del_req() error: uninitialized symbol 'cntr_val'. Fixes: `ec87f05402` ("octeontx2-af: Install TC filter rules in hardware based on priority") Signed-off-by: Suman Ghosh <sumang@marvell.com> Link: https://lore.kernel.org/r/20230727163101.2793453-1-sumang@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 14:02:11 -07:00
Jakub Kicinski	a4989bee92	Merge branch 'eth-bnxt-fix-a-couple-of-w-1-c-1-warnings' Jakub Kicinski says: ==================== eth: bnxt: fix a couple of W=1 C=1 warnings Fix a couple of build warnings. ==================== Link: https://lore.kernel.org/r/20230727190726.1859515-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:47:36 -07:00
Jakub Kicinski	9f49db62f5	eth: bnxt: fix warning for define in struct_group Fix C=1 warning with sparse 0.6.4: drivers/net/ethernet/broadcom/bnxt/bnxt.c: note: in included file: drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h:30:1: warning: directive in macro's argument list Don't put defines in a struct_group(). Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230727190726.1859515-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:47:34 -07:00
Jakub Kicinski	833c4a8105	eth: bnxt: fix one of the W=1 warnings about fortified memcpy() Fix a W=1 warning with gcc 13.1: In function ‘fortify_memcpy_chk’, inlined from ‘bnxt_hwrm_queue_cos2bw_cfg’ at drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c:133:3: include/linux/fortify-string.h:592:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 592 \| __read_overflow2_field(q_size_field, size); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The field group is already defined and starts at queue_id: struct bnxt_cos2bw_cfg { u8 pad[3]; struct_group_attr(cfg, __packed, u8 queue_id; __le32 min_bw; Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230727190726.1859515-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:47:34 -07:00
Jakub Kicinski	b10d10a7c1	Merge tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-07-24 1) Generalize devcom implementation to be independent of number of ports or device's GUID. 2) Save memory on command interface statistics. 3) General code cleanups * tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Give esw_offloads_load/unload_rep() "mlx5_" prefix net/mlx5: Make mlx5_eswitch_load/unload_vport() static net/mlx5: Make mlx5_esw_offloads_rep_load/unload() static net/mlx5: Remove pointless devlink_rate checks net/mlx5: Don't check vport->enabled in port ops net/mlx5e: Make flow classification filters static net/mlx5e: Remove duplicate code for user flow net/mlx5: Allocate command stats with xarray net/mlx5: split mlx5_cmd_init() to probe and reload routines net/mlx5: Remove redundant cmdif revision check net/mlx5: Re-organize mlx5_cmd struct net/mlx5e: E-Switch, Allow devcom initialization on more vports net/mlx5e: E-Switch, Register devcom device with switch id key net/mlx5: Devcom, Infrastructure changes net/mlx5: Use shared code for checking lag is supported ==================== Link: https://lore.kernel.org/r/20230727183914.69229-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:41:59 -07:00
Jakub Kicinski	97d0dca794	Merge branch 'mlxsw-avoid-non-tracker-helpers-when-holding-and-putting-netdevices' Petr Machata says: ==================== mlxsw: Avoid non-tracker helpers when holding and putting netdevices Using the tracking helpers, netdev_hold() and netdev_put(), makes it easier to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled. For example, the following traceback shows the callpath to the point of an outstanding hold that was never put: unregister_netdevice: waiting for swp3 to become free. Usage count = 6 ref_tracker: eth%d@ffff888123c9a580 has 1/5 users at mlxsw_sp_switchdev_event+0x6bd/0xcc0 [mlxsw_spectrum] notifier_call_chain+0xbf/0x3b0 atomic_notifier_call_chain+0x78/0x200 br_switchdev_fdb_notify+0x25f/0x2c0 [bridge] fdb_notify+0x16a/0x1a0 [bridge] [...] In this patchset, get rid of all non-ref-tracking helpers in mlxsw. - Patch #1 drops two functions that are not used anymore, but contain dev_hold() / dev_put() calls. - Patch #2 avoids taking a reference in one function which is called under RTNL. - The remaining patches convert individual hold/put sites one by one from trackerless to tracker-enabled. Suggested-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/netdev/4c056da27c19d95ffeaba5acf1427ecadfc3f94c.camel@redhat.com/ ==================== Link: https://lore.kernel.org/r/cover.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:48 -07:00
Petr Machata	cb21162041	mlxsw: spectrum_router: IPv6 events: Use tracker helpers to hold & put netdevices Using the tracking helpers makes it easier to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled. Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the router code that deals with IPv6 address events. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/f0af6ad4722b4ca6e598fd4fda8311a3041651ec.1690471775.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:46 -07:00
Petr Machata	d0e0e88012	mlxsw: spectrum_router: RIF: Use tracker helpers to hold & put netdevices Using the tracking helpers makes it easier to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled. Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the router code that deals with RIF allocation. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/8b7701a7b439ac268e4be4040eff99d01e27ae47.1690471775.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:45 -07:00
Petr Machata	b17b2d57b7	mlxsw: spectrum_router: hw_stats: Use tracker helpers to hold & put netdevices Using the tracking helpers makes it easier to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled. Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the router code that deals with hw_stats events. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/b972314cfef4f4c24e66e60d13cffa5d606d1bf3.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:45 -07:00
Petr Machata	deeaa3716f	mlxsw: spectrum_router: FIB: Use tracker helpers to hold & put netdevices Using the tracking helpers makes it easier to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled. Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the router code that deals with FIB events. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/5221a92e751c40447c55959f622267ccc999ed04.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:45 -07:00
Petr Machata	1ae489ab43	mlxsw: spectrum_switchdev: Use tracker helpers to hold & put netdevices Using the tracking helpers makes it easier to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled. Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the switchdev module. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/774c3d7b5b0231f1435df2ec9dd660192e382756.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:45 -07:00
Petr Machata	16f8c846cd	mlxsw: spectrum_nve: Do not take reference when looking up netdevice mlxsw_sp_nve_fid_disable() is always called under RTNL. It is therefore safe to call __dev_get_by_index() to get the netdevice pointer without bumping the reference count, because we can be sure the netdevice is not going away. That then obviates the need to put the netdevice later in the function. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/341d1046f89d8d839d9d00e4a3d58cdc351e9397.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:45 -07:00
Petr Machata	569f98b36b	mlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put() As of commit `151b89f602` ("mlxsw: spectrum_router: Reuse work neighbor initialization in work scheduler"), the functions mlxsw_sp_port_lower_dev_hold() and mlxsw_sp_port_dev_put() have no users. Drop them. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/d0adcd7cb4ea19416294a0f861100edba84c9f36.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:38:45 -07:00
Patrick Rohr	5027d54a9c	net: change accept_ra_min_rtr_lft to affect all RA lifetimes accept_ra_min_rtr_lft only considered the lifetime of the default route and discarded entire RAs accordingly. This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and applies the value to individual RA sections; in particular, router lifetime, PIO preferred lifetime, and RIO lifetime. If any of those lifetimes are lower than the configured value, the specific RA section is ignored. In order for the sysctl to be useful to Android, it should really apply to all lifetimes in the RA, since that is what determines the minimum frequency at which RAs must be processed by the kernel. Android uses hardware offloads to drop RAs for a fraction of the minimum of all lifetimes present in the RA (some networks have very frequent RAs (5s) with high lifetimes (2h)). Despite this, we have encountered networks that set the router lifetime to 30s which results in very frequent CPU wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the WiFi firmware) entirely on such networks, it seems better to ignore the misconfigured routers while still processing RAs from other IPv6 routers on the same network (i.e. to support IoT applications). The previous implementation dropped the entire RA based on router lifetime. This turned out to be hard to expand to the other lifetimes present in the RA in a consistent manner; dropping the entire RA based on RIO/PIO lifetimes would essentially require parsing the whole thing twice. Fixes: `1671bcfd76` ("net: add sysctl accept_ra_min_rtr_lft") Cc: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Patrick Rohr <prohr@google.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 13:30:51 -07:00
Davidlohr Bueso	ad64f5952c	cxl/memdev: Only show sanitize sysfs files when supported If the device does not support Sanitize or Secure Erase commands, hide the respective sysfs interfaces such that the operation can never be attempted. In order to be generic, keep track of the enabled security commands found in the CEL - the driver does not support Security Passthrough. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230726051940.3570-4-dave@stgolabs.net Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>	2023-07-28 13:16:54 -06:00
Davidlohr Bueso	3de8cd2242	cxl/memdev: Document security state in kern-doc ... as is the case with all members of struct cxl_memdev_state. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230726051940.3570-3-dave@stgolabs.net Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>	2023-07-28 13:16:54 -06:00
Davidlohr Bueso	0fcde5989e	cxl/memdev: Improve sanitize ABI descriptions Be more detailed about the CPU cache management situation. The same goes for both sanitize and secure erase. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230726051940.3570-2-dave@stgolabs.net Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>	2023-07-28 13:16:54 -06:00
Jakub Kicinski	5bdc312c1d	Merge branch 'net-store-netdevs-in-an-xarray' Jakub Kicinski says: ==================== net: store netdevs in an xarray One of more annoying developer experience gaps we have in netlink is iterating over netdevs. It's painful. Add an xarray to make it trivial. v1: https://lore.kernel.org/all/20230722014237.4078962-1-kuba@kernel.org/ ==================== Link: https://lore.kernel.org/r/20230726185530.2247698-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 11:36:00 -07:00
Jakub Kicinski	84e00d9bd4	net: convert some netlink netdev iterators to depend on the xarray Reap the benefits of easier iteration thanks to the xarray. Convert just the genetlink ones, those are easier to test. Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230726185530.2247698-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 11:35:58 -07:00
Jakub Kicinski	759ab1edb5	net: store netdevs in an xarray Iterating over the netdev hash table for netlink dumps is hard. Dumps are done in "chunks" so we need to save the position after each chunk, so we know where to restart from. Because netdevs are stored in a hash table we remember which bucket we were in and how many devices we dumped. Since we don't hold any locks across the "chunks" - devices may come and go while we're dumping. If that happens we may miss a device (if device is deleted from the bucket we were in). We indicate to user space that this may have happened by setting NLM_F_DUMP_INTR. User space is supposed to dump again (I think) if it sees that. Somehow I doubt most user space gets this right.. To illustrate let's look at an example: System state: start: # [A, B, C] del: B # [A, C] with the hash table we may dump [A, B], missing C completely even tho it existed both before and after the "del B". Add an xarray and use it to allocate ifindexes. This way we can iterate ifindexes in order, without the worry that we'll skip one. We may still generate a dump of a state which "never existed", for example for a set of values and sequence of ops: System state: start: # [A, B] add: C # [A, C, B] del: B # [A, C] we may generate a dump of [A], if C got an index between A and B. System has never been in such state. But I'm 90% sure that's perfectly fine, important part is that we can't _miss_ devices which exist before and after. User space which wants to mirror kernel's state subscribes to notifications and does periodic dumps so it will know that C exists from the notification about its creation or from the next dump (next dump is _guaranteed_ to include C, if it doesn't get removed). To avoid any perf regressions keep the hash table for now. Most net namespaces have very few devices and microbenchmarking 1M lookups on Skylake I get the following results (not counting loopback to number of devs): #devs \| hash \| xa \| delta 2 \| 18.3 \| 20.1 \| + 9.8% 16 \| 18.3 \| 20.1 \| + 9.5% 64 \| 18.3 \| 26.3 \| +43.8% 128 \| 20.4 \| 26.3 \| +28.6% 256 \| 20.0 \| 26.4 \| +32.1% 1024 \| 26.6 \| 26.7 \| + 0.2% 8192 \|541.3 \| 33.5 \| -93.8% No surprises since the hash table has 256 entries. The microbenchmark scans indexes in order, if the pattern is more random xa starts to win at 512 devices already. But that's a lot of devices, in practice. Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230726185530.2247698-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-28 11:35:58 -07:00
Georg Müller	98ce8e4a9d	perf test uprobe_from_different_cu: Skip if there is no gcc Without gcc, the test will fail. On cleanup, ignore probe removal errors. Otherwise, in case of an error adding the probe, the temporary directory is not removed. Fixes: `56cbeacf14` ("perf probe: Add test for regression introduced by switch to die_get_decl_file()") Signed-off-by: Georg Müller <georgmueller@gmx.net> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Georg Müller <georgmueller@gmx.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230728151812.454806-2-georgmueller@gmx.net Link: https://lore.kernel.org/r/CAP-5=fUP6UuLgRty3t2=fQsQi3k4hDMz415vWdp1x88QMvZ8ug@mail.gmail.com/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-07-28 15:31:21 -03:00
Linus Torvalds	f837f0a3c9	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - A couple of SME updates for recent fixes (one of which went to stable): reverting the flushing of the SME hardware state along with the thread flushing and making sure we have the correct vector length before reallocating. - An ACPI/IORT fix to avoid skipping ID mappings whose "number of IDs" is 0 (the spec reports the number of IDs in the mapping range minus 1). * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ACPI/IORT: Remove erroneous id_count check in iort_node_get_rmr_info() arm64/sme: Set new vector length before reallocating arm64/fpsimd: Don't flush SME register hardware state along with thread	2023-07-28 11:21:57 -07:00
Linus Torvalds	81eef8909d	Merge tag 'for-linus-6.5a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A fix for a performance problem in QubesOS, adding a way to drain the queue of grants experiencing delayed unmaps faster - A patch enabling the use of static event channels from user mode, which was omitted when introducing supporting static event channels - A fix for a problem where Xen related code didn't check properly for running in a Xen environment, resulting in a WARN splat * tag 'for-linus-6.5a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: speed up grant-table reclaim xen/evtchn: Introduce new IOCTL to bind static evtchn xenbus: check xen_domain in xenbus_probe_initcall	2023-07-28 11:17:30 -07:00
Alexander Steffen	513253f8c2	tpm_tis: Explicitly check for error code recv_data either returns the number of received bytes, or a negative value representing an error code. Adding the return value directly to the total number of received bytes therefore looks a little weird, since it might add a negative error code to a sum of bytes. The following check for size < expected usually makes the function return ETIME in that case, so it does not cause too many problems in practice. But to make the code look cleaner and because the caller might still be interested in the original error code, explicitly check for the presence of an error code and pass that through. Cc: stable@vger.kernel.org Fixes: `cb5354253a` ("[PATCH] tpm: spacing cleanups 2") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2023-07-28 18:13:39 +00:00
Uwe Kleine-König	be6f48a7c8	tpm: Switch i2c drivers back to use .probe() After commit `b8a1a4cd5a` ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then `03c835f498` ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2023-07-28 18:12:40 +00:00
Christian Göttsche	2d7f105edb	security: keys: perform capable check only on privileged operations If the current task fails the check for the queried capability via `capable(CAP_SYS_ADMIN)` LSMs like SELinux generate a denial message. Issuing such denial messages unnecessarily can lead to a policy author granting more privileges to a subject than needed to silence them. Reorder CAP_SYS_ADMIN checks after the check whether the operation is actually privileged. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2023-07-28 18:07:41 +00:00
Linus Torvalds	e62e26d3e9	Merge tag 'ceph-for-6.5-rc4' of https://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "A patch to reduce the potential for erroneous RBD exclusive lock blocklisting (fencing) with a couple of prerequisites and a fixup to prevent metrics from being sent to the MDS even just once after that has been disabled by the user. All marked for stable" * tag 'ceph-for-6.5-rc4' of https://github.com/ceph/ceph-client: rbd: retrieve and check lock owner twice before blocklisting rbd: harden get_lock_owner_info() a bit rbd: make get_lock_owner_info() return a single locker or NULL ceph: never send metrics if disable_send_metrics is set	2023-07-28 10:47:24 -07:00
Linus Torvalds	28d79b746c	Merge tag '9p-fixes-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p fixes from Eric Van Hensbergen: "Misc set of fixes for 9p. Most of these clean up warnings we've gotten out of compilation tools, but several of them were from inspection while hunting down a couple of regressions. The most important one is `75b396821c` ("fs/9p: remove unnecessary and overrestrictive check") which caused a regression for some folks by restricting mmap in any case where writeback caches weren't enabled. Most of the other bugs caught via inspection were type mismatches" * tag '9p-fixes-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: Remove unused extern declaration 9p: remove dead stores (variable set again without being read) 9p: virtio: skip incrementing unused variable 9p: virtio: make sure 'offs' is initialized in zc_request 9p: virtio: fix unlikely null pointer deref in handle_rerror 9p: fix ignored return value in v9fs_dir_release fs/9p: remove unnecessary invalidate_inode_pages2 fs/9p: fix type mismatch in file cache mode helper fs/9p: fix typo in comparison logic for cache mode fs/9p: remove unnecessary and overrestrictive check fs/9p: Fix a datatype used with V9FS_DIRECT_IO	2023-07-28 10:43:16 -07:00
Linus Torvalds	818680d154	Merge tag 'block-6.5-2023-07-28' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: "A few fixes that should go into the current kernel release, mainly: - Set of fixes for dasd (Stefan) - Handle interruptible waits returning because of a signal for ublk (Ming)" * tag 'block-6.5-2023-07-28' of git://git.kernel.dk/linux: ublk: return -EINTR if breaking from waiting for existed users in DEL_DEV ublk: fail to recover device if queue setup is interrupted ublk: fail to start device if queue setup is interrupted block: Fix a source code comment in include/uapi/linux/blkzoned.h s390/dasd: print copy pair message only for the correct error s390/dasd: fix hanging device after request requeue s390/dasd: use correct number of retries for ERP requests s390/dasd: fix hanging device after quiesce/resume	2023-07-28 10:23:41 -07:00
Linus Torvalds	9c65505826	Merge tag 'io_uring-6.5-2023-07-28' of git://git.kernel.dk/linux Pull io_uring fix from Jens Axboe: "Just a single tweak to a patch from last week, to avoid having idle cqring waits be attributed as iowait" * tag 'io_uring-6.5-2023-07-28' of git://git.kernel.dk/linux: io_uring: gate iowait schedule on having pending requests	2023-07-28 10:19:44 -07:00

... 18 19 20 21 22 ...

1202612 Commits