linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
Christophe JAILLET	978015f7ef	net/mlx5e: Remove a useless function call 'handle' is known to be NULL here. There is no need to kfree() it. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:53 -07:00
Shay Drory	e71383fb9c	net/mlx5: Light probe local SFs If a user wants to configure the SFs, for example to use only vdpa functionality, they need to fully probe an SF, configure what they want, and afterward reload the SF. In order to save the time of the reload, local SFs will probe without any auxiliary sub-device, so that the SFs can be configured prior to their full probe. The defaults of the enable_* devlink params of these SFs are set to false. Usage example: Create SF: $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 $ devlink port function set pci/0000:08:00.0/32768 \ hw_addr 00:00:00:00:00:11 state active Enable ETH auxiliary device: $ devlink dev param set auxiliary/mlx5_core.sf.1 \ name enable_eth value true cmode driverinit Now, in order to fully probe the SF, use devlink reload: $ devlink dev reload auxiliary/mlx5_core.sf.1 At this point the user has an SF devlink instance with an auxiliary device for the Ethernet functionality only. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:53 -07:00
Shay Drory	3f90840305	net/mlx5: Move esw multiport devlink param to eswitch code Move the param registration and handling code into the eswitch code as they are related to each other. No point in having the devlink param registration done in separate file. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:53 -07:00
Shay Drory	2059cf51f3	net/mlx5: Split function_setup() to enable and open functions mlx5_cmd_init_hca() takes ~0.2 seconds. A user who wants to disable some of the SF aux devices at large scale (1K SFs, for example) will waste more than 3 minutes on mlx5_cmd_init_hca(), which isn't needed at this stage. A downstream patch will change SFs which are probed over the E-switch, local SFs, to be probed without any aux dev. In order to support this, split function_setup() to avoid executing mlx5_cmd_init_hca(). Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:52 -07:00
Daniel Jurgens	7057fe5619	net/mlx5: Set max number of embedded CPU VFs Set the maximum number of embedded cpu VF functions available. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:52 -07:00
Daniel Jurgens	6d98f314bf	net/mlx5: Update SRIOV enable/disable to handle EC/VFs Previously on the embedded CPU platform SRIOV was never enabled/disabled via mlx5_core_sriov_configure. Host VF updates are provided by an event handler. Now in the disable flow it must be known if this is a disable due to driver unload or SRIOV detach, or if the user updated the number of VFs. If due to change in the number of VFs only wait for the pages of ECVFs. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:52 -07:00
Daniel Jurgens	42a84a4309	net/mlx5: Query correct caps for min msix vectors The VFs on the host and the embedded CPU platform share function numbers. Set the ec_vf_function field to query the caps for the correct function. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:52 -07:00
Daniel Jurgens	2ee3db806e	net/mlx5: Use correct vport when restoring GUIDs Prior to enabling EC VF functionality the vport number and function ID were always the same. That's not the case now. Use the correct vport number to modify the HCA vport context. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:52 -07:00
Daniel Jurgens	395ccd6eb4	net/mlx5: Add new page type for EC VF pages When the embedded cpu supports SRIOV it can be enabled and disabled independently from the host SRIOV. Track the pages separately so we can properly wait for returned VF pages. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:51 -07:00
Daniel Jurgens	fa3c73eee6	net/mlx5: Add/remove peer miss rules for EC VFs Add and remove the peer miss rules for EC VFs. It's possible that there are different amounts of total VFs per function so only create rules for the minimum number of max VFs. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:51 -07:00
Daniel Jurgens	a7719b29a8	net/mlx5: Add management of EC VF vports Add init, load, unload, and cleanup of the EC VF vports. This includes changes in how eswitch SRIOV is managed. Previously, on an embedded CPU platform, the number of VFs provided when enabling the eswitch was always 0; host VF vports are handled in the eswitch functions change event handler. Now track the number of EC VFs as well, so they can be handled properly in the enable/disable flows. There are only 3 marks available for use in xarrays, and all 3 were already in use for this use case. EC VF vports are in a known range so we can access them by index instead of marks. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:51 -07:00
Daniel Jurgens	9ac0b12824	net/mlx5: Update vport caps query/set for EC VFs These functions are for query/set by vport, there was an underlying assumption that vport was equal to function ID. That's not the case for EC VF functions. Set the ec_vf_function bit accordingly. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:51 -07:00
Daniel Jurgens	dc13180824	net/mlx5: Enable devlink port for embedded cpu VF vports Enable creation of a devlink port for EC VF vports. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:51 -07:00
Daniel Jurgens	93b36d0f28	net/mlx5: mlx5_ifc updates for embedded CPU SRIOV Add ec_vf_vport_base to HCA Capabilities 2. This indicates the base vport of embedded CPU virtual functions that are connected to the eswitch. Add ec_vf_function to query/set_hca_caps. If set this indicates accessing a virtual function on the embedded CPU by function ID. This should only be used with other_function set to 1. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:50 -07:00
Daniel Jurgens	18a92b0542	net/mlx5: Simplify unload all rep code Instead of using type specific iterators which are only used in one place just traverse the xarray. It will provide suitable ordering based on the vport numbers. This will also eliminate the need for changes here when new types are added. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-06-09 18:40:50 -07:00
Jakub Kicinski	ded5c1a16e	Merge branch 'tools-ynl-gen-code-gen-improvements-before-ethtool' Jakub Kicinski says: ==================== tools: ynl-gen: code gen improvements before ethtool I was going to post ethtool but I couldn't stand the ugliness of the if conditions which were previously generated. So I cleaned that up and improved a number of other things ethtool will benefit from. ==================== Link: https://lore.kernel.org/r/20230608211200.1247213-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:33 -07:00
Jakub Kicinski	76abff37f0	tools: ynl-gen: support / skip pads on the way to kernel Kernel does not have padding requirements for 64b attrs. We can ignore pad attrs. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	6f96ec73cb	tools: ynl-gen: don't pass op_name to RenderInfo The op_name argument is barely used and identical to op.name in all cases. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	6da3424fd6	tools: ynl-gen: support code gen for events Netlink specs support both events and notifications (former can define their own message contents). Plug in missing code to generate types, parsers and include events into notification tables. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	ced1568862	tools: ynl-gen: sanitize notification tracking Don't modify the raw dicts (as loaded from YAML) to pretend that the notify attributes also exist on the ops. This makes the code easier to follow. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	d0915d64c3	tools: ynl: regen: stop generating common notification handlers Remove unused notification handlers. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	f2ba1e5e22	tools: ynl-gen: stop generating common notification handlers Common notification handler was supposed to be a way for the user to parse the notifications from a socket synchronously. I don't think we'll end up using it, ynl_ntf_check() works for all known use cases. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	7234415b8f	tools: ynl: regen: regenerate the if ladders Regenerate the code to combine } and else and use a tmp variable to store the type. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	e4ea3cc684	tools: ynl-gen: get attr type outside of if() Reading attr type with mnl_attr_get_type() for each condition leads to most conditions being longer than 80 chars. Avoid this by reading the type to a variable on the stack. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	2c0f146686	tools: ynl-gen: combine else with closing bracket Code gen currently prints: } else if (... This is really ugly. Fix it by delaying printing of closing brackets in anticipation of else coming along. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	820343ccbb	tools: ynl-gen: complete the C keyword list C keywords need to be avoided when naming things. Complete the list (ethtool has at least one thing called "auto"). Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	9b52fd4b63	tools: ynl: regen: cleanup user space header includes Remove unnecessary includes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	30b5c720e1	tools: ynl-gen: cleanup user space header includes Bots started screaming that we're including stdlib.h twice. While at it move string.h into a common spot and drop stdio.h which we don't need. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5464 Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5466 Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5467 Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 14:40:31 -07:00
Jakub Kicinski	7ec5d48fdb	Revert "tools: ynl: Remove duplicated include in handshake-user.c" This reverts commit `e7c5433c5a`. Commit `e7c5433c5a` ("tools: ynl: Remove duplicated include in handshake-user.c") was applied too hastily. It changes an auto-generated file, and there's already a proper fix on the list. Link: https://lore.kernel.org/all/ZIMPLYi%2FxRih+DlC@nanopsycho/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-09 11:01:04 -07:00
Yang Li	e7c5433c5a	tools: ynl: Remove duplicated include in handshake-user.c ./tools/net/ynl/generated/handshake-user.c: stdlib.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5464 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 11:36:41 +01:00
David S. Miller	56f7783ba4	Merge branch 'broadcom-phy-led-brightness' Florian Fainelli says: ==================== LED brightness support for Broadcom PHYs This patch series adds support for controlling the LED brightness on Broadcom PHYs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:38:44 +01:00
Florian Fainelli	bd5736e146	net: phy: broadcom: Add support for setting LED brightness Broadcom PHYs have two LEDs selector registers which allow us to control the LED assignment, including how to turn them on/off. Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:38:43 +01:00
Florian Fainelli	57fd7d59b1	net: phy: broadcom: Rename LED registers These registers are common to most PHYs and are not specific to the BCM5482, renamed the constants accordingly, no functional change. Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:38:43 +01:00
David S. Miller	54a8c43f3b	Merge branch 'net-ncsi-refactoring-for-GMA-cmd' Ivan Mikhaylov says: ==================== net/ncsi: refactoring for GMA command Make one GMA function for all manufacturers, change ndo_set_mac_address to dev_set_mac_address for notifying the net layer about a MAC change, which ndo_set_mac_address doesn't do. Changes from v1: 1. delete ftgmac100.txt changes about mac-address-increment 2. add convert to yaml from ftgmac100.txt 3. add mac-address-increment option for ethernet-controller.yaml Changes from v2: 1. remove DT changes from series, will be done in another one ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:32:51 +01:00
Ivan Mikhaylov	790071347a	net/ncsi: change from ndo_set_mac_address to dev_set_mac_address Change ndo_set_mac_address to dev_set_mac_address because dev_set_mac_address provides a way to notify the network layer about a MAC change. Otherwise, services may not be aware of the MAC change and keep using the old one, which was set by the network adapter driver. For example, the DHCP client from systemd does not update the MAC address without a notification from the net subsystem, which leads to problems acquiring the right address from the DHCP server. Fixes: `cb10c7c0df` ("net/ncsi: Add NCSI Broadcom OEM command") Cc: stable@vger.kernel.org # v6.0+ `2f38e84` net/ncsi: make one oem_gma function for all mfr id Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:32:51 +01:00
Ivan Mikhaylov	74b449b98d	net/ncsi: make one oem_gma function for all mfr id Make the one Get Mac Address function for all manufacturers and change this call in handlers accordingly. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:32:51 +01:00
Foster Snowhill	0c6e9d32ef	usbnet: ipheth: update Kconfig description This module has for a long time not been limited to iPhone <= 3GS. Update description to match the actual state of the driver. Remove dead link from 2010, instead reference an existing userspace iOS device pairing implementation as part of libimobiledevice. Signed-off-by: Foster Snowhill <forst@pen.gy> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:26:57 +01:00
Foster Snowhill	a2d274c62e	usbnet: ipheth: add CDC NCM support Recent iOS releases support CDC NCM encapsulation on RX. This mode is the default on macOS and Windows. In this mode, an iOS device may include one or more Ethernet frames inside a single URB. Freshly booted iOS devices start in legacy mode, but are put into NCM mode by the official Apple driver. When reconnecting such a device from a macOS/Windows machine to a Linux host, the device stays in NCM mode, making it unusable with the legacy ipheth driver code. To correctly support such a device, the driver has to either support the NCM mode too, or put the device back into legacy mode. To match the behaviour of the macOS/Windows driver, and since there is no documented control command to revert to legacy mode, implement NCM support. The driver attempts to put the device into NCM mode by default, and falls back to legacy mode if the attempt fails. Signed-off-by: Foster Snowhill <forst@pen.gy> Tested-by: Georgi Valkov <gvalkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:26:57 +01:00
Foster Snowhill	3e65efcca8	usbnet: ipheth: transmit URBs without trailing padding The behaviour of the official iOS tethering driver on macOS is to not transmit any trailing padding at the end of URBs. This is applicable to both NCM and legacy modes, including older devices. Adapt the driver to not include trailing padding in TX URBs, matching the behaviour of the official macOS driver. Signed-off-by: Foster Snowhill <forst@pen.gy> Tested-by: Georgi Valkov <gvalkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:26:57 +01:00
Georgi Valkov	2203718c2f	usbnet: ipheth: fix risk of NULL pointer deallocation The cleanup procedure in ipheth_probe will attempt to free a NULL pointer in dev->ctrl_buf if the memory allocation for this buffer is not successful. While kfree ignores NULL pointers, and the existing code is safe, it is a better design to rearrange the goto labels and avoid this. Signed-off-by: Georgi Valkov <gvalkov@gmail.com> Signed-off-by: Foster Snowhill <forst@pen.gy> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-09 10:26:57 +01:00
Jakub Kicinski	fd5f4d7da2	Merge branch 'splice-net-rewrite-splice-to-socket-fix-splice_f_more-and-handle-msg_splice_pages-in-af_tls' David Howells says: ==================== splice, net: Rewrite splice-to-socket, fix SPLICE_F_MORE and handle MSG_SPLICE_PAGES in AF_TLS Here are patches to do the following: (1) Block MSG_SENDPAGE_* flags from leaking into ->sendmsg() from userspace, whilst allowing splice_to_socket() to pass them in. (2) Allow MSG_SPLICE_PAGES to be passed into tls_*_sendmsg(). Until support is added, it will be ignored and a splice-driven sendmsg() will be treated like a normal sendmsg(). TCP, UDP, AF_UNIX and Chelsio-TLS already handle the flag in net-next. (3) Replace a chain of functions to splice-to-sendpage with a single function to splice via sendmsg() with MSG_SPLICE_PAGES. This allows a bunch of pages to be spliced from a pipe in a single call using a bio_vec[] and pushes the main processing loop down into the bowels of the protocol driver rather than repeatedly calling in with a page at a time. (4) Provide a ->splice_eof() op[2] that allows splice to signal to its output that the input observed a premature EOF and that the caller didn't flag SPLICE_F_MORE, thereby allowing a corked socket to be flushed. This attempts to maintain the current behaviour. It is also not called if we didn't manage to read any data and so didn't call the actor function. This needs routing through several layers to get it down to the network protocol. [!] Note that I chose not to pass in any flags - I'm not sure it's particularly useful to pass in the splice flags; I also elected not to return any error code - though we might actually want to do that. (5) Provide tls_{device,sw}_splice_eof() to flush a pending TLS record if there is one. (6) Provide splice_eof() for UDP, TCP, Chelsio-TLS and AF_KCM. AF_UNIX doesn't seem to pay attention to the MSG_MORE or MSG_SENDPAGE_NOTLAST flags.
(7) Alter the behaviour of sendfile() and fix SPLICE_F_MORE/MSG_MORE signalling[1] such that SPLICE_F_MORE is always signalled until we have read sufficient data to finish the request. If we get a zero-length read before we've managed to splice sufficient data, we now leave the socket expecting more data and leave it to userspace to deal with it. (8) Make AF_TLS handle the MSG_SPLICE_PAGES internal sendmsg flag. MSG_SPLICE_PAGES is an internal hint that tells the protocol that it should splice the pages supplied if it can. Its sendpage implementations are then turned into wrappers around that. Link: https://lore.kernel.org/r/499791.1685485603@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ [2] Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6 # part 1 Link: https://lore.kernel.org/r/20230524153311.3625329-1-dhowells@redhat.com/ # v1 ==================== Link: https://lore.kernel.org/r/20230607181920.2294972-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:33 -07:00
David Howells	3dc8976c7a	tls/device: Convert tls_device_sendpage() to use MSG_SPLICE_PAGES Convert tls_device_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. With that, the tls_iter_offset union is no longer necessary and can be replaced with an iov_iter pointer and the zc_page argument to tls_push_data() can also be removed. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:31 -07:00
David Howells	24763c9c09	tls/device: Support MSG_SPLICE_PAGES Make TLS's device sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:31 -07:00
David Howells	45e5be844a	tls/sw: Convert tls_sw_sendpage() to use MSG_SPLICE_PAGES Convert tls_sw_sendpage() and tls_sw_sendpage_locked() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. [!] Note that tls_sw_sendpage_locked() appears to have the wrong locking upstream. I think the caller will only hold the socket lock, but it should hold tls_ctx->tx_lock too. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:31 -07:00
David Howells	fe1e81d4f7	tls/sw: Support MSG_SPLICE_PAGES Make TLS's sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:31 -07:00
David Howells	219d92056b	splice, net: Fix SPLICE_F_MORE signalling in splice_direct_to_actor() splice_direct_to_actor() doesn't manage SPLICE_F_MORE correctly[1] - and, as a result, it incorrectly signals/fails to signal MSG_MORE when splicing to a socket. The problem I'm seeing happens when a short splice occurs because we got a short read due to hitting the EOF on a file: as the length read (read_len) is less than the remaining size to be spliced (len), SPLICE_F_MORE (and thus MSG_MORE) is set. The issue is that, for the moment, we have no way to know why the short read occurred and so can't make a good decision on whether we should keep MSG_MORE set. MSG_SENDPAGE_NOTLAST was added to work around this, but that is also set incorrectly under some circumstances - for example if a short read fills a single pipe_buffer, but the next read would return more (seqfile can do this). This was observed with the multi_chunk_sendfile tests in the tls kselftest program. Some of those tests would hang and time out when the last chunk of file was less than the sendfile request size: build/kselftest/net/tls -r tls.12_aes_gcm.multi_chunk_sendfile This has been observed before[2] and worked around in AF_TLS[3]. Fix this by making splice_direct_to_actor() always signal SPLICE_F_MORE if we haven't yet hit the requested operation size. SPLICE_F_MORE remains signalled if the user passed it in to splice() but otherwise gets cleared when we've read sufficient data to fulfill the request. If, however, we get a premature EOF from ->splice_read(), have sent at least one byte and SPLICE_F_MORE was not set by the caller, ->splice_eof() will be invoked. 
Signed-off-by: David Howells <dhowells@redhat.com> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Jens Axboe <axboe@kernel.dk> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Matthew Wilcox <willy@infradead.org> cc: Jan Kara <jack@suse.cz> cc: Jeff Layton <jlayton@kernel.org> cc: David Hildenbrand <david@redhat.com> cc: Christian Brauner <brauner@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/499791.1685485603@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/1591392508-14592-1-git-send-email-pooja.trivedi@stackpath.com/ [2] Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d452d48b9f8b1a7f8152d33ef52cfd7fe1735b0a [3] Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:31 -07:00
David Howells	951ace9951	kcm: Use splice_eof() to flush Allow splice to undo the effects of MSG_MORE after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Cong Wang <cong.wang@bytedance.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:31 -07:00
David Howells	c289a1601a	chelsio/chtls: Use splice_eof() to flush Allow splice to end a Chelsio TLS record after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Ayush Sawal <ayush.sawal@chelsio.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:31 -07:00
David Howells	1d7e4538a5	ipv4, ipv6: Use splice_eof() to flush Allow splice to undo the effects of MSG_MORE after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. For UDP, a pending packet will not be emitted if the socket is closed before it is flushed; with this change, it will be flushed by ->splice_eof(). For TCP, it's not clear that MSG_MORE is actually effective. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Kuniyuki Iwashima <kuniyu@amazon.com> cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:30 -07:00
David Howells	d4c1e80b0d	tls/device: Use splice_eof() to flush Allow splice to end a TLS record after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called TLS with a sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-08 19:40:30 -07:00
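
The SPLICE_F_MORE rule described in commit 219d92056b can be illustrated with a small user-space model. This is only a sketch of the behaviour as the commit message states it, not the kernel code: the function name `splice_model`, its parameters, and the event tuples are all hypothetical, and the model ignores pipes, errors, and everything else splice_direct_to_actor() actually handles.

```python
from typing import Iterable, List, Tuple


def splice_model(read_sizes: Iterable[int], requested: int,
                 caller_more: bool) -> List[Tuple]:
    """Toy model of the MORE-flag decision (hypothetical, not kernel code).

    read_sizes  -- bytes returned by successive reads; 0 or exhaustion = EOF
    requested   -- total bytes the caller asked to splice
    caller_more -- whether the caller passed SPLICE_F_MORE itself
    """
    events: List[Tuple] = []
    remaining = requested
    sent = 0
    reads = iter(read_sizes)

    while remaining > 0:
        read_len = min(next(reads, 0), remaining)
        if read_len == 0:
            # Premature EOF: flush only if data was sent and the caller
            # did not ask for MORE in the first place.
            if sent > 0 and not caller_more:
                events.append(("splice_eof",))
            break
        remaining -= read_len
        # MORE stays set until the requested size has been reached,
        # or for the whole operation if the caller set it.
        more = caller_more or remaining > 0
        events.append(("send", read_len, more))
        sent += read_len

    return events


if __name__ == "__main__":
    # Short file: two reads, then EOF before 8192 bytes were spliced.
    for event in splice_model([4096, 100], 8192, caller_more=False):
        print(event)
```

In this model a short read no longer clears the flag on its own; the flush of a corked socket is left to the explicit `splice_eof` event, which mirrors the division of labour the series describes between splice_direct_to_actor() and the per-protocol ->splice_eof() handlers.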


1187393 Commits