[ Upstream commit c11204c78d ]
This code fix a bug that sk->sk_txrehash gets its default enable
value from sysctl_txrehash only when the socket is a TCP listener.
We should have sysctl_txrehash to set the default sk->sk_txrehash,
no matter TCP, nor listerner/connector.
Tested by following packetdrill:
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 socket(..., SOCK_DGRAM, IPPROTO_UDP) = 4
// SO_TXREHASH == 74, default to sysctl_txrehash == 1
+0 getsockopt(3, SOL_SOCKET, 74, [1], [4]) = 0
+0 getsockopt(4, SOL_SOCKET, 74, [1], [4]) = 0
Fixes: 26859240e4 ("txhash: Add socket option to control TX hash rethink behavior")
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c966153d12 ]
Parameters 'queue_index' and 'napi_id' are passed in a swapped order.
Fix it here.
Fixes: 23233e577e ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b27517627 ]
On some platforms, 100/1000/2500 speeds seem to have sometimes problems
reporting false positive tx unit hang during stressful UDP traffic. Likely
other Intel drivers introduce responses to a tx hang. Update the 'tx hang'
comparator with the comparison of the head and tail of ring pointers and
restore the tx_timeout_factor to the previous value (one).
This can be test by using netperf or iperf3 applications.
Example:
iperf3 -s -p 5001
iperf3 -c 192.168.0.2 --udp -p 5001 --time 600 -b 0
netserver -p 16604
netperf -H 192.168.0.2 -l 600 -p 16604 -t UDP_STREAM -- -m 64000
Fixes: b27b8dc77b ("igc: Increase timeout value for Speed 100/1000/2500")
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230206235818.662384-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f0d1451ec ]
Currently, remove and reload flows can run in parallel to module cleanup.
This design is error prone. For example: aux_drivers callbacks are called
from both cleanup and remove flows with different lockings, which can
cause a deadlock[1].
Hence, serialize module cleanup with reload and remove.
[1]
cleanup remove
------- ------
auxiliary_driver_unregister();
devl_lock()
auxiliary_device_delete(mlx5e_aux)
device_lock(mlx5e_aux)
devl_lock()
device_lock(mlx5e_aux)
Fixes: 912cebf420 ("net/mlx5e: Connect ethernet part to auxiliary bus")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 184e1e4474 ]
When tracer is reloaded, the device will log the traces at the
beginning of the log buffer. Also, driver is reading the log buffer in
chunks in accordance to the consumer index.
Hence, zero consumer index when reloading the tracer.
Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db561fed6b ]
Whenever the driver is reading the string DBs into buffers, the driver
is setting the load bit, but the driver never clears this bit.
As a result, in case load bit is on and the driver query the device for
new string DBs, the driver won't read again the string DBs.
Fix it by clearing the load bit when query the device for new string
DBs.
Fixes: 2d69356752 ("net/mlx5: Add support for fw live patch event")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9965bbebae ]
Currently, each core device has VF pages counter which stores number of
fw pages used by its VFs and SFs.
The current design led to a hang when performing firmware reset on DPU,
where the DPU PFs stalled in sriov unload flow due to waiting on release
of SFs pages instead of waiting on only VFs pages.
Thus, Add a separate counter for SF firmware pages, which will prevent
the stall scenario described above.
Fixes: 1958fc2f07 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3bdbaea65 ]
Currently, an independent page counter is used for tracking memory usage
for each function type such as VF, PF and host PF (DPU).
For better code-readibilty, use a single array that stores
the number of allocated memory pages for each function type.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Stable-dep-of: 9965bbebae ("net/mlx5: Expose SF firmware pages counter")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8aa5f171d5 ]
ethtool is returning an error for unknown speeds for the IPoIB interface:
$ ethtool ib0
netlink error: failed to retrieve link settings
netlink error: Invalid argument
netlink error: failed to retrieve link settings
netlink error: Invalid argument
Settings for ib0:
Link detected: no
After this change, ethtool will return success and show "unknown speed":
$ ethtool ib0
Settings for ib0:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Full
Auto-negotiation: off
Port: Other
PHYAD: 0
Transceiver: internal
Link detected: no
Fixes: eb234ee9d5 ("net/mlx5e: IPoIB, Add support for get_link_ksettings in ethtool")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8974aa9638 ]
Moving to switchdev mode with rx-vlan-filter on and then setting it off
causes the kernel to crash since fs->vlan is freed during nic profile
cleanup flow.
RX VLAN filtering is not supported in switchdev mode so unset it when
changing to switchdev and restore its value when switching back to
legacy.
trace:
[] RIP: 0010:mlx5e_disable_cvlan_filter+0x43/0x70
[] set_feature_cvlan_filter+0x37/0x40 [mlx5_core]
[] mlx5e_handle_feature+0x3a/0x60 [mlx5_core]
[] mlx5e_set_features+0x6d/0x160 [mlx5_core]
[] __netdev_update_features+0x288/0xa70
[] ethnl_set_features+0x309/0x380
[] ? __nla_parse+0x21/0x30
[] genl_family_rcv_msg_doit.isra.17+0x110/0x150
[] genl_rcv_msg+0x112/0x260
[] ? features_reply_size+0xe0/0xe0
[] ? genl_family_rcv_msg_doit.isra.17+0x150/0x150
[] netlink_rcv_skb+0x4e/0x100
[] genl_rcv+0x24/0x40
[] netlink_unicast+0x1ab/0x290
[] netlink_sendmsg+0x257/0x4f0
[] sock_sendmsg+0x5c/0x70
Fixes: cb67b83292 ("net/mlx5e: Introduce SRIOV VF representors")
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit da0c52426c ]
SWITCHDEV_FDB_ADD_TO_BRIDGE event handler that updates FDB entry 'lastuse'
field is only executed for eswitch that owns the entry. However, if peer
entry processed packets at least once it will have hardware counter 'used'
value greater than entry 'lastuse' from that point on, which will cause FDB
entry not being aged out.
Process the event on all eswitch instances.
Fixes: ff9b752146 ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e66220948 ]
rq->hw_mtu is used in function en_rx.c/mlx5e_skb_from_cqe_mpwrq_linear()
to catch oversized packets. If FCS is concatenated to the end of the
packet then the check should be updated accordingly.
Rx rings initialization (mlx5e_init_rxq_rq()) invoked for every new set
of channels, as part of mlx5e_safe_switch_params(), unknowingly if it
runs with default configuration or not. Current rq->hw_mtu
initialization assumes default configuration and ignores
params->scatter_fcs_en flag state.
Fix this, by accounting for params->scatter_fcs_en flag state during
rq->hw_mtu initialization.
In addition, updating rq->hw_mtu value during ingress traffic might
lead to packets drop and oversize_pkts_sw_drop counter increase with no
good reason. Hence we remove this optimization and switch the set of
channels with a new one, to make sure we don't get false positives on
the oversize_pkts_sw_drop counter.
Fixes: 102722fc68 ("net/mlx5e: Add support for RXFCS feature flag")
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f964f8399d ]
Alternative short title: don't instruct the hardware to match on
EtherType with "protocol 802.1Q" flower filters. It doesn't work for the
reasons detailed below.
With a command such as the following:
tc filter add dev $swp1 ingress chain $(IS1 2) pref 3 \
protocol 802.1Q flower skip_sw vlan_id 200 src_mac $h1_mac \
action vlan modify id 300 \
action goto chain $(IS2 0 0)
the created filter is set by ocelot_flower_parse_key() to be of type
OCELOT_VCAP_KEY_ETYPE, and etype is set to {value=0x8100, mask=0xffff}.
This gets propagated all the way to is1_entry_set() which commits it to
hardware (the VCAP_IS1_HK_ETYPE field of the key). Compare this to the
case where src_mac isn't specified - the key type is OCELOT_VCAP_KEY_ANY,
and is1_entry_set() doesn't populate VCAP_IS1_HK_ETYPE.
The problem is that for VLAN-tagged frames, the hardware interprets the
ETYPE field as holding the encapsulated VLAN protocol. So the above
filter will only match those packets which have an encapsulated protocol
of 0x8100, rather than all packets with VLAN ID 200 and the given src_mac.
The reason why this is allowed to occur is because, although we have a
block of code in ocelot_flower_parse_key() which sets "match_protocol"
to false when VLAN keys are present, that code executes too late.
There is another block of code, which executes for Ethernet addresses,
and has a "goto finished_key_parsing" and skips the VLAN header parsing.
By skipping it, "match_protocol" remains with the value it was
initialized with, i.e. "true", and "proto" is set to f->common.protocol,
or 0x8100.
The concept of ignoring some keys rather than erroring out when they are
present but can't be offloaded is dubious in itself, but is present
since the initial commit fe3490e610 ("net: mscc: ocelot: Hardware
ofload for tc flower filter"), and it's outside of the scope of this
patch to change that.
The problem was introduced when the driver started to interpret the
flower filter's protocol, and populate the VCAP filter's ETYPE field
based on it.
To fix this, it is sufficient to move the code that parses the VLAN keys
earlier than the "goto finished_key_parsing" instruction. This will
ensure that if we have a flower filter with both VLAN and Ethernet
address keys, it won't match on ETYPE 0x8100, because the VLAN key
parsing sets "match_protocol = false".
Fixes: 86b956de11 ("net: mscc: ocelot: support matching on EtherType")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230205192409.1796428-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b6d642510 ]
Frank reports that in a mt7530 setup where some ports are standalone and
some are in a VLAN-aware bridge, 8021q uppers of the standalone ports
lose their VLAN tag on xmit, as seen by the link partner.
This seems to occur because once the other ports join the VLAN-aware
bridge, mt7530_port_vlan_filtering() also calls
mt7530_port_set_vlan_aware(ds, cpu_dp->index), and this affects the way
that the switch processes the traffic of the standalone port.
Relevant is the PVC_EG_TAG bit. The MT7530 documentation says about it:
EG_TAG: Incoming Port Egress Tag VLAN Attribution
0: disabled (system default)
1: consistent (keep the original ingress tag attribute)
My interpretation is that this setting applies on the ingress port, and
"disabled" is basically the normal behavior, where the egress tag format
of the packet (tagged or untagged) is decided by the VLAN table
(MT7530_VLAN_EGRESS_UNTAG or MT7530_VLAN_EGRESS_TAG).
But there is also an option of overriding the system default behavior,
and for the egress tagging format of packets to be decided not by the
VLAN table, but simply by copying the ingress tag format (if ingress was
tagged, egress is tagged; if ingress was untagged, egress is untagged;
aka "consistent). This is useful in 2 scenarios:
- VLAN-unaware bridge ports will always encounter a miss in the VLAN
table. They should forward a packet as-is, though. So we use
"consistent" there. See commit e045124e93 ("net: dsa: mt7530: fix
tagged frames pass-through in VLAN-unaware mode").
- Traffic injected from the CPU port. The operating system is in god
mode; if it wants a packet to exit as VLAN-tagged, it sends it as
VLAN-tagged. Otherwise it sends it as VLAN-untagged*.
*This is true only if we don't consider the bridge TX forwarding offload
feature, which mt7530 doesn't support.
So for now, make the CPU port always stay in "consistent" mode to allow
software VLANs to be forwarded to their egress ports with the VLAN tag
intact, and not stripped.
Link: https://lore.kernel.org/netdev/trinity-e6294d28-636c-4c40-bb8b-b523521b00be-1674233135062@3c-app-gmx-bs36/
Fixes: e045124e93 ("net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode")
Reported-by: Frank Wunderlich <frank-w@public-files.de>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230205140713.1609281-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c793f8ea15 ]
If the user turns on the vf-true-promiscuous-support flag, then Rx VLAN
filtering will be disabled if the VF requests to enable promiscuous
mode. When the VF is in a port VLAN, this is the incorrect behavior
because it will allow the VF to receive traffic outside of its port VLAN
domain. Fortunately this only resulted in the VF(s) receiving broadcast
traffic outside of the VLAN domain because all of the VLAN promiscuous
rules are based on the port VLAN ID. Fix this by setting the
.disable_rx_filtering VLAN op to a no-op when a port VLAN is enabled on
the VF.
Also, make sure to make this fix for both Single VLAN Mode and Double
VLAN Mode enabled devices.
Fixes: c31af68a1b ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a2127e66a ]
set_cpus_allowed_ptr() will fail with -EINVAL if the requested
affinity mask is not a subset of the task_cpu_possible_mask() for the
task being updated. Consequently, on a heterogeneous system with cpusets
spanning the different CPU types, updates to the cgroup hierarchy can
silently fail to update task affinities when the effective affinity
mask for the cpuset is expanded.
For example, consider an arm64 system with 4 CPUs, where CPUs 2-3 are
the only cores capable of executing 32-bit tasks. Attaching a 32-bit
task to a cpuset containing CPUs 0-2 will correctly affine the task to
CPU 2. Extending the cpuset to CPUs 0-3, however, will fail to extend
the affinity mask of the 32-bit task because update_tasks_cpumask() will
pass the full 0-3 mask to set_cpus_allowed_ptr().
Extend update_tasks_cpumask() to take a temporary 'cpumask' paramater
and use it to mask the 'effective_cpus' mask with the possible mask for
each task being updated.
Fixes: 431c69fac0 ("cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()")
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f20660f05 ]
An interrupted dma_fence_wait() becomes an -ERESTARTSYS returned
to userspace ioctl(DRM_IOCTL_VIRTGPU_EXECBUFFER) calls, prompting to
retry the ioctl(), but the passed exbuf->fence_fd has been reset to -1,
making the retry attempt fail at sync_file_get_fence().
The uapi for DRM_IOCTL_VIRTGPU_EXECBUFFER is changed to retain the
passed value for exbuf->fence_fd when returning anything besides a
successful result from the ioctl.
Fixes: 2cd7b6f08b ("drm/virtio: add in/out fence support for explicit synchronization")
Signed-off-by: Ryan Neph <ryanneph@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230203233345.2477767-1-ryanneph@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90d5e8301a ]
Due to a workaround we have to make sure the WM1 watermarks block/lines
values are sensible even when WM1 is disabled. To that end we copy those
values from WM0.
However since we now keep each wm level enabled on a per-plane basis
it doesn't seem necessary to do that copy when we already have an
enabled WM1 on the current plane. That is, we might be in a situation
where another plane can only do WM0 (and thus needs the copy) but
the current plane's WM1 is still perfectly valid (ie. fits into the
current DDB allocation).
Skipping the copy could avoid reprogramming the plane's registers
needlessly in some cases.
Fixes: a301cb0fca ("drm/i915: Keep plane watermarks enabled more aggressively")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230131002127.29305-1-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
(cherry picked from commit c580c2d27a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7d94b2612 ]
Check all ports instead of just port_count ports. PTP init was only
checking ports 0 to port_count. If the hardware ports are not mapped
starting from 0 then they would be missed, e.g. if only ports 20-30 were
mapped it would attempt to init ports 0-10, resulting in NULL pointers
when attempting to timestamp. Now it will init all mapped ports.
Fixes: 70dfe25cd8 ("net: sparx5: Update extraction/injection for timestamping")
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03702d4d29 ]
Since commit 58e0be1ef6 ("net: use struct_group to copy ip/ipv6
header addresses"), ip and ipv6 headers started to use the __struct_group
definition, which is defined at include/uapi/linux/stddef.h. However,
linux/stddef.h isn't explicitly included in include/uapi/linux/{ip,ipv6}.h,
which breaks build of xskxceiver bpf selftest if you install the uapi
headers in the system:
$ make V=1 xskxceiver -C tools/testing/selftests/bpf
...
make: Entering directory '(...)/tools/testing/selftests/bpf'
gcc -g -O0 -rdynamic -Wall -Werror (...)
In file included from xskxceiver.c:79:
/usr/include/linux/ip.h:103:9: error: expected specifier-qualifier-list before ‘__struct_group’
103 | __struct_group(/* no tag */, addrs, /* no attrs */,
| ^~~~~~~~~~~~~~
...
Include the missing <linux/stddef.h> dependency in ip.h and do the
same for the ipv6.h header.
Fixes: 58e0be1ef6 ("net: use struct_group to copy ip/ipv6 header addresses")
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 51be2fffd6 ]
On a sc7180-based Chromebook, when I go to
/sys/devices/system/cpu/cpu0/cpufreq I can see:
cpuinfo_cur_freq:2995200
cpuinfo_max_freq:1804800
scaling_available_frequencies:300000 576000 ... 1708800 1804800
scaling_cur_freq:1804800
scaling_max_freq:1804800
As you can see the `cpuinfo_cur_freq` is bogus. It turns out that this
bogus info started showing up as of commit c72cf0cb1d ("cpufreq:
qcom-hw: Fix the frequency returned by cpufreq_driver->get()"). That
commit seems to assume that everyone is on the LMH bandwagon, but
sc7180 isn't.
Let's go back to the old code in the case where LMH isn't used.
Fixes: c72cf0cb1d ("cpufreq: qcom-hw: Fix the frequency returned by cpufreq_driver->get()")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
[ Viresh: Fixed the 'fixes' tag ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b69585bfce ]
In one version of the HW there is a remote possibility that it
will miss the doorbell ring. This adds a bit of protection to
be sure we don't stall a queue from a missed doorbell.
Fixes: 0f3154e6bc ("ionic: Add Tx and Rx handling")
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e55f0f5bef ]
The same pre-work code is used before each call to
ionic_rx_fill(), so bring it in and make it a part of
the routine.
Signed-off-by: Neel Patel <neel@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: b69585bfce ("ionic: missed doorbell workaround")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8797a0584 ]
Clear the interrupt credits before enabling the queue rather
than after to be sure that the enabled queue starts at 0 and
that we don't wipe away possible credits after enabling the
queue.
Fixes: 0f3154e6bc ("ionic: Add Tx and Rx handling")
Signed-off-by: Neel Patel <neel.patel@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce93fdb5f2 ]
After calling fwnode_phy_find_device(), the phy device refcount is
incremented. Then, when the phy device is attached to a netdev with
phy_attach_direct(), the refcount is also incremented but only
decremented in the caller if phy_attach_direct() fails. Move
phy_device_free() before the "if" to always release it correctly.
Indeed, either phy_attach_direct() failed and we don't want to keep a
reference to the phydev or it succeeded and a reference has been taken
internally.
Fixes: 25396f680d ("net: phylink: introduce phylink_fwnode_phy_connect()")
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6028da3f12 ]
When copying the DSCP bits for decap-dscp into IPv6 don't assume the
outer encap is always IPv6. Instead, as with the inner IPv4 case, copy
the DSCP bits from the correctly saved "tos" value in the control block.
Fixes: 227620e295 ("[IPSEC]: Separate inner/outer mode processing on input")
Signed-off-by: Christian Hopps <chopps@chopps.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b6ee896385 ]
int type = nla_type(nla);
if (type > XFRMA_MAX) {
return -EOPNOTSUPP;
}
@type is then used as an array index and can be used
as a Spectre v1 gadget.
if (nla_len(nla) < compat_policy[type].len) {
array_index_nospec() can be used to prevent leaking
content of kernel memory to malicious users.
Fixes: 5106f4a8ac ("xfrm/compat: Add 32=>64-bit messages translator")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 4ae5e1e97c upstream.
The ISO 11783-5 standard, in "4.5.2 - Address claim requirements", states:
d) No CF shall begin, or resume, transmission on the network until 250
ms after it has successfully claimed an address except when
responding to a request for address-claimed.
But "Figure 6" and "Figure 7" in "4.5.4.2 - Address-claim
prioritization" show that the CF begins the transmission after 250 ms
from the first AC (address-claimed) message even if it sends another AC
message during that time window to resolve the address contention with
another CF.
As stated in "4.4.2.3 - Address-claimed message":
In order to successfully claim an address, the CF sending an address
claimed message shall not receive a contending claim from another CF
for at least 250 ms.
As stated in "4.4.3.2 - NAME management (NM) message":
1) A commanding CF can
d) request that a CF with a specified NAME transmit the address-
claimed message with its current NAME.
2) A target CF shall
d) send an address-claimed message in response to a request for a
matching NAME
Taking the above arguments into account, the 250 ms wait is requested
only during network initialization.
Do not restart the timer on AC message if both the NAME and the address
match and so if the address has already been claimed (timer has expired)
or the AC message has been sent to resolve the contention with another
CF (timer is still running).
Signed-off-by: Devid Antonio Filoni <devid.filoni@egluetechnologies.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/all/20221125170418.34575-1-devid.filoni@egluetechnologies.com
Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f6933c01e4 upstream.
Commit 7a8b64d17e ("of/address: use range parser for of_dma_get_range")
converted the parsing of dma-range properties to use code shared with the
PCI range parser. The intent was to introduce no functional changes however
in the case where we fail to translate the first resource instead of
returning -EINVAL the new code we return 0. Restore the previous behaviour
by returning an error if we find no valid ranges, the original code only
handled the first range but subsequently support for parsing all supplied
ranges was added.
This avoids confusing code using the parsed ranges which doesn't expect to
successfully parse ranges but have only a list terminator returned, this
fixes breakage with so far as I can tell all DMA for on SoC devices on the
Socionext Synquacer platform which has a firmware supplied DT. A bisect
identified the original conversion as triggering the issues there.
Fixes: 7a8b64d17e ("of/address: use range parser for of_dma_get_range")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Luca Di Stefano <luca.distefano@linaro.org>
Cc: 993612@bugs.debian.org
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20230126-synquacer-boot-v2-1-cb80fd23c4e2@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e46d910d8 upstream.
poll() and select() on per_cpu trace_pipe and trace_pipe_raw do not work
since kernel 6.1-rc6. This issue is seen after the commit
42fb0a1e84 ("tracing/ring-buffer: Have
polling block on watermark").
This issue is firstly detected and reported, when testing the CXL error
events in the rasdaemon and also erified using the test application for poll()
and select().
This issue occurs for the per_cpu case, when calling the ring_buffer_poll_wait(),
in kernel/trace/ring_buffer.c, with the buffer_percent > 0 and then wait until the
percentage of pages are available. The default value set for the buffer_percent is 50
in the kernel/trace/trace.c.
As a fix, allow userspace application could set buffer_percent as 0 through
the buffer_percent_fops, so that the task will wake up as soon as data is added
to any of the specific cpu buffer.
Link: https://lore.kernel.org/linux-trace-kernel/20230202182309.742-2-shiju.jose@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <mchehab@kernel.org>
Cc: <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 42fb0a1e84 ("tracing/ring-buffer: Have polling block on watermark")
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff209ecc37 upstream.
This reverts commit 5e85eba6f5.
Thomas Witt reported that 5e85eba6f5 ("PCI/ASPM: Refactor L1 PM Substates
Control Register programming") broke suspend/resume on a Tuxedo
Infinitybook S 14 v5, which seems to use a Clevo L140CU Mainboard.
The main symptom is:
iwlwifi 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
nvme 0000:03:00.0: Unable to change power state from D3hot to D0, device inaccessible
and the machine is only partially usable after resume. It can't run dmesg
and can't do a clean reboot. This happens on every suspend/resume cycle.
Revert 5e85eba6f5 until we can figure out the root cause.
Fixes: 5e85eba6f5 ("PCI/ASPM: Refactor L1 PM Substates Control Register programming")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
Reported-by: Thomas Witt <kernel@witt.link>
Tested-by: Thomas Witt <kernel@witt.link>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v6.1+
Cc: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7152be79b upstream.
This reverts commit 4ff116d0d5.
Tasev Nikola and Mark Enriquez reported that resume from suspend was broken
in v6.1-rc1. Tasev bisected to a47126ec29 ("PCI/PTM: Cache PTM
Capability offset"), but we can't figure out how that could be related.
Mark saw the same symptoms and bisected to 4ff116d0d5 ("PCI/ASPM: Save L1
PM Substates Capability for suspend/resume"), which does have a connection:
it restores L1 Substates configuration while ASPM L1 may be enabled:
pci_restore_state
pci_restore_aspm_l1ss_state
aspm_program_l1ss
pci_write_config_dword(PCI_L1SS_CTL1, ctl1) # L1SS restore
pci_restore_pcie_state
pcie_capability_write_word(PCI_EXP_LNKCTL, cap[i++]) # L1 restore
which is a problem because PCIe r6.0, sec 5.5.4, requires that:
If setting either or both of the enable bits for ASPM L1 PM
Substates, both ports must be configured as described in this
section while ASPM L1 is disabled.
Separately, Thomas Witt reported that 5e85eba6f5 ("PCI/ASPM: Refactor L1
PM Substates Control Register programming") broke suspend/resume, and it
depends on 4ff116d0d5.
Revert 4ff116d0d5 ("PCI/ASPM: Save L1 PM Substates Capability for
suspend/resume") to fix the resume issue and enable revert of 5e85eba6f5
to fix the issue Thomas reported.
Note that reverting 4ff116d0d5 means L1 Substates config may be lost on
suspend/resume. As far as we know the system will use more power but will
still *work* correctly.
Fixes: 4ff116d0d5 ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
Reported-by: Tasev Nikola <tasev.stefanoska@skynet.be>
Reported-by: Mark Enriquez <enriquezmark36@gmail.com>
Reported-by: Thomas Witt <kernel@witt.link>
Tested-by: Mark Enriquez <enriquezmark36@gmail.com>
Tested-by: Thomas Witt <kernel@witt.link>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v6.1+
Cc: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>