[ Upstream commit fa05bdb89b ]
This reverts commit 9ea3e52c5b.
Cited commit added a check to make sure 'action' is not NULL, but
'action' is already dereferenced before the check, when calling
flow_offload_has_one_action().
Therefore, the check does not make any sense and results in a smatch
warning:
include/net/flow_offload.h:322 flow_action_mixed_hw_stats_check() warn:
variable dereferenced before check 'action' (see line 319)
Fix by reverting this commit.
Cc: gushengxian <gushengxian@yulong.com>
Fixes: 9ea3e52c5b ("flow_offload: action should not be NULL when it is referenced")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20210819105842.1315705-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8da80c9d50 ]
Make changes to MAC address dependent on the response of PF.
Disallow changes to HW MAC address and MAC filter from untrusted
VF, thanks to that ping is not lost if VF tries to change MAC.
Add a new field in iavf_mac_filter, to indicate whether there
was response from PF for given filter. Based on this field pass
or discard the filter.
If untrusted VF tried to change it's address, it's not changed.
Still filter was changed, because of that ping couldn't go through.
Fixes: c5c922b3e0 ("iavf: fix MAC address setting for VFs when filter is rejected")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan G <Gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a222be597e ]
Without this patch, ATR does not work. Receive/transmit uses queue
selection based on SW DCB hashing method.
If traffic classes are not configured for PF, then use
netdev_pick_tx function for selecting queue for packet transmission.
Instead of calling i40e_swdcb_skb_tx_hash, call netdev_pick_tx,
which ensures that packet is transmitted/received from CPU that is
running the application.
Reproduction steps:
1. Load i40e driver
2. Map each MSI interrupt of i40e port for each CPU
3. Disable ntuple, enable ATR i.e.:
ethtool -K $interface ntuple off
ethtool --set-priv-flags $interface flow-director-atr
4. Run application that is generating traffic and is bound to a
single CPU, i.e.:
taskset -c 9 netperf -H 1.1.1.1 -t TCP_RR -l 10
5. Observe behavior:
Application's traffic should be restricted to the CPU provided in
taskset.
Fixes: 89ec1f0886 ("i40e: Fix queue-to-TC mapping on Tx")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a876a33d2a ]
The register of USB_BP2_EN is 16 bits, so we should use
ocp_write_word(), not ocp_write_byte().
Fixes: 9370f2d05a ("support request_firmware for RTL8153")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8798d36411 ]
This fixes improper iotlb invalidation in intel_pasid_tear_down_entry().
When a PASID was used as nested mode, released and reused, the following
error message will appear:
[ 180.187556] Unexpected page request in Privilege Mode
[ 180.187565] Unexpected page request in Privilege Mode
[ 180.279933] Unexpected page request in Privilege Mode
[ 180.279937] Unexpected page request in Privilege Mode
Per chapter 6.5.3.3 of VT-d spec 3.3, when tear down a pasid entry, the
software should use Domain selective IOTLB flush if the PGTT of the pasid
entry is SL only or Nested, while for the pasid entries whose PGTT is FL
only or PT using PASID-based IOTLB flush is enough.
Fixes: 2cd1311a26 ("iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr")
Signed-off-by: Kumar Sanjay K <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Yi Sun <yi.y.sun@intel.com>
Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01634047bf ]
fq qdisc requires tstamp to be cleared in the forwarding path. Now ovs
doesn't clear skb->tstamp. We encountered a problem with linux
version 5.4.56 and ovs version 2.14.1, and packets failed to
dequeue from qdisc when fq qdisc was attached to ovs port.
Fixes: fb420d5d91 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: kaixi.fan <fankaixi.li@bytedance.com>
Signed-off-by: xiexiaohui <xiexiaohui.xxh@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7bd0cef5da ]
When registering mdiobus children, if we get an -EPROBE_DEFER, we shouldn't
ignore it and continue registering the rest of the mdiobus children. This
would permanently prevent the deferring child mdiobus from working instead
of reattempting it in the future. So, if a child mdiobus needs to be
reattempted in the future, defer the entire mdio-mux initialization.
This fixes the issue where PHYs sitting under the mdio-mux aren't
initialized correctly if the PHY's interrupt controller is not yet ready
when the mdio-mux is being probed. Additional context in the link below.
Fixes: 0ca2997d14 ("netdev/of/phy: Add MDIO bus multiplexer support.")
Link: https://lore.kernel.org/lkml/CAGETcx95kHrv8wA-O+-JtfH7H9biJEGJtijuPVN0V5dUKUAB3A@mail.gmail.com/#t
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86b9bbd332 ]
When adding support for using the skb->hash value as the flow hash in CAKE,
I accidentally introduced a logic error that broke the host-only isolation
modes of CAKE (srchost and dsthost keywords). Specifically, the flow_hash
variable should stay initialised to 0 in cake_hash() in pure host-based
hashing mode. Add a check for this before using the skb->hash value as
flow_hash.
Fixes: b0c19ed608 ("sch_cake: Take advantage of skb->hash where appropriate")
Reported-by: Pete Heist <pete@heistp.net>
Tested-by: Pete Heist <pete@heistp.net>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dbcf24d153 ]
Commit a02e8964ea ("virtio-net: ethtool configurable LRO")
maps LRO to virtio guest offloading features and allows the
administrator to enable and disable those features via ethtool.
This leads to several issues:
- For a device that doesn't support control guest offloads, the "LRO"
can't be disabled triggering WARN in dev_disable_lro() when turning
off LRO or when enabling forwarding bridging etc.
- For a device that supports control guest offloads, the guest
offloads are disabled in cases of bridging, forwarding etc slowing
down the traffic.
Fix this by using NETIF_F_GRO_HW instead. Though the spec does not
guarantee packets to be re-segmented as the original ones,
we can add that to the spec, possibly with a flag for devices to
differentiate between GRO and LRO.
Further, we never advertised LRO historically before a02e8964ea
("virtio-net: ethtool configurable LRO") and so bridged/forwarded
configs effectively always relied on virtio receive offloads behaving
like GRO - thus even if this breaks any configs it is at least not
a regression.
Fixes: a02e8964ea ("virtio-net: ethtool configurable LRO")
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Ivan <ivan@prestigetransportation.com>
Tested-by: Ivan <ivan@prestigetransportation.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97c2c69e19 ]
The number of queues implemented by many virtio backends is limited,
especially some machines have a large number of CPUs. In this case, it
is often impossible to allocate a separate queue for
XDP_TX/XDP_REDIRECT, then xdp cannot be loaded to work, even xdp does
not use the XDP_TX/XDP_REDIRECT.
This patch allows XDP_TX/XDP_REDIRECT to run by reuse the existing SQ
with __netif_tx_lock() hold when there are not enough queues.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09e856d54b ]
To fix the "reverse-NAT" for replies.
When a packet is sent over a VRF, the POST_ROUTING hooks are called
twice: Once from the VRF interface, and once from the "actual"
interface the packet will be sent from:
1) First SNAT: l3mdev_l3_out() -> vrf_l3_out() -> .. -> vrf_output_direct()
This causes the POST_ROUTING hooks to run.
2) Second SNAT: 'ip_output()' calls POST_ROUTING hooks again.
Similarly for replies, first ip_rcv() calls PRE_ROUTING hooks, and
second vrf_l3_rcv() calls them again.
As an example, consider the following SNAT rule:
> iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j SNAT --to-source 2.2.2.2 -o vrf_1
In this case sending over a VRF will create 2 conntrack entries.
The first is from the VRF interface, which performs the IP SNAT.
The second will run the SNAT, but since the "expected reply" will remain
the same, conntrack randomizes the source port of the packet:
e..g With a socket bound to 1.1.1.1:10000, sending to 3.3.3.3:53, the conntrack
rules are:
udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=0 bytes=0 mark=0 use=1
udp 17 29 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
i.e. First SNAT IP from 1.1.1.1 --> 2.2.2.2, and second the src port is
SNAT-ed from 10000 --> 61033.
But when a reply is sent (3.3.3.3:53 -> 2.2.2.2:61033) only the later
conntrack entry is matched:
udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=1 bytes=49 mark=0 use=1
udp 17 28 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
And a "port 61033 unreachable" ICMP packet is sent back.
The issue is that when PRE_ROUTING hooks are called from vrf_l3_rcv(),
the skb already has a conntrack flow attached to it, which means
nf_conntrack_in() will not resolve the flow again.
This means only the dest port is "reverse-NATed" (61033 -> 10000) but
the dest IP remains 2.2.2.2, and since the socket is bound to 1.1.1.1 it's
not received.
This can be verified by logging the 4-tuple of the packet in '__udp4_lib_rcv()'.
The fix is then to reset the flow when skb is received on a VRF, to let
conntrack resolve the flow again (which now will hit the earlier flow).
To reproduce: (Without the fix "Got pkt_to_nat_port" will not be printed by
running 'bash ./repro'):
$ cat run_in_A1.py
import logging
logging.getLogger("scapy.runtime").setLevel(logging.ERROR)
from scapy.all import *
import argparse
def get_packet_to_send(udp_dst_port, msg_name):
return Ether(src='11:22:33:44:55:66', dst=iface_mac)/ \
IP(src='3.3.3.3', dst='2.2.2.2')/ \
UDP(sport=53, dport=udp_dst_port)/ \
Raw(f'{msg_name}\x0012345678901234567890')
parser = argparse.ArgumentParser()
parser.add_argument('-iface_mac', dest="iface_mac", type=str, required=True,
help="From run_in_A3.py")
parser.add_argument('-socket_port', dest="socket_port", type=str,
required=True, help="From run_in_A3.py")
parser.add_argument('-v1_mac', dest="v1_mac", type=str, required=True,
help="From script")
args, _ = parser.parse_known_args()
iface_mac = args.iface_mac
socket_port = int(args.socket_port)
v1_mac = args.v1_mac
print(f'Source port before NAT: {socket_port}')
while True:
pkts = sniff(iface='_v0', store=True, count=1, timeout=10)
if 0 == len(pkts):
print('Something failed, rerun the script :(', flush=True)
break
pkt = pkts[0]
if not pkt.haslayer('UDP'):
continue
pkt_sport = pkt.getlayer('UDP').sport
print(f'Source port after NAT: {pkt_sport}', flush=True)
pkt_to_send = get_packet_to_send(pkt_sport, 'pkt_to_nat_port')
sendp(pkt_to_send, '_v0', verbose=False) # Will not be received
pkt_to_send = get_packet_to_send(socket_port, 'pkt_to_socket_port')
sendp(pkt_to_send, '_v0', verbose=False)
break
$ cat run_in_A2.py
import socket
import netifaces
print(f"{netifaces.ifaddresses('e00000')[netifaces.AF_LINK][0]['addr']}",
flush=True)
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
str('vrf_1' + '\0').encode('utf-8'))
s.connect(('3.3.3.3', 53))
print(f'{s. getsockname()[1]}', flush=True)
s.settimeout(5)
while True:
try:
# Periodically send in order to keep the conntrack entry alive.
s.send(b'a'*40)
resp = s.recvfrom(1024)
msg_name = resp[0].decode('utf-8').split('\0')[0]
print(f"Got {msg_name}", flush=True)
except Exception as e:
pass
$ cat repro.sh
ip netns del A1 2> /dev/null
ip netns del A2 2> /dev/null
ip netns add A1
ip netns add A2
ip -n A1 link add _v0 type veth peer name _v1 netns A2
ip -n A1 link set _v0 up
ip -n A2 link add e00000 type bond
ip -n A2 link add lo0 type dummy
ip -n A2 link add vrf_1 type vrf table 10001
ip -n A2 link set vrf_1 up
ip -n A2 link set e00000 master vrf_1
ip -n A2 addr add 1.1.1.1/24 dev e00000
ip -n A2 link set e00000 up
ip -n A2 link set _v1 master e00000
ip -n A2 link set _v1 up
ip -n A2 link set lo0 up
ip -n A2 addr add 2.2.2.2/32 dev lo0
ip -n A2 neigh add 1.1.1.10 lladdr 77:77:77:77:77:77 dev e00000
ip -n A2 route add 3.3.3.3/32 via 1.1.1.10 dev e00000 table 10001
ip netns exec A2 iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j \
SNAT --to-source 2.2.2.2 -o vrf_1
sleep 5
ip netns exec A2 python3 run_in_A2.py > x &
XPID=$!
sleep 5
IFACE_MAC=`sed -n 1p x`
SOCKET_PORT=`sed -n 2p x`
V1_MAC=`ip -n A2 link show _v1 | sed -n 2p | awk '{print $2'}`
ip netns exec A1 python3 run_in_A1.py -iface_mac ${IFACE_MAC} -socket_port \
${SOCKET_PORT} -v1_mac ${SOCKET_PORT}
sleep 5
kill -9 $XPID
wait $XPID 2> /dev/null
ip netns del A1
ip netns del A2
tail x -n 2
rm x
set +x
Fixes: 73e20b761a ("net: vrf: Add support for PREROUTING rules on vrf device")
Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210815120002.2787653-1-lschlesinger@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 828affc27e ]
Each completion ring entry has a valid bit to indicate that the entry
contains a valid completion event. The driver's main poll loop
__bnxt_poll_work() has the proper dma_rmb() to make sure the valid
bit of the next entry has been checked before proceeding further.
But when we call bnxt_rx_pkt() to process the RX event, the RX
completion event consists of two completion entries and only the
first entry has been checked to be valid. We need the same barrier
after checking the next completion entry. Add missing dma_rmb()
barriers in bnxt_rx_pkt() and other similar locations.
Fixes: 67a95e2022 ("bnxt_en: Need memory barrier when processing the completion ring.")
Reported-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 55c8fca1da ]
During the swap dependency on PCH_GBE to selection PTP_1588_CLOCK_PCH
incidentally dropped the implicit dependency on the PCI. Restore it.
Fixes: 18d359ceb0 ("pch_gbe, ptp_pch: Fix the dependency direction between these drivers")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19d1532a18 ]
Syzbot reported slab-out-of bounds write in decode_data().
The problem was in missing validation checks.
Syzbot's reproducer generated malicious input, which caused
decode_data() to be called a lot in sixpack_decode(). Since
rx_count_cooked is only 400 bytes and noone reported before,
that 400 bytes is not enough, let's just check if input is malicious
and complain about buffer overrun.
Fail log:
==================================================================
BUG: KASAN: slab-out-of-bounds in drivers/net/hamradio/6pack.c:843
Write of size 1 at addr ffff888087c5544e by task kworker/u4:0/7
CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.6.0-rc3-syzkaller #0
...
Workqueue: events_unbound flush_to_ldisc
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
__kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
kasan_report+0x12/0x20 mm/kasan/common.c:641
__asan_report_store1_noabort+0x17/0x20 mm/kasan/generic_report.c:137
decode_data.part.0+0x23b/0x270 drivers/net/hamradio/6pack.c:843
decode_data drivers/net/hamradio/6pack.c:965 [inline]
sixpack_decode drivers/net/hamradio/6pack.c:968 [inline]
Reported-and-tested-by: syzbot+fc8cd9a673d4577fb2e4@syzkaller.appspotmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8d8c5d80f ]
skbs are freed on error and not put on the ring. We may, however,
be in a situation where we're freeing the last skb of a batch,
and there is a doorbell ring pending because of xmit_more() being
true earlier. Make sure we ring the door bell in such situations.
Since errors are rare don't pay attention to xmit_more() and just
always flush the pending frames.
The busy case should be safe to be left alone because it can
only happen if start_xmit races with completions and they
both enable the queue. In that case the kick can't be pending.
Noticed while reading the code.
Fixes: 4d172f21ce ("bnxt_en: Implement xmit_more.")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01cca6b933 ]
napi schedules DIM, napi has to be disabled first,
then DIM canceled.
Noticed while reading the code.
Fixes: 0bc0b97fca ("bnxt_en: cleanup DIM work on device shutdown")
Fixes: 6a8788f256 ("bnxt_en: add support for software dynamic interrupt moderation")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3c603136c9 ]
We can't take the tx lock from the napi poll routine, because
netpoll can poll napi at any moment, including with the tx lock
already held.
The tx lock is protecting against two paths - the disable
path, and (as Michael points out) the NETDEV_TX_BUSY case
which may occur if NAPI completions race with start_xmit
and both decide to re-enable the queue.
For the disable/ifdown path use synchronize_net() to make sure
closing the device does not race we restarting the queues.
Annotate accesses to dev_state against data races.
For the NAPI cleanup vs start_xmit path - appropriate barriers
are already in place in the main spot where Tx queue is stopped
but we need to do the same careful dance in the TX_BUSY case.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45c709f8c7 ]
"access skb fields ok" verifier test fails on s390 with the "verifier
bug. zext_dst is set, but no reg is defined" message. The first insns
of the test prog are ...
0: 61 01 00 00 00 00 00 00 ldxw %r0,[%r1+0]
8: 35 00 00 01 00 00 00 00 jge %r0,0,1
10: 61 01 00 08 00 00 00 00 ldxw %r0,[%r1+8]
... and the 3rd one is dead (this does not look intentional to me, but
this is a separate topic).
sanitize_dead_code() converts dead insns into "ja -1", but keeps
zext_dst. When opt_subreg_zext_lo32_rnd_hi32() tries to parse such
an insn, it sees this discrepancy and bails. This problem can be seen
only with JITs whose bpf_jit_needs_zext() returns true.
Fix by clearning dead insns' zext_dst.
The commits that contributed to this problem are:
1. 5aa5bd14c5 ("bpf: add initial suite for selftests"), which
introduced the test with the dead code.
2. 5327ed3d44 ("bpf: verifier: mark verified-insn with
sub-register zext flag"), which introduced the zext_dst flag.
3. 83a2881903 ("bpf: Account for BPF_FETCH in
insn_has_def32()"), which introduced the sanity check.
4. 9183671af6 ("bpf: Fix leakage under speculation on
mispredicted branches"), which bisect points to.
It's best to fix this on stable branches that contain the second one,
since that's the point where the inconsistency was introduced.
Fixes: 5327ed3d44 ("bpf: verifier: mark verified-insn with sub-register zext flag")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210812151811.184086-2-iii@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 71dcadba34 ]
The orginal setting is not correct, fix it to follow hardware data sheet.
If keep this error setting, mt8173/mt8183 display ok
but mt8192 display abnormal.
Fixes: 0664d1392c ("drm/mediatek: Add AAL engine basic function")
Signed-off-by: Yongqiang Niu <yongqiang.niu@mediatek.com>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08dbd56602 ]
The current code treats an empty iotlb provdied in set_map() as a
special case and destroy the memory region object. This must not be done
since the virtqueue objects reference this MR. Doing so will cause the
driver unload to emit errors and log timeouts caused by the firmware
complaining on busy resources.
This patch treats an empty iotlb as any other change of mapping. In this
case, mlx5_vdpa_create_mr() will fail and the entire set_map() call to
fail.
This issue has not been encountered before but was seen to occur in a
non-official version of qemu. Since qemu is a userspace program, the
driver must protect against such case.
Fixes: 94abbccdf2 ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210811053713.66658-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 06a089ef64 ]
We have changed the return type for sysc_check_active_timer() from -EBUSY
to -ENXIO, but the gpt12 system timer fix still checks for -EBUSY. We are
also not returning on other errors like we did earlier as noted by
Pavel Machek <pavel@denx.de>.
Commit 3ff340e24c ("bus: ti-sysc: Fix gpt12 system timer issue with
reserved status") should have been updated for commit 65fb736761
("bus: ti-sysc: suppress err msg for timers used as clockevent/source").
Let's fix the issue by checking for -ENXIO and returning on any other
errors as suggested by Pavel Machek <pavel@denx.de>.
Fixes: 3ff340e24c ("bus: ti-sysc: Fix gpt12 system timer issue with reserved status")
Depends-on: 65fb736761 ("bus: ti-sysc: suppress err msg for timers used as clockevent/source")
Reported-by: Pavel Machek <pavel@denx.de>
Reviewed-by: Pavel Machek (CIP) <pavel@denx.de>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86aab09a48 ]
GCC complains about empty macros in an 'if' statement, so convert
them to 'do {} while (0)' macros.
Fixes these build warnings:
net/dccp/output.c: In function 'dccp_xmit_packet':
../net/dccp/output.c:283:71: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
283 | dccp_pr_debug("transmit_skb() returned err=%d\n", err);
net/dccp/ackvec.c: In function 'dccp_ackvec_update_old':
../net/dccp/ackvec.c:163:80: warning: suggest braces around empty body in an 'else' statement [-Wempty-body]
163 | (unsigned long long)seqno, state);
Fixes: dc841e30ea ("dccp: Extend CCID packet dequeueing interface")
Fixes: 3802408644 ("dccp ccid-2: Update code for the Ack Vector input/registration routine")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: dccp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 484f2b7c61 ]
The 1.2 GHz variant of the Armada 3720 SOC is unstable with DVFS: when
the SOC boots, the WTMI firmware sets clocks and AVS values that work
correctly with 1.2 GHz CPU frequency, but random crashes occur once
cpufreq driver starts scaling.
We do not know currently what is the reason:
- it may be that the voltage value for L0 for 1.2 GHz variant provided
by the vendor in the OTP is simply incorrect when scaling is used,
- it may be that some delay is needed somewhere,
- it may be something else.
The most sane solution now seems to be to simply forbid the cpufreq
driver on 1.2 GHz variant.
Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: 92ce45fb87 ("cpufreq: Add DVFS support for Armada 37xx")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5aa95d8834 ]
If probe_device is failing, iommu_group is not initialized because
iommu_group_add_device is not reached, so freeing it will result
in NULL pointer access.
iommu_bus_init
->bus_iommu_probe
->probe_iommu_group in for each:/* return -22 in fail case */
->iommu_probe_device
->__iommu_probe_device /* return -22 here.*/
-> ops->probe_device /* return -22 here.*/
-> iommu_group_get_for_dev
-> ops->device_group
-> iommu_group_add_device //good case
->remove_iommu_group //in fail case, it will remove group
->iommu_release_device
->iommu_group_remove_device // here we don't have group
In my case ops->probe_device (mtk_iommu_probe_device from
mtk_iommu_v1.c) is due to failing fwspec->ops mismatch.
Fixes: d72e31c937 ("iommu: IOMMU Groups")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Link: https://lore.kernel.org/r/20210731074737.4573-1-linux@fw-web.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cca342d98b ]
A different wait queue was used when removing ctrl_wait than when adding
it. This effectively made the remove operation without locking compared
to other operations on the wait queue ctrl_wait was part of. This caused
issues like below where dead000000000100 is LIST_POISON1 and
dead000000000200 is LIST_POISON2.
list_add corruption. next->prev should be prev (ffffffc1b0a33a08), \
but was dead000000000200. (next=ffffffc03ac77de0).
------------[ cut here ]------------
CPU: 3 PID: 2138 Comm: bluetoothd Tainted: G O 4.4.238+ #9
...
---[ end trace 0adc2158f0646eac ]---
Call trace:
[<ffffffc000443f78>] __list_add+0x38/0xb0
[<ffffffc0000f0d04>] add_wait_queue+0x4c/0x68
[<ffffffc00020eecc>] __pollwait+0xec/0x100
[<ffffffc000d1556c>] bt_sock_poll+0x74/0x200
[<ffffffc000bdb8a8>] sock_poll+0x110/0x128
[<ffffffc000210378>] do_sys_poll+0x220/0x480
[<ffffffc0002106f0>] SyS_poll+0x80/0x138
[<ffffffc00008510c>] __sys_trace_return+0x0/0x4
Unable to handle kernel paging request at virtual address dead000000000100
...
CPU: 4 PID: 5387 Comm: kworker/u15:3 Tainted: G W O 4.4.238+ #9
...
Call trace:
[<ffffffc0000f079c>] __wake_up_common+0x7c/0xa8
[<ffffffc0000f0818>] __wake_up+0x50/0x70
[<ffffffc000be11b0>] sock_def_wakeup+0x58/0x60
[<ffffffc000de5e10>] l2cap_sock_teardown_cb+0x200/0x224
[<ffffffc000d3f2ac>] l2cap_chan_del+0xa4/0x298
[<ffffffc000d45ea0>] l2cap_conn_del+0x118/0x198
[<ffffffc000d45f8c>] l2cap_disconn_cfm+0x6c/0x78
[<ffffffc000d29934>] hci_event_packet+0x564/0x2e30
[<ffffffc000d19b0c>] hci_rx_work+0x10c/0x360
[<ffffffc0000c2218>] process_one_work+0x268/0x460
[<ffffffc0000c2678>] worker_thread+0x268/0x480
[<ffffffc0000c94e0>] kthread+0x118/0x128
[<ffffffc000085070>] ret_from_fork+0x10/0x20
---[ end trace 0adc2158f0646ead ]---
Signed-off-by: Ole Bjørn Midtbø <omidtbo@cisco.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c4152b297d ]
[Why]
HPD disable and enable sequences are not mutually exclusive
on Linux. For HPDs that spans over 1s (i.e. HPD low = 1s),
part of the disable sequence (specifically, a request to SMU
to lower refclk) could come right before the call to PHY
enable, causing DMUB to access an unresponsive PHY
and thus a hard hang on the system.
[How]
Disable 48mhz refclk off on native DP.
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 06050a0f01 ]
Why:
In DCN2x, HW doesn't automatically divide MASTER_UPDATE_LOCK_DB_X
by the number of pipes ODM Combined.
How:
Set MASTER_UPDATE_LOCK_DB_X to the value that is adjusted by the
number of pipes ODM Combined.
Reviewed-by: Martin Leung <martin.leung@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Bing Guo <bing.guo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b67d4d63e ]
Currently phy_device state could be left in inconsistent state shown
by following alert message[1]. This is because phy_read_status could
be called concurrently from lan78xx_delayedwork, phy_state_machine and
__ethtool_get_link. Fix this by making sure that phy_device state is
updated atomically.
[1] lan78xx 1-1.1.1:1.0 eth0: No phy led trigger registered for speed(-1)
Signed-off-by: Ivan T. Ivanov <iivanov@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a160e2e9a ]
Certain call sites of get_geristers() did not do proper error handling. This
could be a problem as get_geristers() typically return the data via pointer to a
buffer. If an error occurred the code is carelessly manipulating the wrong data.
Signed-off-by: Petko Manolov <petkan@nucleusys.com>
Reviewed-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1159e25c13 ]
A crash has been observed if rmmod is done while automatic debug
collection in progress. It is due to a race condition between
both of them.
To fix stop the sp_task during unload to avoid running qede_sp_task
even if they are schedule during removal process.
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f0f82e2476 ]
After adding physical volumes to a volume group through vgextend, the
kernel will rescan the partitions. This in turn will cause the device
capacity to be queried.
If the device status is set to offline through sysfs at this time, READ
CAPACITY command will return a result which the host byte is
DID_NO_CONNECT, and the capacity of the device will be set to zero in
read_capacity_error(). After setting device status back to running, the
capacity of the device will remain stuck at zero.
Fix this issue by rescanning device when the device state changes to
SDEV_RUNNING.
Link: https://lore.kernel.org/r/20210727034455.1494960-1-lijinlin3@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70edd2e6f6 ]
Avoid printing a 'target allocation failed' error if the driver
target_alloc() callback function returns -ENXIO. This return value
indicates that the corresponding H:C:T:L entry is empty.
Removing this error reduces the scan time if the user issues SCAN_WILD_CARD
scan operation through sysfs parameter on a host with a lot of empty
H:C:T:L entries.
Avoiding the printk on -ENXIO matches the behavior of the other callback
functions during scanning.
Link: https://lore.kernel.org/r/20210726115402.1936-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>