The dying bit removes the conntrack entry if the netdev that owns this
flow is going down. Instead, use the teardown mechanism to push back the
flow to conntrack to let the classic software path decide what to do
with it.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add helper function to allocate and initialize flow offload work and use
it to consolidate existing code.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Set on FLOW_DISSECTOR_KEY_META meta key using flow tuple ingress interface.
Fixes: c29f74e0df ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Do not fetch statistics if flow has expired since it might not in
hardware anymore. After this update, remove the FLOW_OFFLOAD_HW_DYING
check from nf_flow_offload_stats() since this flag is never set on.
Fixes: c29f74e0df ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: wenxu <wenxu@ucloud.cn>
The comment documenting how bitwise expressions work includes a table
which summarizes the mask and xor arguments combined to express the
supported boolean operations. However, the row for OR:
mask xor
0 x
is incorrect.
dreg = (sreg & 0) ^ x
is not equivalent to:
dreg = sreg | x
What the code actually does is:
dreg = (sreg & ~x) ^ x
Update the documentation to match.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add code to check if memory intended for RDMA is FS-DAX-memory. RDS
will fail with error code EOPNOTSUPP if FS-DAX-memory is detected.
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Till recently it was not possible for userspace to specify a different
IOVA, but with the new ibv_reg_mr_iova() library call this can be done.
To compute the user_va we must compute:
user_va = (iova - iova_start) + user_va_start
while being cautious of overflow and other math problems.
The iova is not reliably stored in the mmkey when the MR is created. Only
the cached creation path (the common one) set it, so it must also be set
when creating uncached MRs.
Fix the weird use of iova when computing the starting page index in the
MR. In the normal case, when iova == umem.address:
iova & (~(BIT(page_shift) - 1)) ==
ALIGN_DOWN(umem.address, odp->page_size) ==
ib_umem_start(odp)
And when iova is different using it in math with a user_va is wrong.
Finally, do not allow an implicit ODP to be created with a non-zero IOVA
as we have no support for that.
Fixes: 7bdf65d411 ("IB/mlx5: Handle page faults")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
The ODP handler for WQEs in RQ or SRQ is not implented for kernel QPs.
Therefore don't report support in these if query comes from a kernel user.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
One of the steps in ODP page fault handler for WQEs is to read a WQE
from a QP send queue or receive queue buffer at a specific index.
Since the implementation of this buffer is different between kernel and
user QP the implementation of the handler needs to be aware of that and
handle it in a different way.
ODP for kernel MRs is currently supported only for RDMA_READ
and RDMA_WRITE operations so change the handler to
- read a WQE from a kernel QP send queue
- fail if access to receive queue or shared receive queue is
required for a kernel QP
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Allow ULPs to call advise_mr, so they can control ODP regions
in the same way as user space applications.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Add ib_reg_user_mr() for kernel ULPs to register user MRs.
The common use case that uses this function is a userspace application
that allocates memory for HCA access but the responsibility to register
the memory at the HCA is on an kernel ULP. This ULP that acts as an agent
for the userspace application.
This function is intended to be used without a user context so vendor
drivers need to be aware of calling reg_user_mr() device operation with
udata equal to NULL.
Among all drivers, i40iw is the only driver which relies on presence
of udata, so check udata existence for that driver.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
So far the assumption was that ib_umem_get() and ib_umem_odp_get()
are called from flows that start in UVERBS and therefore has a user
context. This assumption restricts flows that are initiated by ULPs
and need the service that ib_umem_get() provides.
This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device
directly by relying on the fact that both UVERBS and ULPs sets that
field correctly.
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Commit 8303b7e8f0 ("netfilter: nat: fix spurious connection timeouts")
made nf_nat_icmp_reply_translation() use icmp_manip_pkt() as the l4
manipulation function for the outer packet on ICMP errors.
However, icmp_manip_pkt() assumes the packet has an 'id' field which
is not correct for all types of ICMP messages.
This is not correct for ICMP error packets, and leads to bogus bytes
being written the ICMP header, which can be wrongfully regarded as
'length' bytes by RFC 4884 compliant receivers.
Fix by assigning the 'id' field only for ICMP messages that have this
semantic.
Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Fixes: 8303b7e8f0 ("netfilter: nat: fix spurious connection timeouts")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
commit 9cf57731b6 ("watchdog/softlockup: Replace "watchdog/%u" threads
with cpu_stop_work") ensures that the watchdog is reliably touched during
a task switch.
As a result the check for an unnoticed task switch is not longer needed.
Remove the relevant code, which effectively reverts commit b1a8de1f53
("softlockup: make detector be aware of task switch of processes hogging
cpu")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Ziljstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20191024114928.15377-2-pmladek@suse.com
The only callers of the function lapbeth_get_x25_dev()
are lapbeth_rcv() and lapbeth_device_event().
lapbeth_rcv() uses rcu_read_lock() whereas lapbeth_device_event()
is called with RTNL held (As mentioned in the comments).
Therefore, pass lockdep_rtnl_is_held() as cond argument in
list_for_each_entry_rcu();
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot reported following crash:
list_del corruption, ffff88808c9bb000->prev is LIST_POISON2 (dead000000000122)
[..]
Call Trace:
__list_del_entry include/linux/list.h:131 [inline]
list_del_rcu include/linux/rculist.h:148 [inline]
nf_tables_commit+0x1068/0x3b30 net/netfilter/nf_tables_api.c:7183
[..]
The commit transaction list has:
NFT_MSG_NEWTABLE
NFT_MSG_NEWFLOWTABLE
NFT_MSG_DELFLOWTABLE
NFT_MSG_DELTABLE
A missing generation check during DELTABLE processing causes it to queue
the DELFLOWTABLE operation a second time, so we corrupt the list here:
case NFT_MSG_DELFLOWTABLE:
list_del_rcu(&nft_trans_flowtable(trans)->list);
nf_tables_flowtable_notify(&trans->ctx,
because we have two different DELFLOWTABLE transactions for the same
flowtable. We then call list_del_rcu() twice for the same flowtable->list.
The object handling seems to suffer from the same bug so add a generation
check too and only queue delete transactions for flowtables/objects that
are still active in the next generation.
Reported-by: syzbot+37a6804945a3a13b1572@syzkaller.appspotmail.com
Fixes: 3b49e2e94e ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This WARN can trigger because some of the names fed to the module
autoload function can be of arbitrary length.
Remove the WARN and add limits for all NLA_STRING attributes.
Reported-by: syzbot+0e63ae76d117ae1c3a01@syzkaller.appspotmail.com
Fixes: 452238e8d5 ("netfilter: nf_tables: add and use helper for module autoload")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
trace_printk() is used to debug the kernel which includes the tracing
infrastructure. But because it writes to the ring buffer, and so does much
of the tracing infrastructure, the ring buffer's recursive detection will
drop writes to the ring buffer that is in the same context as the current
write is happening (it allows interrupts to write when normal context is
writing, but wont let normal context write while normal context is writing).
This can cause confusion and think that the code is where the trace_printk()
exists is not hit. To solve this, up the recursive nesting of the ring
buffer when trace_printk() is called before it writes to the buffer itself.
Note, this does make it dangerous to use trace_printk() in the ring buffer
code itself, because this basically disables the recursion protection of
trace_printk() buffer writes. But as trace_printk() is only used for
debugging, and if this does occur, the developer will see the cause real
quick (recursive blowing up of the stack). Thus the developer can deal with
that. But having trace_printk() silently ignored is a much bigger problem,
and disabling recursive protection is a small price to pay to fix it.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
ASoC: Fixes for v5.5
This is mostly driver specific fixes, plus an error handling fix
in the core. There is a rather large diffstat for the stm32 SAI
driver, this is a very large but mostly mechanical update which
wraps every register access in the driver to allow a fix to the
locking which avoids circular locks, the active change is much
smaller and more reasonably sized.
DSA subsystem takes care of netdev statistics since commit 4ed70ce9f0
("net: dsa: Refactor transmit path to eliminate duplication"), so
any accounting inside tagger callbacks is redundant and can lead to
messing up the stats.
This bug is present in Qualcomm tagger since day 0.
Fixes: cafdc45c94 ("net-next: dsa: add Qualcomm tag RX/TX handler")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The correct name is GSWIP (Gigabit Switch IP). Typo was introduced in
875138f81d ("dsa: Move tagger name into its ops structure") while
moving tagger names to their structures.
Fixes: 875138f81d ("dsa: Move tagger name into its ops structure")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can extend user ASID space if it turns out that system does not
require KPTI. We start with kernel ASIDs reserved because CPU caps are
not finalized yet and free them up lazily on the next rollover if we
confirm than KPTI is not in use.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When building with PROVE_LOCKING=y, lockdep shows the following
dump message.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
...
Calling device_set_wakeup_enable() directly occurs this issue,
and it isn't necessary for initialization, so this patch creates
internal function __ave_ethtool_set_wol() and replaces with this
in ave_init() and ave_resume().
Fixes: 7200f2e3c9 ("net: ethernet: ave: Set initial wol state to disabled")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware can not handle short frames below or equal to 32
bytes according to the hardware user manual, and it will trigger
a RAS error when the frame's length is below 33 bytes.
This patch pads the SKB when skb->len is below 33 bytes before
sending it to hardware.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I missed the fact that macvlan_broadcast() can be used both
in RX and TX.
skb_eth_hdr() makes only sense in TX paths, so we can not
use it blindly in macvlan_broadcast()
Fixes: 96cc4b6958 ("macvlan: do not assume mac_header is set in macvlan_broadcast()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jurgen Van Ham <juvanham@gmail.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As wsa881x schema mentions #sound-dai-cells as required property,
so update soundwire-controller.yaml example so that dt_bindings_check
does not fail as below:
Documentation/devicetree/bindings/soundwire/soundwire-controller.example.dt.yaml:
speaker@0,1: '#sound-dai-cells' is a required property
Documentation/devicetree/bindings/soundwire/soundwire-controller.example.dt.yaml:
speaker@0,2: '#sound-dai-cells' is a required property
Reported-by: Rob Herring <robh@kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200114094806.15846-1-srinivas.kandagatla@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Turns out that Tiger Lake GPIO will be enumerated using _HID method where
there is only a single ACPI device and multiple BARs so rework the driver
to support that scheme instead.
Fixes: c9ccf71fc8 ("pinctrl: intel: Add Intel Tiger Lake pin controller support")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Coffee Lake-S PCH has the same GPIO hardware than Sunrisepoint-H
PCH but the ACPI ID is different. Add this new ACPI ID to the list of
supported devices.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This patch is to fix clock setting code for different controller
versions. Two of HW changes after vendor version 2.2 are removing
PEREN/HCKEN/IPGEN bits in system control register, and adding SD
clock stable bit in present state register. This patch cleans up
related code too.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Link: https://lore.kernel.org/r/20200108040713.38888-2-yangbo.lu@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
There is an official update for eSDHC tuning erratum A-008171.
This patch is to implement the changes,
- Affect all revisions of SoC.
- Changes for tuning window checking.
- Hardware hits a new condition that tuning succeeds although
the eSDHC might not have tuned properly for type2 SoCs
(soc_tuning_erratum_type2[] array in driver). So check
tuning window after tuning succeeds.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20191212075219.48625-2-yangbo.lu@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>