According to CAAM RM:
-crypto engine reads 4 S/G entries (64 bytes) at a time,
even if the S/G table has fewer entries
-it's the responsibility of the user / programmer to make sure
this HW behaviour has no side effect
The drivers do not take care of this currently, leading to IOMMU faults
when the S/G table ends close to a page boundary - since only one page
is DMA mapped, while CAAM's DMA engine accesses two pages.
Fix this by rounding up the number of allocated S/G table entries
to a multiple of 4.
Note that in case of two *contiguous* S/G tables, only the last table
might needs extra entries.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When enabling IOMMU support, the following issue becomes visible
in the AEAD zero-length case.
Even though the output sequence length is set to zero, the crypto engine
tries to prefetch 4 S/G table entries (since SGF bit is set
in SEQ OUT PTR command - which is either generated in SW in case of
caam/jr or in HW in case of caam/qi, caam/qi2).
The DMA read operation will trigger an IOMMU fault since the address in
the SEQ OUT PTR is "dummy" (set to zero / not obtained via DMA API
mapping).
1. In case of caam/jr, avoid the IOMMU fault by clearing the SGF bit
in SEQ OUT PTR command.
2. In case of caam/qi - setting address, bpid, length to zero for output
entry in the compound frame has a special meaning (cf. CAAM RM):
"Output frame = Unspecified, Input address = Y. A unspecified frame is
indicated by an unused SGT entry (an entry in which the Address, Length,
and BPID fields are all zero). SEC obtains output buffers from BMan as
prescribed by the preheader."
Since no output buffers are needed, modify the preheader by setting
(ABS = 1, ADDBUF = 0):
-"ABS = 1 means obtain the number of buffers in ADDBUF (0 or 1) from
the pool POOL ID"
-ADDBUF: "If ABS is set, ADD BUF specifies whether to allocate
a buffer or not"
3. In case of caam/qi2, since engine:
-does not support FLE[FMT]=2'b11 ("unused" entry) mentioned in DPAA2 RM
-requires output entry to be present, even if not used
the solution chosen is to leave output frame list entry zeroized.
Fixes: 763069ba49 ("crypto: caam - handle zero-length AEAD output")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If an invalid key is provided as input to the setkey function, the
function always failed returning -ENOMEM rather than -EINVAL.
Furthermore, if setkey was called multiple times with an invalid key,
the device instance was getting leaked.
This patch fixes the error paths in the setkey functions by returning
the correct error code in case of error and freeing all the resources
allocated in this function in case of failure.
This problem was found with by the new extra run-time crypto self test.
Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The block size for aes counter mode was improperly set to AES_BLOCK_SIZE.
This sets it to 1 as it is a stream cipher.
This problem was found with by the new extra run-time crypto self test.
Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Allocate a contiguous buffer and instruct the qat hardware to return the
iv at the end of an encryption or decryption operation.
The iv is copied to the array provided by the user in the callback
function.
This problem was found with by the crypto self test.
Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ip_sf_list_clear_all() needs to be defined even if !CONFIG_IP_MULTICAST
Fixes: 3580d04aa6 ("ipv4/igmp: fix another memory leak in igmpv3_del_delrec()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")
changed kexec_add_buffer() to skip searching for a memory location if
kexec_buf.mem is already set, and use the address that is there.
In powerpc code we reuse a kexec_buf variable for loading both the
kernel and the initramfs by resetting some of the fields between those
uses, but not mem. This causes kexec_add_buffer() to try to load the
kernel at the same address where initramfs will be loaded, which is
naturally rejected:
# kexec -s -l --initrd initramfs vmlinuz
kexec_file_load failed: Invalid argument
Setting the mem field before every call to kexec_add_buffer() fixes
this regression.
Fixes: b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 9cc342f6c4 ("treewide: prefix header search paths with
$(srctree)/") caused a build error for MIPS VDSO.
CC arch/mips/vdso/gettimeofday.o
In file included from ../arch/mips/vdso/vdso.h:26,
from ../arch/mips/vdso/gettimeofday.c:11:
../arch/mips/include/asm/page.h:12:10: fatal error: spaces.h: No such file or directory
#include <spaces.h>
^~~~~~~~~~
The cause of the error is a missing space after the compiler flag -I .
Kbuild used to have a global restriction "no space after -I", but
commit 48f6e3cf5b ("kbuild: do not drop -I without parameter") got
rid of it. Having a space after -I is no longer a big deal as far as
Kbuild is concerned.
It is still a big deal for MIPS because arch/mips/vdso/Makefile
filters the header search paths, like this:
ccflags-vdso := \
$(filter -I%,$(KBUILD_CFLAGS)) \
..., which relies on the assumption that there is no space after -I .
Fixes: 9cc342f6c4 ("treewide: prefix header search paths with $(srctree)/")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
It's easy to have a mismatch of "intended to be public" vs really
exposed API functions. While Makefile does check for this mismatch, if
it actually occurs it's not trivial to determine which functions are
accidentally exposed. This patch dumps out a diff showing what's not
supposed to be exposed facilitating easier fixing.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The DCDC2 regulator output is actually called "VDD_EE" in various
Meson8b board schematics. This matches with what Amlogic names it in the
most part of their vendor kernel (there are a few places where it's
actually called VDDAO, schematics of EC-100 suggest that the regulator
output is used for both signals).
While here, also give the regulator an alias as it supplies the Mali GPU
so a phandle to it will be required later on.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
The canvas IP on Meson8, Meson8b and Meson8m2 is mostly identical to the
one on GXBB and newer. The only known difference so far is that that the
"endianness" bits are not supported on Meson8m2 and earlier.
Add new compatible strings and a check in meson_canvas_config() to
validate that the endianness bits cannot be configured on the 32-bit
SoCs.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
The canvas IP on Meson8, Meson8b and Meson8m2 is similar to the one
found on GXBB and newer. The only known difference is that the older
SoCs cannot configure the "endianness".
Add a compatible string for each of the older SoCs to make sure we won't
be using unsupported features on these SoCs.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add the canvas module to Meson8b because it's required for the VPU
(video output) and video decoders.
The canvas module is located inside the "DMC bus" (where also some of
the memory controller registers are located). The "DMC bus" itself is
part of the so-called "MMC bus".
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
With the Meson8m2 SoC the canvas module was moved from offset 0x20
(Meson8) to offset 0x48 (same as on Meson8b). The offsets inside the
canvas module are identical.
Correct the offset so the driver uses the correct registers.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add the canvas module to Meson8 because it's required for the VPU
(video output) and video decoders.
The canvas module is located inside thie "DMC bus" (where also some of
the memory controller registers are located). The "DMC bus" itself is
part of the so-called "MMC bus".
Amlogic's vendor kernel has an explicit #define for the "DMC" register
range on Meson8m2 while there's no such #define for Meson8. However, the
canvas and memory controller registers on Meson8 are all expressed as
"0x6000 + actual offset", while Meson8m2 uses "DMC + actual offset".
Thus it's safe to assume that the DMC bus exists on both SoCs even
though the registers inside are slightly different.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Michael Chan says:
===================
bnxt_en: Bug fixes.
There are 4 driver fixes in this series:
1. Fix RX buffer leak during OOM condition.
2. Call pci_disable_msix() under correct conditions to prevent hitting BUG.
3. Reduce unneeded mmeory allocation in kdump kernel to prevent OOM.
4. Don't read device serial number on VFs because it is not supported.
Please queue #1, #2, #3 for -stable as well. Thanks.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>
Skip RDMA context memory allocations, reduce to 1 ring, and disable
TPA when running in the kdump kernel. Without this patch, the driver
fails to initialize with memory allocation errors when running in a
typical kdump kernel.
Fixes: cf6daed098 ("bnxt_en: Increase context memory allocations on 57500 chips for RDMA.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When making configuration changes, the driver calls bnxt_close_nic()
and then bnxt_open_nic() for the changes to take effect. A parameter
irq_re_init is passed to the call sequence to indicate if IRQ
should be re-initialized. This irq_re_init parameter needs to
be included in the bnxt_reserve_rings() call. bnxt_reserve_rings()
can only call pci_disable_msix() if the irq_re_init parameter is
true, otherwise it may hit BUG() because some IRQs may not have been
freed yet.
Fixes: 41e8d79837 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For every RX packet, the driver replenishes all buffers used for that
packet and puts them back into the RX ring and RX aggregation ring.
In one code path where the RX packet has one RX buffer and one or more
aggregation buffers, we missed recycling the aggregation buffer(s) if
we are unable to allocate a new SKB buffer. This leads to the
aggregation ring slowly running out of buffers over time. Fix it
by properly recycling the aggregation buffers.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Rakesh Hemnani <rhemnani@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the hv_sock send() iterates once over the buffer, puts data into
the VMBUS channel and returns. It doesn't maximize on the case when there
is a simultaneous reader draining data from the channel. In such a case,
the send() can maximize the bandwidth (and consequently minimize the cpu
cycles) by iterating until the channel is found to be full.
Perf data:
Total Data Transfer: 10GB/iteration
Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket
reader
Packet size: 64KB
CPU sys time was captured using the 'time' command for the writer to send
10GB of data.
'Send Buffer Loop' is with the patch applied.
The values below are over 10 iterations.
|--------------------------------------------------------|
| | Current | Send Buffer Loop |
|--------------------------------------------------------|
| | Throughput | CPU sys | Throughput | CPU sys |
| | (MB/s) | time (s) | (MB/s) | time (s) |
|--------------------------------------------------------|
| Min | 407 | 7.048 | 401 | 5.958 |
|--------------------------------------------------------|
| Max | 455 | 7.563 | 542 | 6.993 |
|--------------------------------------------------------|
| Avg | 440 | 7.411 | 451 | 6.639 |
|--------------------------------------------------------|
| Median | 446 | 7.417 | 447 | 6.761 |
|--------------------------------------------------------|
Observation:
1. The avg throughput doesn't really change much with this change for this
scenario. This is most probably because the bottleneck on throughput is
somewhere else.
2. The average system (or kernel) cpu time goes down by 10%+ with this
change, for the same amount of data transfer.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the hv_sock buffer size is static and can't scale to the
bandwidth requirements of the application. This change allows the
applications to influence the socket buffer sizes using the SO_SNDBUF and
the SO_RCVBUF socket options.
Few interesting points to note:
1. Since the VMBUS does not allow a resize operation of the ring size, the
socket buffer size option should be set prior to establishing the
connection for it to take effect.
2. Setting the socket option comes with the cost of that much memory being
reserved/allocated by the kernel, for the lifetime of the connection.
Perf data:
Total Data Transfer: 1GB
Single threaded reader/writer
Results below are summarized over 10 iterations.
Linux hvsocket writer + Windows hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size -> | 128B | 1KB | 4KB | 64KB |
|---------------------------------------------------------------------------------------------|
|SO_SNDBUF size | | Throughput in MB/s (min/max/avg/median): |
| v | |
|---------------------------------------------------------------------------------------------|
| Default | 109/118/114/116 | 636/774/701/700 | 435/507/480/476 | 410/491/462/470 |
| 16KB | 110/116/112/111 | 575/705/662/671 | 749/900/854/869 | 592/824/692/676 |
| 32KB | 108/120/115/115 | 703/823/767/772 | 718/878/850/866 | 1593/2124/2000/2085 |
| 64KB | 108/119/114/114 | 592/732/683/688 | 805/934/903/911 | 1784/1943/1862/1843 |
|---------------------------------------------------------------------------------------------|
Windows hvsocket writer + Linux hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size -> | 128B | 1KB | 4KB | 64KB |
|---------------------------------------------------------------------------------------------|
|SO_RCVBUF size | | Throughput in MB/s (min/max/avg/median): |
| v | |
|---------------------------------------------------------------------------------------------|
| Default | 69/82/75/73 | 313/343/333/336 | 418/477/446/445 | 659/701/676/678 |
| 16KB | 69/83/76/77 | 350/401/375/382 | 506/548/517/516 | 602/624/615/615 |
| 32KB | 62/83/73/73 | 471/529/496/494 | 830/1046/935/939 | 944/1180/1070/1100 |
| 64KB | 64/70/68/69 | 467/533/501/497 | 1260/1590/1430/1431 | 1605/1819/1670/1660 |
|---------------------------------------------------------------------------------------------|
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 redirect is broken for VRF. __ip6_route_redirect walks the FIB
entries looking for an exact match on ifindex. With VRF the flowi6_oif
is updated by l3mdev_update_flow to the l3mdev index and the
FLOWI_FLAG_SKIP_NH_OIF set in the flags to tell the lookup to skip the
device match. For redirects the device match is requires so use that
flag to know when the oif needs to be reset to the skb device index.
Fixes: ca254490c8 ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removing two 4 bytes holes allows to use kmalloc-32
kmem cache instead of kmalloc-64 on 64bit kernels.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tracepoint to __neigh_create to enable debugging of new entries.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The point of the pause-on-fail argument is to leave the setup as is after
a test fails to allow a user to debug why it failed. Move the cleanup
after posting the result to the user to make it so.
Random names for the namespaces are not user friendly when trying to
debug a failure. Make them simpler and more direct for the tests. Run
cleanup at the beginning to ensure they are cleaned up if they already
exist.
Remove cleanup_done. There is no harm in doing cleanup twice; just
ignore any errors related to not existing - which is already done.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add VERBOSE argument to fib-onlink-tests.sh and make output quiet by
default. Add getopt parsing of inputs and support for -v (verbose) and
-p (pause on fail).
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New userspace on an older kernel can send unknown and unsupported
attributes resulting in an incompelete config which is almost
always wrong for routing (few exceptions are passthrough settings
like the protocol that installed the route).
Set strict_start_type in the policies for IPv4 and IPv6 routes and
rules to detect new, unsupported attributes and fail the route add.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern says:
====================
net: Export functions for nexthop code
This set exports ipv4 and ipv6 fib functions for use by the nexthop
code. It also adds new ones to send route notifications if a nexthop
configuration changes.
v2
- repost of patches dropped at the end of the last dev window
added patch 8 which exports nh_update_mtu since it is inline with
the other patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename nh_update_mtu to fib_nhc_update_mtu and export for use by the
nexthop code.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add scope as input argument versus relying on fib_info reference in
fib_nh, and export fib_info_update_nh_saddr.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As nexthops are deleted, fib entries referencing it are marked dead.
Export fib_flush so those entries can be removed in a timely manner.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change fib_check_nh to take net, table and scope as input arguments
over struct fib_config and export for use by nexthop code.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add fib_info_notify_update to walk the fib and send RTM_NEWROUTE
notifications with NLM_F_REPLACE set for entries linked to a fib_info
that have nh_updated flag set. This helper will be used by the nexthop
code to notify userspace of routes that are impacted when a nexthop
config is updated via replace. The new function and its helper are
similar to how fib_flush and fib_table_flush work for address delete
and link down events.
This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.
In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add fib6_rt_update to send RTM_NEWROUTE with NLM_F_REPLACE set. This
helper will be used by the nexthop code to notify userspace of routes
that are impacted when a nexthop config is updated via replace.
This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.
In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add hook to ipv6 stub to bump the sernum up to the root node for a
route. This is needed by the nexthop code when a nexthop config changes.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ip6_del_rt to the IPv6 stub. The hook is needed by the nexthop
code to remove entries linked to a nexthop that is getting deleted.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn says:
====================
net: phy: T1 support
T1 PHYs make use of a single twisted pair, rather than the traditional
2 pair for 100BaseT or 4 pair for 1000BaseT. This patchset adds link
modes for 100BaseT1 and 1000BaseT1, and them makes use of 100BaseT1 in
the list of PHY features used by current T1 drivers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that there is a link mode for 100BaseT1, use it in
phy_basic_t1_features so T1 PHY drivers will indicate this mode via
the Ethtool API.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>