Commit Graph

858756 Commits

Author SHA1 Message Date
Horia Geantă
a5e5c13398 crypto: caam - fix S/G table passing page boundary
According to CAAM RM:
-crypto engine reads 4 S/G entries (64 bytes) at a time,
even if the S/G table has fewer entries
-it's the responsibility of the user / programmer to make sure
this HW behaviour has no side effect

The drivers do not take care of this currently, leading to IOMMU faults
when the S/G table ends close to a page boundary - since only one page
is DMA mapped, while CAAM's DMA engine accesses two pages.

Fix this by rounding up the number of allocated S/G table entries
to a multiple of 4.
Note that in case of two *contiguous* S/G tables, only the last table
might needs extra entries.

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:03 +08:00
Horia Geantă
dcd9c76e5a crypto: caam - avoid S/G table fetching for AEAD zero-length output
When enabling IOMMU support, the following issue becomes visible
in the AEAD zero-length case.

Even though the output sequence length is set to zero, the crypto engine
tries to prefetch 4 S/G table entries (since SGF bit is set
in SEQ OUT PTR command - which is either generated in SW in case of
caam/jr or in HW in case of caam/qi, caam/qi2).
The DMA read operation will trigger an IOMMU fault since the address in
the SEQ OUT PTR is "dummy" (set to zero / not obtained via DMA API
mapping).

1. In case of caam/jr, avoid the IOMMU fault by clearing the SGF bit
in SEQ OUT PTR command.

2. In case of caam/qi - setting address, bpid, length to zero for output
entry in the compound frame has a special meaning (cf. CAAM RM):
"Output frame = Unspecified, Input address = Y. A unspecified frame is
indicated by an unused SGT entry (an entry in which the Address, Length,
and BPID fields are all zero). SEC obtains output buffers from BMan as
prescribed by the preheader."

Since no output buffers are needed, modify the preheader by setting
(ABS = 1, ADDBUF = 0):
-"ABS = 1 means obtain the number of buffers in ADDBUF (0 or 1) from
the pool POOL ID"
-ADDBUF: "If ABS is set, ADD BUF specifies whether to allocate
a buffer or not"

3. In case of caam/qi2, since engine:
-does not support FLE[FMT]=2'b11 ("unused" entry) mentioned in DPAA2 RM
-requires output entry to be present, even if not used
the solution chosen is to leave output frame list entry zeroized.

Fixes: 763069ba49 ("crypto: caam - handle zero-length AEAD output")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:03 +08:00
Giovanni Cabiddu
a3af11399a crypto: qat - do not offload zero length requests
If a zero length request is submitted through the skcipher api,
do not offload it and return success.

Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:03 +08:00
Giovanni Cabiddu
96ee111a65 crypto: qat - return error for block ciphers for invalid requests
Return -EINVAL if a request for a block cipher is not multiple of the
size of the block.

This problem was found with by the new extra run-time crypto self test.

Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:03 +08:00
Giovanni Cabiddu
92fec16d1f crypto: qat - return proper error code in setkey
If an invalid key is provided as input to the setkey function, the
function always failed returning -ENOMEM rather than -EINVAL.
Furthermore, if setkey was called multiple times with an invalid key,
the device instance was getting leaked.

This patch fixes the error paths in the setkey functions by returning
the correct error code in case of error and freeing all the resources
allocated in this function in case of failure.

This problem was found with by the new extra run-time crypto self test.

Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:03 +08:00
Giovanni Cabiddu
51d33c2f05 crypto: qat - fix block size for aes ctr mode
The block size for aes counter mode was improperly set to AES_BLOCK_SIZE.
This sets it to 1 as it is a stream cipher.

This problem was found with by the new extra run-time crypto self test.

Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:02 +08:00
Giovanni Cabiddu
15b5e9112c crypto: qat - update iv after encryption or decryption operations
Allocate a contiguous buffer and instruct the qat hardware to return the
iv at the end of an encryption or decryption operation.
The iv is copied to the array provided by the user in the callback
function.

This problem was found with by the crypto self test.

Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:02 +08:00
Shant KumarX Sonnad
c044b62c36 crypto: qat - add check for negative offset in alg precompute function
The offset is calculated based on type of hash algorithum.
If the algorithum is invalid the offset can have negative value.
Hence added negative offset check and return -EFAULT.

Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Shant KumarX Sonnad <shant.kumarx.sonnad@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:02 +08:00
Xin Zeng
933224985a crypto: qat - remove spin_lock in qat_ablkcipher_setkey
Remove unnecessary spin lock in qat_ablkcipher_setkey.

Reviewed-by: Conor Mcloughlin <conor.mcloughlin@intel.com>
Tested-by: Sergey Portnoy <sergey.portnoy@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:02 +08:00
Eric Dumazet
903869bd10 ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST
ip_sf_list_clear_all() needs to be defined even if !CONFIG_IP_MULTICAST

Fixes: 3580d04aa6 ("ipv4/igmp: fix another memory leak in igmpv3_del_delrec()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 22:08:06 -07:00
Thiago Jung Bauermann
8b909e3548 powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load()
Commit b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")
changed kexec_add_buffer() to skip searching for a memory location if
kexec_buf.mem is already set, and use the address that is there.

In powerpc code we reuse a kexec_buf variable for loading both the
kernel and the initramfs by resetting some of the fields between those
uses, but not mem. This causes kexec_add_buffer() to try to load the
kernel at the same address where initramfs will be loaded, which is
naturally rejected:

  # kexec -s -l --initrd initramfs vmlinuz
  kexec_file_load failed: Invalid argument

Setting the mem field before every call to kexec_add_buffer() fixes
this regression.

Fixes: b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-23 14:00:32 +10:00
Dave Airlie
6b0538da5a Merge branch 'drm-fixes-5.2' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Fixes for 5.2:
- Fix for DMCU firmware issues for stable
- Add missing polaris10 pci id to kfd
- Screen corruption fix on picasso
- Fix for driver reload on vega10
- SR-IOV fixes
- Locking fix in new SMU code
- Compute profile switching fix for KFD

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522205425.3657-1-alexander.deucher@amd.com
2019-05-23 12:04:05 +10:00
Dave Airlie
eab007dd1b Merge tag 'drm-misc-fixes-2019-05-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- sun4i fixes to hdmi phy as well as u16 overflow in dsi (left from -next-fixes)
- gma500 fix to make lvds detection more reliable
- select devfreq for panfrost since it can't probe without it

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522194440.GA22359@art_vandelay
2019-05-23 12:03:41 +10:00
Dave Airlie
27e248c467 Merge branch 'vmwgfx-fixes-5.2' of git://people.freedesktop.org/~thomash/linux into drm-fixes
A set of misc fixes for various issues that have surfaced recently.
All Cc'd stable except the dma iterator fix which shouldn't really cause
any real issues on older kernels.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Thomas Hellstrom (VMware)" <thomas@shipmail.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522115408.33185-1-thomas@shipmail.org
2019-05-23 12:02:39 +10:00
Geert Uytterhoeven
c5eac1f532 MIPS: TXx9: Fix boot crash in free_initmem()
On rbtx4927:

    BUG: Bad page state in process swapper  pfn:00001
    page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
    flags: 0x0()
    raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
    page dumped because: nonzero mapcount
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper Not tainted 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756 #141
    Stack : 00000000 10008400 8040dc4c 87c1b974 8044af63 8040dc4c 00000001 804a3490
            00000001 81000000 0030f231 80148558 00000003 10008400 87c1dd80 3d0f9a2c
            00000000 00000000 804b0000 00000000 00000007 00000000 00000081 00000000
            62722d31 00000080 804b0000 39347874 00000000 804b7820 8040ce18 81000010
            00000001 00000007 00000001 81000000 00000018 8021de24 00000000 804a0000
            ...
    Call Trace:
    [<8010adec>] show_stack+0x74/0x104
    [<801a5e44>] bad_page+0x130/0x138
    [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
    [<801a789c>] free_unref_page+0x40/0x68
    [<801120f4>] free_init_pages+0xec/0x104
    [<803bdde8>] free_initmem+0x10/0x58
    [<803bdb8c>] kernel_init+0x20/0x100
    [<801057c8>] ret_from_kernel_thread+0x14/0x1c

As of commit b93ddc4f91 ("mips: Reserve memory for the kernel
image resources"), bootmem_init() no longer reserves the memory below
the kernel, while prom_free_prom_memory() still frees it.

Fix this by reverting commit b6263ff2d6 ("MIPS: TXx9: Implement
prom_free_prom_memory").

Suggested-by: Serge Semin <fancer.lancer@gmail.com>
Fixes: b93ddc4f91 ("mips: Reserve memory for the kernel image resources")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Maciej W . Rozycki <macro@linux-mips.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
2019-05-22 18:46:32 -07:00
Masahiro Yamada
3dd0aade59 MIPS: remove a space after -I to cope with header search paths for VDSO
Commit 9cc342f6c4 ("treewide: prefix header search paths with
$(srctree)/") caused a build error for MIPS VDSO.

  CC      arch/mips/vdso/gettimeofday.o
In file included from ../arch/mips/vdso/vdso.h:26,
                 from ../arch/mips/vdso/gettimeofday.c:11:
../arch/mips/include/asm/page.h:12:10: fatal error: spaces.h: No such file or directory
 #include <spaces.h>
          ^~~~~~~~~~

The cause of the error is a missing space after the compiler flag -I .

Kbuild used to have a global restriction "no space after -I", but
commit 48f6e3cf5b ("kbuild: do not drop -I without parameter") got
rid of it. Having a space after -I is no longer a big deal as far as
Kbuild is concerned.

It is still a big deal for MIPS because arch/mips/vdso/Makefile
filters the header search paths, like this:

  ccflags-vdso := \
          $(filter -I%,$(KBUILD_CFLAGS)) \

..., which relies on the assumption that there is no space after -I .

Fixes: 9cc342f6c4 ("treewide: prefix header search paths with $(srctree)/")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
2019-05-22 18:46:15 -07:00
Masahiro Yamada
6074c33c6b MIPS: mark ginvt() as __always_inline
To meet the 'i' (immediate) constraint for the asm operands,
this function must be always inlined.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-kernel@vger.kernel.org
2019-05-22 18:45:52 -07:00
Andrii Nakryiko
9efc779449 libbpf: emit diff of mismatched public API, if any
It's easy to have a mismatch of "intended to be public" vs really
exposed API functions. While Makefile does check for this mismatch, if
it actually occurs it's not trivial to determine which functions are
accidentally exposed. This patch dumps out a diff showing what's not
supposed to be exposed facilitating easier fixing.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-22 18:34:01 -07:00
Martin Blumenstingl
8ee9ee7423 ARM: dts: meson8m2: mxiii-plus: add the supply for the Mali GPU
The Mali GPU is supplied by VDD_EE which is provided by the DCDC2
regulator.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-05-22 18:29:31 -07:00
Martin Blumenstingl
0b67e66a5f ARM: dts: meson8m2: mxiii-plus: rename the DCDC2 regulator
The DCDC2 regulator output is actually called "VDD_EE" in various
Meson8b board schematics. This matches with what Amlogic names it in the
most part of their vendor kernel (there are a few places where it's
actually called VDDAO, schematics of EC-100 suggest that the regulator
output is used for both signals).
While here, also give the regulator an alias as it supplies the Mali GPU
so a phandle to it will be required later on.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-05-22 18:29:31 -07:00
Martin Blumenstingl
9a98fdf5b6 soc: amlogic: canvas: add support for Meson8, Meson8b and Meson8m2
The canvas IP on Meson8, Meson8b and Meson8m2 is mostly identical to the
one on GXBB and newer. The only known difference so far is that that the
"endianness" bits are not supported on Meson8m2 and earlier.

Add new compatible strings and a check in meson_canvas_config() to
validate that the endianness bits cannot be configured on the 32-bit
SoCs.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-05-22 18:19:14 -07:00
Martin Blumenstingl
a0b2ff5315 dt-bindings: soc: amlogic: canvas: document support for Meson8/8b/8m2
The canvas IP on Meson8, Meson8b and Meson8m2 is similar to the one
found on GXBB and newer. The only known difference is that the older
SoCs cannot configure the "endianness".

Add a compatible string for each of the older SoCs to make sure we won't
be using unsupported features on these SoCs.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-05-22 18:19:14 -07:00
Martin Blumenstingl
872f881e72 ARM: dts: meson8b: add the canvas module
Add the canvas module to Meson8b because it's required for the VPU
(video output) and video decoders.

The canvas module is located inside the "DMC bus" (where also some of
the memory controller registers are located). The "DMC bus" itself is
part of the so-called "MMC bus".

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-05-22 18:18:16 -07:00
Martin Blumenstingl
10256a4755 ARM: dts: meson8m2: update the offset of the canvas module
With the Meson8m2 SoC the canvas module was moved from offset 0x20
(Meson8) to offset 0x48 (same as on Meson8b). The offsets inside the
canvas module are identical.

Correct the offset so the driver uses the correct registers.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-05-22 18:18:09 -07:00
Martin Blumenstingl
47b5818239 ARM: dts: meson8: add the canvas module
Add the canvas module to Meson8 because it's required for the VPU
(video output) and video decoders.

The canvas module is located inside thie "DMC bus" (where also some of
the memory controller registers are located). The "DMC bus" itself is
part of the so-called "MMC bus".

Amlogic's vendor kernel has an explicit #define for the "DMC" register
range on Meson8m2 while there's no such #define for Meson8. However, the
canvas and memory controller registers on Meson8 are all expressed as
"0x6000 + actual offset", while Meson8m2 uses "DMC + actual offset".
Thus it's safe to assume that the DMC bus exists on both SoCs even
though the registers inside are slightly different.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-05-22 18:17:35 -07:00
Eric Dumazet
3580d04aa6 ipv4/igmp: fix another memory leak in igmpv3_del_delrec()
syzbot reported memory leaks [1] that I have back tracked to
a missing cleanup from igmpv3_del_delrec() when
(im->sfmode != MCAST_INCLUDE)

Add ip_sf_list_clear_all() and kfree_pmc() helpers to explicitely
handle the cleanups before freeing.

[1]

BUG: memory leak
unreferenced object 0xffff888123e32b00 (size 64):
  comm "softirq", pid 0, jiffies 4294942968 (age 8.010s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 e0 00 00 01 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000006105011b>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [<000000006105011b>] slab_post_alloc_hook mm/slab.h:439 [inline]
    [<000000006105011b>] slab_alloc mm/slab.c:3326 [inline]
    [<000000006105011b>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
    [<000000004bba8073>] kmalloc include/linux/slab.h:547 [inline]
    [<000000004bba8073>] kzalloc include/linux/slab.h:742 [inline]
    [<000000004bba8073>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
    [<000000004bba8073>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
    [<00000000a46a65a0>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
    [<000000005956ca89>] do_ip_setsockopt.isra.0+0x1795/0x1930 net/ipv4/ip_sockglue.c:957
    [<00000000848e2d2f>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
    [<00000000b9db185c>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
    [<000000003028e438>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130
    [<0000000015b65589>] __sys_setsockopt+0x98/0x120 net/socket.c:2078
    [<00000000ac198ef0>] __do_sys_setsockopt net/socket.c:2089 [inline]
    [<00000000ac198ef0>] __se_sys_setsockopt net/socket.c:2086 [inline]
    [<00000000ac198ef0>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
    [<000000000a770437>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
    [<00000000d3adb93b>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 9c8bb163ae ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:04:17 -07:00
David S. Miller
db51a73282 Merge branch 'bnxt_en-Bug-fixes'
Michael Chan says:

===================
bnxt_en: Bug fixes.

There are 4 driver fixes in this series:

1. Fix RX buffer leak during OOM condition.
2. Call pci_disable_msix() under correct conditions to prevent hitting BUG.
3. Reduce unneeded mmeory allocation in kdump kernel to prevent OOM.
4. Don't read device serial number on VFs because it is not supported.

Please queue #1, #2, #3 for -stable as well.  Thanks.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:02:14 -07:00
Vasundhara Volam
2e9217d1e8 bnxt_en: Device serial number is supported only for PFs.
Don't read DSN on VFs that do not have the PCI capability.

Fixes: 03213a9965 ("bnxt: move bp->switch_id initialization to PF probe")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:02:14 -07:00
Michael Chan
d629522e1d bnxt_en: Reduce memory usage when running in kdump kernel.
Skip RDMA context memory allocations, reduce to 1 ring, and disable
TPA when running in the kdump kernel.  Without this patch, the driver
fails to initialize with memory allocation errors when running in a
typical kdump kernel.

Fixes: cf6daed098 ("bnxt_en: Increase context memory allocations on 57500 chips for RDMA.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:02:14 -07:00
Michael Chan
1b3f0b75c3 bnxt_en: Fix possible BUG() condition when calling pci_disable_msix().
When making configuration changes, the driver calls bnxt_close_nic()
and then bnxt_open_nic() for the changes to take effect.  A parameter
irq_re_init is passed to the call sequence to indicate if IRQ
should be re-initialized.  This irq_re_init parameter needs to
be included in the bnxt_reserve_rings() call.  bnxt_reserve_rings()
can only call pci_disable_msix() if the irq_re_init parameter is
true, otherwise it may hit BUG() because some IRQs may not have been
freed yet.

Fixes: 41e8d79837 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:02:14 -07:00
Michael Chan
296d5b5416 bnxt_en: Fix aggregation buffer leak under OOM condition.
For every RX packet, the driver replenishes all buffers used for that
packet and puts them back into the RX ring and RX aggregation ring.
In one code path where the RX packet has one RX buffer and one or more
aggregation buffers, we missed recycling the aggregation buffer(s) if
we are unable to allocate a new SKB buffer.  This leads to the
aggregation ring slowly running out of buffers over time.  Fix it
by properly recycling the aggregation buffers.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Rakesh Hemnani <rhemnani@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:02:14 -07:00
Sunil Muthuswamy
14a1eaa882 hv_sock: perf: loop in send() to maximize bandwidth
Currently, the hv_sock send() iterates once over the buffer, puts data into
the VMBUS channel and returns. It doesn't maximize on the case when there
is a simultaneous reader draining data from the channel. In such a case,
the send() can maximize the bandwidth (and consequently minimize the cpu
cycles) by iterating until the channel is found to be full.

Perf data:
Total Data Transfer: 10GB/iteration
Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket
reader
Packet size: 64KB
CPU sys time was captured using the 'time' command for the writer to send
10GB of data.
'Send Buffer Loop' is with the patch applied.
The values below are over 10 iterations.

|--------------------------------------------------------|
|        |        Current        |   Send Buffer Loop    |
|--------------------------------------------------------|
|        | Throughput | CPU sys  | Throughput | CPU sys  |
|        | (MB/s)     | time (s) | (MB/s)     | time (s) |
|--------------------------------------------------------|
| Min    |     407    |   7.048  |    401     |  5.958   |
|--------------------------------------------------------|
| Max    |     455    |   7.563  |    542     |  6.993   |
|--------------------------------------------------------|
| Avg    |     440    |   7.411  |    451     |  6.639   |
|--------------------------------------------------------|
| Median |     446    |   7.417  |    447     |  6.761   |
|--------------------------------------------------------|

Observation:
1. The avg throughput doesn't really change much with this change for this
scenario. This is most probably because the bottleneck on throughput is
somewhere else.
2. The average system (or kernel) cpu time goes down by 10%+ with this
change, for the same amount of data transfer.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:00:36 -07:00
Sunil Muthuswamy
ac383f58f3 hv_sock: perf: Allow the socket buffer size options to influence the actual socket buffers
Currently, the hv_sock buffer size is static and can't scale to the
bandwidth requirements of the application. This change allows the
applications to influence the socket buffer sizes using the SO_SNDBUF and
the SO_RCVBUF socket options.

Few interesting points to note:
1. Since the VMBUS does not allow a resize operation of the ring size, the
socket buffer size option should be set prior to establishing the
connection for it to take effect.
2. Setting the socket option comes with the cost of that much memory being
reserved/allocated by the kernel, for the lifetime of the connection.

Perf data:
Total Data Transfer: 1GB
Single threaded reader/writer
Results below are summarized over 10 iterations.

Linux hvsocket writer + Windows hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size ->   |      128B       |       1KB       |       4KB       |        64KB         |
|---------------------------------------------------------------------------------------------|
|SO_SNDBUF size | |                 Throughput in MB/s (min/max/avg/median):                  |
|               v |                                                                           |
|---------------------------------------------------------------------------------------------|
|      Default    | 109/118/114/116 | 636/774/701/700 | 435/507/480/476 |   410/491/462/470   |
|      16KB       | 110/116/112/111 | 575/705/662/671 | 749/900/854/869 |   592/824/692/676   |
|      32KB       | 108/120/115/115 | 703/823/767/772 | 718/878/850/866 | 1593/2124/2000/2085 |
|      64KB       | 108/119/114/114 | 592/732/683/688 | 805/934/903/911 | 1784/1943/1862/1843 |
|---------------------------------------------------------------------------------------------|

Windows hvsocket writer + Linux hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size ->   |     128B    |      1KB        |          4KB        |        64KB         |
|---------------------------------------------------------------------------------------------|
|SO_RCVBUF size | |               Throughput in MB/s (min/max/avg/median):                    |
|               v |                                                                           |
|---------------------------------------------------------------------------------------------|
|      Default    | 69/82/75/73 | 313/343/333/336 |   418/477/446/445   |   659/701/676/678   |
|      16KB       | 69/83/76/77 | 350/401/375/382 |   506/548/517/516   |   602/624/615/615   |
|      32KB       | 62/83/73/73 | 471/529/496/494 |   830/1046/935/939  | 944/1180/1070/1100  |
|      64KB       | 64/70/68/69 | 467/533/501/497 | 1260/1590/1430/1431 | 1605/1819/1670/1660 |
|---------------------------------------------------------------------------------------------|

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 18:00:14 -07:00
David Ahern
31680ac265 ipv6: Fix redirect with VRF
IPv6 redirect is broken for VRF. __ip6_route_redirect walks the FIB
entries looking for an exact match on ifindex. With VRF the flowi6_oif
is updated by l3mdev_update_flow to the l3mdev index and the
FLOWI_FLAG_SKIP_NH_OIF set in the flags to tell the lookup to skip the
device match. For redirects the device match is requires so use that
flag to know when the oif needs to be reset to the skb device index.

Fixes: ca254490c8 ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:58:21 -07:00
Eric Dumazet
0db355d499 ipv4/igmp: shrink struct ip_sf_list
Removing two 4 bytes holes allows to use kmalloc-32
kmem cache instead of kmalloc-64 on 64bit kernels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:57:37 -07:00
David Ahern
fc651001d2 neighbor: Add tracepoint to __neigh_create
Add tracepoint to __neigh_create to enable debugging of new entries.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:50:24 -07:00
David Ahern
a92a0a7b8e selftests: pmtu: Simplify cleanup and namespace names
The point of the pause-on-fail argument is to leave the setup as is after
a test fails to allow a user to debug why it failed. Move the cleanup
after posting the result to the user to make it so.

Random names for the namespaces are not user friendly when trying to
debug a failure. Make them simpler and more direct for the tests. Run
cleanup at the beginning to ensure they are cleaned up if they already
exist.

Remove cleanup_done. There is no harm in doing cleanup twice; just
ignore any errors related to not existing - which is already done.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:50:24 -07:00
David Ahern
9b7e94e6e8 selftests: fib-onlink: Make quiet by default
Add VERBOSE argument to fib-onlink-tests.sh and make output quiet by
default. Add getopt parsing of inputs and support for -v (verbose) and
-p (pause on fail).

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:50:24 -07:00
David Ahern
75425657fe net: Set strict_start_type for routes and rules
New userspace on an older kernel can send unknown and unsupported
attributes resulting in an incompelete config which is almost
always wrong for routing (few exceptions are passthrough settings
like the protocol that installed the route).

Set strict_start_type in the policies for IPv4 and IPv6 routes and
rules to detect new, unsupported attributes and fail the route add.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:50:24 -07:00
David S. Miller
e38f7cbd36 Merge branch 'net-Export-functions-for-nexthop-code'
David Ahern says:

====================
net: Export functions for nexthop code

This set exports ipv4 and ipv6 fib functions for use by the nexthop
code. It also adds new ones to send route notifications if a nexthop
configuration changes.

v2
- repost of patches dropped at the end of the last dev window
  added patch 8 which exports nh_update_mtu since it is inline with
  the other patches
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:44 -07:00
David Ahern
06c77c3e67 ipv4: Rename and export nh_update_mtu
Rename nh_update_mtu to fib_nhc_update_mtu and export for use by the
nexthop code.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:44 -07:00
David Ahern
c3669486b5 ipv4: export fib_info_update_nh_saddr
Add scope as input argument versus relying on fib_info reference in
fib_nh, and export fib_info_update_nh_saddr.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:44 -07:00
David Ahern
9bd8366792 ipv4: export fib_flush
As nexthops are deleted, fib entries referencing it are marked dead.
Export fib_flush so those entries can be removed in a timely manner.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:44 -07:00
David Ahern
ac1fab2d13 ipv4: export fib_check_nh
Change fib_check_nh to take net, table and scope as input arguments
over struct fib_config and export for use by nexthop code.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:44 -07:00
David Ahern
1bff1a0c9b ipv4: Add function to send route updates
Add fib_info_notify_update to walk the fib and send RTM_NEWROUTE
notifications with NLM_F_REPLACE set for entries linked to a fib_info
that have nh_updated flag set. This helper will be used by the nexthop
code to notify userspace of routes that are impacted when a nexthop
config is updated via replace. The new function and its helper are
similar to how fib_flush and fib_table_flush work for address delete
and link down events.

This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.

In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:44 -07:00
David Ahern
19a3b7eea4 ipv6: export function to send route updates
Add fib6_rt_update to send RTM_NEWROUTE with NLM_F_REPLACE set. This
helper will be used by the nexthop code to notify userspace of routes
that are impacted when a nexthop config is updated via replace.

This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.

In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:43 -07:00
David Ahern
cdaa16a4f7 ipv6: Add hook to bump sernum for a route to stubs
Add hook to ipv6 stub to bump the sernum up to the root node for a
route. This is needed by the nexthop code when a nexthop config changes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:43 -07:00
David Ahern
68a9b13d92 ipv6: Add delete route hook to stubs
Add ip6_del_rt to the IPv6 stub. The hook is needed by the nexthop
code to remove entries linked to a nexthop that is getting deleted.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:48:43 -07:00
David S. Miller
26b1b8d7f2 Merge branch 'net-phy-T1-support'
Andrew Lunn says:

====================
net: phy: T1 support

T1 PHYs make use of a single twisted pair, rather than the traditional
2 pair for 100BaseT or 4 pair for 1000BaseT. This patchset adds link
modes for 100BaseT1 and 1000BaseT1, and them makes use of 100BaseT1 in
the list of PHY features used by current T1 drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:46:28 -07:00
Andrew Lunn
e5fb32c67c net: phy: Make phy_basic_t1_features use base100t1.
Now that there is a link mode for 100BaseT1, use it in
phy_basic_t1_features so T1 PHY drivers will indicate this mode via
the Ethtool API.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22 17:46:28 -07:00