Commit Graph

950968 Commits

Author SHA1 Message Date
Linus Torvalds
7eac66d045 Merge tag 'vfio-v5.9-rc2' of git://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:

 - Fix lockdep issue reported for recursive read-lock (Alex Williamson)

 - Fix missing unwind in type1 replay function (Alex Williamson)

* tag 'vfio-v5.9-rc2' of git://github.com/awilliam/linux-vfio:
  vfio/type1: Add proper error unwind for vfio_iommu_replay()
  vfio-pci: Avoid recursive read-lock usage
2020-08-19 21:04:35 -07:00
Jiansong Chen
da2446b66b Revert "drm/amdgpu: disable gfxoff for navy_flounder"
This reverts commit 9c9b17a7d1.
Newly released sdma fw (51.52) provides a fix for the issue.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-19 23:55:04 -04:00
Frederic Barrat
e17a7c0e0a powerpc/powernv/pci: Fix possible crash when releasing DMA resources
Fix a typo introduced during recent code cleanup, which could lead to
silently not freeing resources or an oops message (on PCI hotplug or
CAPI reset).

Only impacts ioda2, the code path for ioda1 is correct.

Fixes: 01e12629af ("powerpc/powernv/pci: Add explicit tracking of the DMA setup state")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200819130741.16769-1-fbarrat@linux.ibm.com
2020-08-20 12:30:40 +10:00
Wang Hai
cf96d97738 net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe()
Replace alloc_etherdev_mq with devm_alloc_etherdev_mqs. In this way,
when probe fails, netdev can be freed automatically.

Fixes: 4d5ae32f5e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:37:18 -07:00
Sebastian Andrzej Siewior
9553b62c1d net: atlantic: Use readx_poll_timeout() for large timeout
Commit
   8dcf2ad39f ("net: atlantic: add hwmon getter for MAC temperature")

implemented a read callback with an udelay(10000U). This fails to
compile on ARM because the delay is >1ms. I doubt that it is needed to
spin for 10ms even if possible on x86.

>From looking at the code, the context appears to be preemptible so using
usleep() should work and avoid busy spinning.

Use readx_poll_timeout() in the poll loop.

Fixes: 8dcf2ad39f ("net: atlantic: add hwmon getter for MAC temperature")
Cc: Mark Starovoytov <mstarovoitov@marvell.com>
Cc: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:25:29 -07:00
Min Li
957ff4278e ptp: ptp_clockmatrix: use i2c_master_send for i2c write
The old code for i2c write would break on some controllers, which fails
at handling Repeated Start Condition. So we will just use i2c_master_send
to handle write in one transanction.

Changes since v1:
- Remove indentation change

Signed-off-by: Min Li <min.li.xe@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:23:22 -07:00
David S. Miller
e5b15f5af2 Merge branch 'ptp-Add-generic-helper-functions'
Kurt Kanzenbach says:

====================
ptp: Add generic helper functions

in order to reduce code duplication (and cut'n'paste errors) in the ptp code of
DSA, Ethernet and Phy drivers, create helper functions and move them to
ptp_classify. This way all drivers can share the same implementation.

This is version four and contains bugfixes. Implemented as discussed [1] [2]
[3] [4].

Previous versions can be found here:

 * https://lkml.kernel.org/netdev/20200723074946.14253-1-kurt@linutronix.de/
 * https://lkml.kernel.org/netdev/20200727090601.6500-1-kurt@linutronix.de/
 * https://lkml.kernel.org/netdev/20200730080048.32553-1-kurt@linutronix.de/

Thanks,
Kurt

Changes sinve v3:

 * Coding style issues (Richard Cochran, Petr Machata)
 * Add better documentation (Grygorii Strashko)
 * Fix cpts code (Grygorii Strashko)
 * Use ntohs() for TI code (Grygorii Strashko)
 * Add tags

Changes since v2:

 * Make ptp_parse_header() work in all scenarios (Russell King)
 * Fix msgtype offset for ptp v1 packets

Changes since v1:

 * Fix Kconfig (Richard Cochran)
 * Include more drivers (Richard Cochran)

[1] - https://lkml.kernel.org/netdev/20200713140112.GB27934@hoboy/
[2] - https://lkml.kernel.org/netdev/20200720142146.GB16001@hoboy/
[3] - https://lkml.kernel.org/netdev/20200723074946.14253-1-kurt@linutronix.de/
[4] - https://lkml.kernel.org/netdev/20200729100257.GX1551@shell.armlinux.org.uk/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:09:33 -07:00
Kurt Kanzenbach
17060fb506 ptp: Remove unused macro
The offset for the control field is not needed anymore. Remove it.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:09:19 -07:00
Kurt Kanzenbach
9087da5dcb ptp: ptp_ines: Use generic helper function
In order to reduce code duplication between ptp drivers, generic helper
functions were introduced. Use them.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:09:19 -07:00
Kurt Kanzenbach
38fa7d039f net: phy: dp83640: Use generic helper function
In order to reduce code duplication between ptp drivers, generic helper
functions were introduced. Use them.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:09:19 -07:00
Kurt Kanzenbach
17de44c2c7 ethernet: ti: cpts: Use generic helper function
In order to reduce code duplication between ptp drivers, generic helper
functions were introduced. Use them.

Suggested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:09:19 -07:00
Kurt Kanzenbach
4bccb5d043 ethernet: ti: am65-cpts: Use generic helper function
In order to reduce code duplication between ptp drivers, generic helper
functions were introduced. Use them.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:07:49 -07:00
Kurt Kanzenbach
7b2b28c678 mlxsw: spectrum_ptp: Use generic helper function
In order to reduce code duplication between ptp drivers, generic helper
functions were introduced. Use them.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-and-tested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:07:49 -07:00
Kurt Kanzenbach
28fba67ff9 net: dsa: mv88e6xxx: Use generic helper function
In order to reduce code duplication between ptp drivers, generic helper
functions were introduced. Use them.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:07:49 -07:00
Kurt Kanzenbach
036c508ba9 ptp: Add generic ptp message type function
The message type is located at different offsets within the ptp header depending
on the ptp version (v1 or v2). Therefore, drivers which also deal with ptp v1
have some code for it.

Extract this into a helper function for drivers to be used.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:07:49 -07:00
Kurt Kanzenbach
bdfbb63c31 ptp: Add generic ptp v2 header parsing function
Reason: A lot of the ptp drivers - which implement hardware time stamping - need
specific fields such as the sequence id from the ptp v2 header. Currently all
drivers implement that themselves.

Introduce a generic function to retrieve a pointer to the start of the ptp v2
header.

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:07:49 -07:00
Johannes Berg
d1fb555929 netlink: fix state reallocation in policy export
Evidently, when I did this previously, we didn't have more than
10 policies and didn't run into the reallocation path, because
it's missing a memset() for the unused policies. Fix that.

Fixes: d07dcf9aad ("netlink: add infrastructure to expose policies to userspace")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 15:39:36 -07:00
Cristobal Forno
f3ae59c0c0 ibmvnic: store RX and TX subCRQ handle array in ibmvnic_adapter struct
Currently the driver reads RX and TX subCRQ handle array directly from
a DMA-mapped buffer address when it needs to make a H_SEND_SUBCRQ
hcall. This patch stores that information in the ibmvnic_sub_crq_queue
structure instead of reading from the buffer received at login. The
overall goal of this patch is to parse relevant information from the
login response buffer and store it in the driver's private data
structures so that we don't need to read directly from the buffer and
can then free up that memory.

Signed-off-by: Cristobal Forno <cforno12@linux.ibm.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 15:38:16 -07:00
David S. Miller
b4c8998be2 Merge branch 'Bug-fixes-for-ENA-ethernet-driver'
Shay Agroskin says:

====================
Bug fixes for ENA ethernet driver

This series adds the following:
- Fix undesired call to ena_restore after returning from suspend
- Fix condition inside a WARN_ON
- Fix overriding previous value when updating missed_tx statistic

v1->v2:
- fix bug when calling reset routine after device resources are freed (Jakub)

v2->v3:
- fix wrong hash in 'Fixes' tag
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 15:32:58 -07:00
Shay Agroskin
ccd143e515 net: ena: Make missed_tx stat incremental
Most statistics in ena driver are incremented, meaning that a stat's
value is a sum of all increases done to it since driver/queue
initialization.

This patch makes all statistics this way, effectively making missed_tx
statistic incremental.
Also added a comment regarding rx_drops and tx_drops to make it
clearer how these counters are calculated.

Fixes: 11095fdb71 ("net: ena: add statistics for missed tx packets")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 15:32:58 -07:00
Shay Agroskin
8b147f6f3e net: ena: Change WARN_ON expression in ena_del_napi_in_range()
The ena_del_napi_in_range() function unregisters the napi handler for
rings in a given range.
This function had the following WARN_ON macro:

    WARN_ON(ENA_IS_XDP_INDEX(adapter, i) &&
	    adapter->ena_napi[i].xdp_ring);

This macro prints the call stack if the expression inside of it is
true [1], but the expression inside of it is the wanted situation.
The expression checks whether the ring has an XDP queue and its index
corresponds to a XDP one.

This patch changes the expression to
    !ENA_IS_XDP_INDEX(adapter, i) && adapter->ena_napi[i].xdp_ring
which indicates an unwanted situation.

Also, change the structure of the function. The napi handler is
unregistered for all rings, and so there's no need to check whether the
index is an XDP index or not. By removing this check the code becomes
much more readable.

Fixes: 548c4940b9 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 15:32:58 -07:00
Shay Agroskin
63d4a4c145 net: ena: Prevent reset after device destruction
The reset work is scheduled by the timer routine whenever it
detects that a device reset is required (e.g. when a keep_alive signal
is missing).
When releasing device resources in ena_destroy_device() the driver
cancels the scheduling of the timer routine without destroying the reset
work explicitly.

This creates the following bug:
    The driver is suspended and the ena_suspend() function is called
	-> This function calls ena_destroy_device() to free the net device
	   resources
	    -> The driver waits for the timer routine to finish
	    its execution and then cancels it, thus preventing from it
	    to be called again.

    If, in its final execution, the timer routine schedules a reset,
    the reset routine might be called afterwards,and a redundant call to
    ena_restore_device() would be made.

By changing the reset routine we allow it to read the device's state
accurately.
This is achieved by checking whether ENA_FLAG_TRIGGER_RESET flag is set
before resetting the device and making both the destruction function and
the flag check are under rtnl lock.
The ENA_FLAG_TRIGGER_RESET is cleared at the end of the destruction
routine. Also surround the flag check with 'likely' because
we expect that the reset routine would be called only when
ENA_FLAG_TRIGGER_RESET flag is set.

The destruction of the timer and reset services in __ena_shutoff() have to
stay, even though the timer routine is destroyed in ena_destroy_device().
This is to avoid a case in which the reset routine is scheduled after
free_netdev() in __ena_shutoff(), which would create an access to freed
memory in adapter->flags.

Fixes: 8c5c7abdeb ("net: ena: add power management ops to the ENA driver")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 15:32:57 -07:00
Marc Zyngier
d1ac0002dd of: address: Work around missing device_type property in pcie nodes
Recent changes to the DT PCI bus parsing made it mandatory for
device tree nodes describing a PCI controller to have the
'device_type = "pci"' property for the node to be matched.

Although this follows the letter of the specification, it
breaks existing device-trees that have been working fine
for years.  Rockchip rk3399-based systems are a prime example
of such collateral damage, and have stopped discovering their
PCI bus.

In order to paper over it, let's add a workaround to the code
matching the device type, and accept as PCI any node that is
named "pcie",

A warning will hopefully nudge the user into updating their
DT to a fixed version if they can, but the incentive is
obviously pretty small.

Fixes: 2f96593ecc ("of_address: Add bus type match for pci ranges parser")
Suggested-by: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200819094255.474565-1-maz@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-19 16:30:57 -06:00
Alexei Starovoitov
c1447efdaf Merge branch 'type-and-enum-value-relos'
Andrii Nakryiko says:

====================
This patch set adds libbpf support for two new classes of CO-RE relocations:
type-based (TYPE_EXISTS/TYPE_SIZE/TYPE_ID_LOCAL/TYPE_ID_TARGET) and enum
value-vased (ENUMVAL_EXISTS/ENUMVAL_VALUE):
  - TYPE_EXISTS allows to detect presence in kernel BTF of a locally-recorded
    BTF type. Useful for feature detection (new functionality often comes with
    new internal kernel types), as well as handling type renames and bigger
    refactorings.
  - TYPE_SIZE allows to get the real size (in bytes) of a specified kernel
    type. Useful for dumping internal structure as-is through perfbuf or
    ringbuf.
  - TYPE_ID_LOCAL/TYPE_ID_TARGET allow to capture BTF type ID of a BTF type in
    program's BTF or kernel BTF, respectively. These could be used for
    high-performance and space-efficient generic data dumping/logging by
    relying on small and cheap BTF type ID as a data layout descriptor, for
    post-processing on user-space side.
  - ENUMVAL_EXISTS can be used for detecting the presence of enumerator value
    in kernel's enum type. Most direct application is to detect BPF helper
    support in kernel.
  - ENUMVAL_VALUE allows to relocate real integer value of kernel enumerator
    value, which is subject to change (e.g., always a potential issue for
    internal, non-UAPI, kernel enums).

I've indicated potential applications for these relocations, but relocations
themselves are generic and unassuming and are designed to work correctly even
in unintended applications. Furthermore, relocated values become constants,
known to the verifier and could and would be used for dead branch code
detection and elimination. This makes them ideal to do all sorts of feature
detection and guarding functionality that's not available on some older (but
still supported by BPF program) kernels, while having to compile and maintain
one unified source code.

Selftests are added for all the new features. Selftests utilizing new Clang
built-ins are designed such that they will compile with older Clangs and will
be skipped during test runs. So this shouldn't cause any build and test
failures on systems with slightly outdated Clang compiler.

LLVM patches adding these relocation in Clang:
  - __builtin_btf_type_id() ([0], [1], [2]);
  - __builtin_preserve_type_info(), __builtin_preserve_enum_value() ([3], [4]).

  [0] https://reviews.llvm.org/D74572
  [1] https://reviews.llvm.org/D74668
  [2] https://reviews.llvm.org/D85174
  [3] https://reviews.llvm.org/D83878
  [4] https://reviews.llvm.org/D83242

v2->v3:
  - fix feature detection for __builtin_btf_type_id() test (Yonghong);
  - fix extra empty lines at the end of files (Yonghong);

v1->v2:
  - selftests detect built-in support and are skipped if not found (Alexei).
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-08-19 14:34:34 -07:00
Andrii Nakryiko
3357490555 selftests/bpf: Add tests for ENUMVAL_EXISTS/ENUMVAL_VALUE relocations
Add tests validating existence and value relocations for enum value-based
relocations. If __builtin_preserve_enum_value() built-in is not supported,
skip tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819194519.3375898-6-andriin@fb.com
2020-08-19 14:19:39 -07:00
Andrii Nakryiko
eacaaed784 libbpf: Implement enum value-based CO-RE relocations
Implement two relocations of a new enumerator value-based CO-RE relocation
kind: ENUMVAL_EXISTS and ENUMVAL_VALUE.

First, ENUMVAL_EXISTS, allows to detect the presence of a named enumerator
value in the target (kernel) BTF. This is useful to do BPF helper/map/program
type support detection from BPF program side. bpf_core_enum_value_exists()
macro helper is provided to simplify built-in usage.

Second, ENUMVAL_VALUE, allows to capture enumerator integer value and relocate
it according to the target BTF, if it changes. This is useful to have
a guarantee against intentional or accidental re-ordering/re-numbering of some
of the internal (non-UAPI) enumerations, where kernel developers don't care
about UAPI backwards compatiblity concerns. bpf_core_enum_value() allows to
capture this succinctly and use correct enum values in code.

LLVM uses ldimm64 instruction to capture enumerator value-based relocations,
so add support for ldimm64 instruction patching as well.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819194519.3375898-5-andriin@fb.com
2020-08-19 14:19:39 -07:00
Andrii Nakryiko
4836bf5e2e selftests/bpf: Add CO-RE relo test for TYPE_ID_LOCAL/TYPE_ID_TARGET
Add tests for BTF type ID relocations. To allow testing this, enhance
core_relo.c test runner to allow dynamic initialization of test inputs.
If Clang doesn't have necessary support for new functionality, test is
skipped.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819194519.3375898-4-andriin@fb.com
2020-08-19 14:19:39 -07:00
Andrii Nakryiko
124a892d1c selftests/bpf: Test TYPE_EXISTS and TYPE_SIZE CO-RE relocations
Add selftests for TYPE_EXISTS and TYPE_SIZE relocations, testing correctness
of relocations and handling of type compatiblity/incompatibility.

If __builtin_preserve_type_info() is not supported by compiler, skip tests.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819194519.3375898-3-andriin@fb.com
2020-08-19 14:19:39 -07:00
Andrii Nakryiko
3fc32f40c4 libbpf: Implement type-based CO-RE relocations support
Implement support for TYPE_EXISTS/TYPE_SIZE/TYPE_ID_LOCAL/TYPE_ID_REMOTE
relocations. These are examples of type-based relocations, as opposed to
field-based relocations supported already. The difference is that they are
calculating relocation values based on the type itself, not a field within
a struct/union.

Type-based relos have slightly different semantics when matching local types
to kernel target types, see comments in bpf_core_types_are_compat() for
details. Their behavior on failure to find target type in kernel BTF also
differs. Instead of "poisoning" relocatable instruction and failing load
subsequently in kernel, they return 0 (which is rarely a valid return result,
so user BPF code can use that to detect success/failure of the relocation and
deal with it without extra "guarding" relocations). Also, it's always possible
to check existence of the type in target kernel with TYPE_EXISTS relocation,
similarly to a field-based FIELD_EXISTS.

TYPE_ID_LOCAL relocation is a bit special in that it always succeeds (barring
any libbpf/Clang bugs) and resolved to BTF ID using **local** BTF info of BPF
program itself. Tests in subsequent patches demonstrate the usage and
semantics of new relocations.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200819194519.3375898-2-andriin@fb.com
2020-08-19 14:19:39 -07:00
Maciej Żenczykowski
defcffeb51 net-veth: Add type safety to veth_xdp_to_ptr() and veth_ptr_to_xdp()
This reduces likelihood of incorrect use.

Test: builds

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819020027.4072288-1-zenczykowski@gmail.com
2020-08-19 14:10:38 -07:00
Maciej Żenczykowski
596b5ef458 net-tun: Eliminate two tun/xdp related function calls from vhost-net
This provides a minor performance boost by virtue of inlining
instead of cross module function calls.

Test: builds

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819010710.3959310-2-zenczykowski@gmail.com
2020-08-19 14:02:49 -07:00
Maciej Żenczykowski
b558b6c240 net-tun: Add type safety to tun_xdp_to_ptr() and tun_ptr_to_xdp()
This reduces likelihood of incorrect use.

Test: builds

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819010710.3959310-1-zenczykowski@gmail.com
2020-08-19 14:02:49 -07:00
Geert Uytterhoeven
4364792917 dt: writing-schema: Miscellaneous grammar fixes
- Add missing verb,
  - Fix accidental plural.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20200819124850.20543-1-geert+renesas@glider.be
2020-08-19 14:21:04 -06:00
David S. Miller
0b3fc8b2e3 Merge branch 'r8169-use-napi_complete_done-return-value'
Heiner Kallweit says:

====================
r8169: use napi_complete_done return value

Consider the return value of napi_complete_done(), this allows users to
use the gro_flush_timeout sysfs attribute as an alternative to classic
interrupt coalescing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 13:03:04 -07:00
Heiner Kallweit
9e89d71911 r8169: remove member irq_enabled from struct rtl8169_private
After making use of the gro_flush_timeout attribute I once got a
tx timeout due to an interrupt that wasn't handled. Seems using
irq_enabled can be racy, and it's not needed any longer anyway,
so remove it. I've never seen a report about such a race before,
therefore treat the change as an improvement.

There's just one small drawback: If a legacy PCI interrupt is used,
and if this interrupt is shared with a device with high interrupt
rate, then we may handle interrupts even if NAPI disabled them,
and we may see a certain performance decrease under high network load.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 13:03:04 -07:00
Heiner Kallweit
52dbe8465e r8169: use napi_complete_done return value
Consider the return value of napi_complete_done(), this allows users to
use the gro_flush_timeout sysfs attribute as an alternative to classic
interrupt coalescing.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 13:03:03 -07:00
James Chapman
de993be020 Documentation/networking: update l2tp docs
Kernel documentation of L2TP has not been kept up to date and lacks
coverage of some L2TP APIs. While addressing this, refactor to improve
readability, separating the parts which focus on user APIs and
internal implementation into sections.

Changes in v2:

 - fix checkpatch warnings about trailing whitespace and long lines

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 12:56:27 -07:00
Miaohe Lin
f4ecc74853 net: Stop warning about SO_BSDCOMPAT usage
We've been warning about SO_BSDCOMPAT usage for many years. We may remove
this code completely now.

Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 12:53:48 -07:00
David S. Miller
487eb2b908 Merge branch 'net-dsa-loop-Expose-VLAN-table-through-devlink'
Florian Fainelli says:

====================
net: dsa: loop: Expose VLAN table through devlink

Changes in v2:

- set the DSA configure_vlan_while_not_filtering boolean
- return the actual occupancy
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 12:01:42 -07:00
Florian Fainelli
142061eba3 net: dsa: loop: Return VLAN table size through devlink
We return the VLAN table size through devlink as a simple parameter, we
do not support altering it at runtime:

devlink resource show mdio_bus/fixed-0:1f
mdio_bus/fixed-0:1f:
  name VTU size 4096 occ 0 unit entry dpipe_tables none

and after configure a bridge with VLAN filtering:

devlink resource show mdio_bus/fixed-0:1f
mdio_bus/fixed-0:1f:
  name VTU size 4096 occ 1 unit entry dpipe_tables none

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 12:01:42 -07:00
Florian Fainelli
f0408ca45a net: dsa: loop: Configure VLANs while not filtering
Since this is a mock-up driver with no real data path for now, but we
will have one at some point, enable VLANs while not filtering.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 12:01:42 -07:00
Arvind Sankar
33d0f96ffd lib/string.c: Use freestanding environment
gcc can transform the loop in a naive implementation of memset/memcpy
etc into a call to the function itself.  This optimization is enabled by
-ftree-loop-distribute-patterns.

This has been the case for a while, but gcc-10.x enables this option at
-O2 rather than -O3 as in previous versions.

Add -ffreestanding, which implicitly disables this optimization with
gcc.  It is unclear whether clang performs such optimizations, but
hopefully it will also not do so in a freestanding environment.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-19 11:23:45 -07:00
Arvind Sankar
394b19d6cb x86/boot/compressed: Use builtin mem functions for decompressor
Since commits

  c041b5ad86 ("x86, boot: Create a separate string.h file to provide standard string functions")
  fb4cac573e ("x86, boot: Move memcmp() into string.h and string.c")

the decompressor stub has been using the compiler's builtin memcpy,
memset and memcmp functions, _except_ where it would likely have the
largest impact, in the decompression code itself.

Remove the #undef's of memcpy and memset in misc.c so that the
decompressor code also uses the compiler builtins.

The rationale given in the comment doesn't really apply: just because
some functions use the out-of-line version is no reason to not use the
builtin version in the rest.

Replace the comment with an explanation of why memzero and memmove are
being #define'd.

Drop the suggestion to #undef in boot/string.h as well: the out-of-line
versions are not really optimized versions, they're generic code that's
good enough for the preboot environment. The compiler will likely
generate better code for constant-size memcpy/memset/memcmp if it is
allowed to.

Most decompressors' performance is unchanged, with the exception of LZ4
and 64-bit ZSTD.

	Before	After ARCH
LZ4	  73ms	 10ms   32
LZ4	 120ms	 10ms	64
ZSTD	  90ms	 74ms	64

Measurements on QEMU on 2.2GHz Broadwell Xeon, using defconfig kernels.

Decompressor code size has small differences, with the largest being
that 64-bit ZSTD decreases just over 2k. The largest code size increase
was on 64-bit XZ, of about 400 bytes.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Nick Terrell <nickrterrell@gmail.com>
Tested-by: Nick Terrell <nickrterrell@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-19 11:23:45 -07:00
Wen Gong
3c45f21af8 ath10k: sdio: add firmware coredump support
When firmware crashes it's possible to create a coredump for later analysis,
add support to collect the register and memory info from SDIO devices.

The coredump configuration is different between QCA6174 PCI and QCA6174 SDIO,
so add specific registers and memory regions for the latter.

QCA6174 SDIO has two methods to dump the firmware: fastdump and slowdump.
Fastdump is not supported in olded versions of firmware, and for these ath10k
will automatically select slowdump. If firmware supports fastdump, ath10k will
automatically select it. QCA6174 SDIO firmware version
WLAN.RMH.4.4.1-00017-QCARMSWPZ-2 is the first version supporting fastdump.

For slowdump, ath10k_sdio_hif_diag_read() can not be used as the diag
window has a limit value, it is 4 bytes and the dump's buffer length is larger
than it, it will trigger error. So this patch adds ath10k_sdio_read_mem() to
read 4 bytes for each time.

Example output of a firmware crash:

ath10k_sdio mmc1:0001:1: simulating soft firmware crash
ath10k_sdio mmc1:0001:1: firmware crashed! (guid 413d98b1-84c0-4298-b605-2b10ec0c54a5)
ath10k_sdio mmc1:0001:1: qca6174 hw3.2 sdio target 0x05030000 chip_id 0x00000000 sub 0000:0000
ath10k_sdio mmc1:0001:1: kconfig debug 1 debugfs 1 tracing 1 dfs 0 testmode 1
ath10k_sdio mmc1:0001:1: firmware ver WLAN.RMH4.4.1-00126-QCARMSWP-1 api 6 features wowlan,ignore-otp,raw-mode crc32 b84317cf
ath10k_sdio mmc1:0001:1: board_file api 2 bmi_id 0:4 crc32 6364cfcc
ath10k_sdio mmc1:0001:1: htt-ver 3.69 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
ath10k_sdio mmc1:0001:1: firmware register dump:
ath10k_sdio mmc1:0001:1: [00]: 0x05030000 0x000015B3 0x0099908D 0x00955B31
ath10k_sdio mmc1:0001:1: [04]: 0x0099908D 0x00060730 0x00000018 0x004641A0
ath10k_sdio mmc1:0001:1: [08]: 0x0041FAA4 0x0041FA9C 0x00999070 0x00404490
ath10k_sdio mmc1:0001:1: [12]: 0x00000009 0xFFFFFFFF 0x00952CD0 0x00952CE6
ath10k_sdio mmc1:0001:1: [16]: 0x00952CC4 0x00910712 0x00000000 0x00000000
ath10k_sdio mmc1:0001:1: [20]: 0x4099908D 0x0040E9E8 0x00000001 0x00423AC0
ath10k_sdio mmc1:0001:1: [24]: 0x809F3189 0x0040EA48 0x00426240 0xC099908D
ath10k_sdio mmc1:0001:1: [28]: 0x809143A7 0x0040EA68 0x0041FAA4 0x00423A80
ath10k_sdio mmc1:0001:1: [32]: 0x809F1193 0x0040EA88 0x00411770 0x004117E0
ath10k_sdio mmc1:0001:1: [36]: 0x809F0EEE 0x0040EAA8 0x00000000 0x00000000
ath10k_sdio mmc1:0001:1: [40]: 0x80911210 0x0040EAC8 0x00000008 0x00404130
ath10k_sdio mmc1:0001:1: [44]: 0x80911154 0x0040EB28 0x00400000 0x00000000
ath10k_sdio mmc1:0001:1: [48]: 0x8091122D 0x0040EB48 0x00000000 0x00400600
ath10k_sdio mmc1:0001:1: [52]: 0x40910024 0x0040EB78 0x0040AB98 0x0040AB98
ath10k_sdio mmc1:0001:1: [56]: 0x00000000 0x0040EB98 0x009BB001 0x00040020

Tested-on: QCA6174 SDIO WLAN.RMH.4.4.1-00018-QCARMSWP-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1569310030-834-3-git-send-email-wgong@codeaurora.org
2020-08-19 20:36:19 +03:00
Wen Gong
c796d513c6 ath10k: add bus type for each layout of coredump
For some hw version, it has more than one bus type, it need to add bus
type to distinguish different chip.

Tested-on: QCA6174 SDIO WLAN.RMH.4.4.1-00018-QCARMSWP-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1569310030-834-2-git-send-email-wgong@codeaurora.org
2020-08-19 20:36:15 +03:00
Wang Hai
ad112aa8b1 SUNRPC: remove duplicate include
Remove linux/sunrpc/auth_gss.h which is included more than once

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-08-19 13:19:42 -04:00
Jens Axboe
fc666777da io_uring: use system_unbound_wq for ring exit work
We currently use system_wq, which is unbounded in terms of number of
workers. This means that if we're exiting tons of rings at the same
time, then we'll briefly spawn tons of event kworkers just for a very
short blocking time as the rings exit.

Use system_unbound_wq instead, which has a sane cap on the concurrency
level.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-19 11:10:51 -06:00
Filipe Manana
bbc37d6e47 btrfs: fix space cache memory leak after transaction abort
If a transaction aborts it can cause a memory leak of the pages array of
a block group's io_ctl structure. The following steps explain how that can
happen:

1) Transaction N is committing, currently in state TRANS_STATE_UNBLOCKED
   and it's about to start writing out dirty extent buffers;

2) Transaction N + 1 already started and another task, task A, just called
   btrfs_commit_transaction() on it;

3) Block group B was dirtied (extents allocated from it) by transaction
   N + 1, so when task A calls btrfs_start_dirty_block_groups(), at the
   very beginning of the transaction commit, it starts writeback for the
   block group's space cache by calling btrfs_write_out_cache(), which
   allocates the pages array for the block group's io_ctl with a call to
   io_ctl_init(). Block group A is added to the io_list of transaction
   N + 1 by btrfs_start_dirty_block_groups();

4) While transaction N's commit is writing out the extent buffers, it gets
   an IO error and aborts transaction N, also setting the file system to
   RO mode;

5) Task A has already returned from btrfs_start_dirty_block_groups(), is at
   btrfs_commit_transaction() and has set transaction N + 1 state to
   TRANS_STATE_COMMIT_START. Immediately after that it checks that the
   filesystem was turned to RO mode, due to transaction N's abort, and
   jumps to the "cleanup_transaction" label. After that we end up at
   btrfs_cleanup_one_transaction() which calls btrfs_cleanup_dirty_bgs().
   That helper finds block group B in the transaction's io_list but it
   never releases the pages array of the block group's io_ctl, resulting in
   a memory leak.

In fact at the point when we are at btrfs_cleanup_dirty_bgs(), the pages
array points to pages that were already released by us at
__btrfs_write_out_cache() through the call to io_ctl_drop_pages(). We end
up freeing the pages array only after waiting for the ordered extent to
complete through btrfs_wait_cache_io(), which calls io_ctl_free() to do
that. But in the transaction abort case we don't wait for the space cache's
ordered extent to complete through a call to btrfs_wait_cache_io(), so
that's why we end up with a memory leak - we wait for the ordered extent
to complete indirectly by shutting down the work queues and waiting for
any jobs in them to complete before returning from close_ctree().

We can solve the leak simply by freeing the pages array right after
releasing the pages (with the call to io_ctl_drop_pages()) at
__btrfs_write_out_cache(), since we will never use it anymore after that
and the pages array points to already released pages at that point, which
is currently not a problem since no one will use it after that, but not a
good practice anyway since it can easily lead to use-after-free issues.

So fix this by freeing the pages array right after releasing the pages at
__btrfs_write_out_cache().

This issue can often be reproduced with test case generic/475 from fstests
and kmemleak can detect it and reports it with the following trace:

unreferenced object 0xffff9bbf009fa600 (size 512):
  comm "fsstress", pid 38807, jiffies 4298504428 (age 22.028s)
  hex dump (first 32 bytes):
    00 a0 7c 4d 3d ed ff ff 40 a0 7c 4d 3d ed ff ff  ..|M=...@.|M=...
    80 a0 7c 4d 3d ed ff ff c0 a0 7c 4d 3d ed ff ff  ..|M=.....|M=...
  backtrace:
    [<00000000f4b5cfe2>] __kmalloc+0x1a8/0x3e0
    [<0000000028665e7f>] io_ctl_init+0xa7/0x120 [btrfs]
    [<00000000a1f95b2d>] __btrfs_write_out_cache+0x86/0x4a0 [btrfs]
    [<00000000207ea1b0>] btrfs_write_out_cache+0x7f/0xf0 [btrfs]
    [<00000000af21f534>] btrfs_start_dirty_block_groups+0x27b/0x580 [btrfs]
    [<00000000c3c23d44>] btrfs_commit_transaction+0xa6f/0xe70 [btrfs]
    [<000000009588930c>] create_subvol+0x581/0x9a0 [btrfs]
    [<000000009ef2fd7f>] btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
    [<00000000474e5187>] __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
    [<00000000708ee349>] btrfs_ioctl_snap_create_v2+0xb0/0xf0 [btrfs]
    [<00000000ea60106f>] btrfs_ioctl+0x12c/0x3130 [btrfs]
    [<000000005c923d6d>] __x64_sys_ioctl+0x83/0xb0
    [<0000000043ace2c9>] do_syscall_64+0x33/0x80
    [<00000000904efbce>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-19 18:39:46 +02:00
David Sterba
604997b4a3 btrfs: use the correct const function attribute for btrfs_get_num_csums
The build robot reports

compiler: h8300-linux-gcc (GCC) 9.3.0
   In file included from fs/btrfs/tests/extent-map-tests.c:8:
>> fs/btrfs/tests/../ctree.h:2166:8: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
    2166 | size_t __const btrfs_get_num_csums(void);
         |        ^~~~~~~

The function attribute for const does not follow the expected scheme and
in this case is confused with a const type qualifier.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-19 18:39:21 +02:00
Marcos Paulo de Souza
282dd7d771 btrfs: reset compression level for lzo on remount
Currently a user can set mount "-o compress" which will set the
compression algorithm to zlib, and use the default compress level for
zlib (3):

  relatime,compress=zlib:3,space_cache

If the user remounts the fs using "-o compress=lzo", then the old
compress_level is used:

  relatime,compress=lzo:3,space_cache

But lzo does not expose any tunable compression level. The same happens
if we set any compress argument with different level, also with zstd.

Fix this by resetting the compress_level when compress=lzo is
specified.  With the fix applied, lzo is shown without compress level:

  relatime,compress=lzo,space_cache

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-19 18:39:12 +02:00