Commit Graph

122951 Commits

Author SHA1 Message Date
Pavel Begunkov
f2a8d5c7a2 io_uring: add tee(2) support
Add IORING_OP_TEE implementing tee(2) support. Almost identical to
splice bits, but without offsets.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-17 14:10:07 -06:00
Pavel Begunkov
9dafdfc2f0 splice: export do_tee()
export do_tee() for use in io_uring

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-17 14:10:07 -06:00
Florian Westphal
4930f4831b net: allow __skb_ext_alloc to sleep
mptcp calls this from the transmit side, from process context.
Allow a sleeping allocation instead of unconditional GFP_ATOMIC.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-17 12:35:34 -07:00
Linus Torvalds
fb27bc034d Merge tag 'usb-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are a number of USB fixes for 5.7-rc6

  The "largest" in here is a bunch of raw-gadget fixes and api changes
  as the driver just showed up in -rc1 and work has been done to fix up
  some uapi issues found with the original submission, before it shows
  up in a -final release.

  Other than that, a bunch of other small USB gadget fixes, xhci fixes,
  some quirks, andother tiny fixes for reported issues.

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits)
  USB: gadget: fix illegal array access in binding with UDC
  usb: core: hub: limit HUB_QUIRK_DISABLE_AUTOSUSPEND to USB5534B
  USB: usbfs: fix mmap dma mismatch
  usb: host: xhci-plat: keep runtime active when removing host
  usb: xhci: Fix NULL pointer dereference when enqueuing trbs from urb sg list
  usb: cdns3: gadget: make a bunch of functions static
  usb: mtu3: constify struct debugfs_reg32
  usb: gadget: udc: atmel: Make some symbols static
  usb: raw-gadget: fix null-ptr-deref when reenabling endpoints
  usb: raw-gadget: documentation updates
  usb: raw-gadget: support stalling/halting/wedging endpoints
  usb: raw-gadget: fix gadget endpoint selection
  usb: raw-gadget: improve uapi headers comments
  usb: typec: mux: intel: Fix DP_HPD_LVL bit field
  usb: raw-gadget: fix return value of ep read ioctls
  usb: dwc3: select USB_ROLE_SWITCH
  usb: gadget: legacy: fix error return code in gncm_bind()
  usb: gadget: legacy: fix error return code in cdc_bind()
  usb: gadget: legacy: fix redundant initialization warnings
  usb: gadget: tegra-xudc: Fix idle suspend/resume
  ...
2020-05-17 12:31:22 -07:00
Linus Torvalds
43567139f5 Merge tag 'x86_urgent_for_v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
 "A single fix for early boot crashes of kernels built with gcc10 and
  stack protector enabled"

* tag 'x86_urgent_for_v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix early boot crash on gcc-10, third try
2020-05-17 11:08:29 -07:00
Benjamin Thiel
e8da08a088 efi: Pull up arch-specific prototype efi_systab_show_arch()
Pull up arch-specific prototype efi_systab_show_arch() in order to
fix a -Wmissing-prototypes warning:

arch/x86/platform/efi/efi.c:957:7: warning: no previous prototype for
‘efi_systab_show_arch’ [-Wmissing-prototypes]
char *efi_systab_show_arch(char *str)

Signed-off-by: Benjamin Thiel <b.thiel@posteo.de>
Link: https://lore.kernel.org/r/20200516132647.14568-1-b.thiel@posteo.de
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-05-17 11:46:50 +02:00
Christoph Paasch
a0c1d0eafd mptcp: Use 32-bit DATA_ACK when possible
RFC8684 allows to send 32-bit DATA_ACKs as long as the peer is not
sending 64-bit data-sequence numbers. The 64-bit DSN is only there for
extreme scenarios when a very high throughput subflow is combined with a
long-RTT subflow such that the high-throughput subflow wraps around the
32-bit sequence number space within an RTT of the high-RTT subflow.

It is thus a rare scenario and we should try to use the 32-bit DATA_ACK
instead as long as possible. It allows to reduce the TCP-option overhead
by 4 bytes, thus makes space for an additional SACK-block. It also makes
tcpdumps much easier to read when the DSN and DATA_ACK are both either
32 or 64-bit.

Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 13:51:10 -07:00
Linus Torvalds
5d438e071f Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "A new testcase for guest debugging (gdbstub) that exposed a bunch of
  bugs, mostly for AMD processors. And a few other x86 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
  KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c
  KVM: SVM: Disable AVIC before setting V_IRQ
  KVM: Introduce kvm_make_all_cpus_request_except()
  KVM: VMX: pass correct DR6 for GD userspace exit
  KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6
  KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6
  KVM: nSVM: trap #DB and #BP to userspace if guest debugging is on
  KVM: selftests: Add KVM_SET_GUEST_DEBUG test
  KVM: X86: Fix single-step with KVM_SET_GUEST_DEBUG
  KVM: X86: Set RTM for DB_VECTOR too for KVM_EXIT_DEBUG
  KVM: x86: fix DR6 delivery for various cases of #DB injection
  KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
2020-05-16 13:39:22 -07:00
Christoph Hellwig
2771cefeac block: remove the REQ_NOWAIT_INLINE flag
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-16 14:23:54 -06:00
Takashi Iwai
b9f2d35f05 ALSA: hda: Unexport some local helper functions
snd_hdac_bus_queue_event() and snd_hdac_bus_exec_verb() are used only
internally in HD-audio core.  Let's drop the exports and move the
declarations into local.h.

Link: https://lore.kernel.org/r/20200516062854.22141-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-16 08:29:49 +02:00
Takashi Iwai
6325c7fade ALSA: hda: Drop unused snd_hda_queue_unsol_event()
The inline function is nowhere used.  Drop it.

Link: https://lore.kernel.org/r/20200516062854.22141-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-16 08:29:35 +02:00
Arnd Bergmann
a7f6e07724 Merge tag 'tegra-for-5.8-arm-core' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/soc
ARM: tegra: Core changes for v5.8-rc1

This contains core changes needed for the CPU frequency scaling and CPU
idle drivers on Tegra20 and Tegra30.

* tag 'tegra-for-5.8-arm-core' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  ARM: tegra: Create tegra20-cpufreq platform device on Tegra30
  ARM: tegra: Don't enable PLLX while resuming from LP1 on Tegra30
  ARM: tegra: Switch CPU to PLLP on resume from LP1 on Tegra30/114/124
  ARM: tegra: Correct PL310 Auxiliary Control Register initialization
  ARM: tegra: Do not fully reinitialize L2 on resume
  ARM: tegra: Initialize r0 register for firmware wake-up
  firmware: tf: Different way of L2 cache enabling after LP2 suspend
  firmware: tegra: Make BPMP a regular driver

Link: https://lore.kernel.org/r/20200515145311.1580134-10-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-05-15 23:12:29 +02:00
Arnd Bergmann
a875e0e5a2 Merge tag 'vexpress-modules-for-soc-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux into arm/soc
VExpress modularization

This series enables building various Versatile Express platform drivers
as modules. The primary target is the Fast Model FVP which is supported
in Android. As Android is moving towards their GKI, or generic kernel,
the hardware support has to be in modules. Currently ARCH_VEXPRESS
enables several built-in only drivers. Some of these are needed, but
some are only needed for older 32-bit VExpress platforms and can just
be disabled.

* tag 'vexpress-modules-for-soc-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  ARM: vexpress: Don't select VEXPRESS_CONFIG
  bus: vexpress-config: Support building as module
  vexpress: Move setting master site to vexpress-config bus
  bus: vexpress-config: simplify config bus probing
  bus: vexpress-config: Merge vexpress-syscfg into vexpress-config
  mfd: vexpress-sysreg: Support building as a module
  mfd: vexpress-sysreg: Use devres API variants
  mfd: vexpress-sysreg: Drop unused syscon child devices
  mfd: vexpress-sysreg: Drop selecting CONFIG_CLKSRC_MMIO
  clk: vexpress-osc: Support building as a module
  clk: vexpress-osc: Use the devres clock API variants
  clk: versatile: Only enable SP810 on 32-bit by default
  clk: versatile: Rework kconfig structure
  amba: Retry adding deferred devices at late_initcall
  arm64: vexpress: Don't select CONFIG_POWER_RESET_VEXPRESS
  ARM: vexpress: Move vexpress_flags_set() into arch code

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-05-15 23:04:40 +02:00
Rob Herring
e5006671ac clk: versatile: Drop the legacy IM-PD1 clock code
Now that the non-DT IM-PD1 support code has been removed, drop the clock
related code from clk-impd1.c.

Link: https://lore.kernel.org/r/20200428204945.21067-1-robh@kernel.org
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-clk@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-05-15 22:54:23 +02:00
Eric Biggers
8b85996095 linux/parser.h: add include guards
<linux/parser.h> is missing include guards.  Add them.

This is needed to allow declaring a function in <linux/fscrypt.h> that
takes a substring_t parameter.

Link: https://lore.kernel.org/r/20200512233251.118314-2-ebiggers@kernel.org
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-05-15 13:51:28 -07:00
David S. Miller
da07f52d3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Move the bpf verifier trace check into the new switch statement in
HEAD.

Resolve the overlapping changes in hinic, where bug fixes overlap
the addition of VF support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 13:48:59 -07:00
Linus Torvalds
f85c1598dd Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix sk_psock reference count leak on receive, from Xiyu Yang.

 2) CONFIG_HNS should be invisible, from Geert Uytterhoeven.

 3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this,
    from Maciej Żenczykowski.

 4) ipv4 route redirect backoff wasn't actually enforced, from Paolo
    Abeni.

 5) Fix netprio cgroup v2 leak, from Zefan Li.

 6) Fix infinite loop on rmmod in conntrack, from Florian Westphal.

 7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet.

 8) Various bpf probe handling fixes, from Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
  selftests: mptcp: pm: rm the right tmp file
  dpaa2-eth: properly handle buffer size restrictions
  bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
  bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
  bpf: Restrict bpf_probe_read{, str}() only to archs where they work
  MAINTAINERS: Mark networking drivers as Maintained.
  ipmr: Add lockdep expression to ipmr_for_each_table macro
  ipmr: Fix RCU list debugging warning
  drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
  net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
  tcp: fix error recovery in tcp_zerocopy_receive()
  MAINTAINERS: Add Jakub to networking drivers.
  MAINTAINERS: another add of Karsten Graul for S390 networking
  drivers: ipa: fix typos for ipa_smp2p structure doc
  pppoe: only process PADT targeted at local interfaces
  selftests/bpf: Enforce returning 0 for fentry/fexit programs
  bpf: Enforce returning 0 for fentry/fexit progs
  net: stmmac: fix num_por initialization
  security: Fix the default value of secid_to_secctx hook
  libbpf: Fix register naming in PT_REGS s390 macros
  ...
2020-05-15 13:10:06 -07:00
Paolo Abeni
2f8a397d0a inet_connection_sock: factor out destroy helper.
Move the steps to prepare an inet_connection_sock for
forced disposal inside a separate helper. No functional
changes inteded, this will just simplify the next patch.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 12:30:13 -07:00
Paolo Abeni
90bf45134d mptcp: add new sock flag to deal with join subflows
MP_JOIN subflows must not land into the accept queue.
Currently tcp_check_req() calls an mptcp specific helper
to detect such scenario.

Such helper leverages the subflow context to check for
MP_JOIN subflows. We need to deal also with MP JOIN
failures, even when the subflow context is not available
due allocation failure.

A possible solution would be changing the syn_recv_sock()
signature to allow returning a more descriptive action/
error code and deal with that in tcp_check_req().

Since the above need is MPTCP specific, this patch instead
uses a TCP request socket hole to add a MPTCP specific flag.
Such flag is used by the MPTCP syn_recv_sock() to tell
tcp_check_req() how to deal with the request socket.

This change is a no-op for !MPTCP build, and makes the
MPTCP code simpler. It allows also the next patch to deal
correctly with MP JOIN failure.

v1 -> v2:
 - be more conservative on drop_req initialization (Mat)

RFC -> v1:
 - move the drop_req bit inside tcp_request_sock (Eric)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 12:30:13 -07:00
Stefano Garzarella
7e55a19cf6 io_uring: add IORING_CQ_EVENTFD_DISABLED to the CQ ring flags
This new flag should be set/clear from the application to
disable/enable eventfd notifications when a request is completed
and queued to the CQ ring.

Before this patch, notifications were always sent if an eventfd is
registered, so IORING_CQ_EVENTFD_DISABLED is not set during the
initialization.

It will be up to the application to set the flag after initialization
if no notifications are required at the beginning.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-15 12:16:59 -06:00
Stefano Garzarella
0d9b5b3af1 io_uring: add 'cq_flags' field for the CQ ring
This patch adds the new 'cq_flags' field that should be written by
the application and read by the kernel.

This new field is available to the userspace application through
'cq_off.flags'.
We are using 4-bytes previously reserved and set to zero. This means
that if the application finds this field to zero, then the new
functionality is not supported.

In the next patch we will introduce the first flag available.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-15 12:16:59 -06:00
David S. Miller
8e1381049e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-05-15

The following pull-request contains BPF updates for your *net* tree.

We've added 9 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 137 insertions(+), 43 deletions(-).

The main changes are:

1) Fix secid_to_secctx LSM hook default value, from Anders.

2) Fix bug in mmap of bpf array, from Andrii.

3) Restrict bpf_probe_read to archs where they work, from Daniel.

4) Enforce returning 0 for fentry/fexit progs, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:57:21 -07:00
Kevin Lo
b0ed0bbfb3 net: phy: broadcom: add support for BCM54811 PHY
The BCM54811 PHY shares many similarities with the already supported BCM54810
PHY but additionally requires some semi-unique configuration.

Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:56:31 -07:00
David S. Miller
3430223d39 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-05-15

The following pull-request contains BPF updates for your *net-next* tree.

We've added 37 non-merge commits during the last 1 day(s) which contain
a total of 67 files changed, 741 insertions(+), 252 deletions(-).

The main changes are:

1) bpf_xdp_adjust_tail() now allows to grow the tail as well, from Jesper.

2) bpftool can probe CONFIG_HZ, from Daniel.

3) CAP_BPF is introduced to isolate user processes that use BPF infra and
   to secure BPF networking services by dropping CAP_SYS_ADMIN requirement
   in certain cases, from Alexei.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:43:52 -07:00
Vlad Buslov
ca44b738e5 net: sched: implement terse dump support in act
Extend tcf_action_dump() with boolean argument 'terse' that is used to
request terse-mode action dump. In terse mode only essential data needed to
identify particular action (action kind, cookie, etc.) and its stats is put
to resulting skb and everything else is omitted. Implement
tcf_exts_terse_dump() helper in cls API that is intended to be used to
request terse dump of all exts (actions) attached to the filter.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:23:11 -07:00
Vlad Buslov
f8ab1807a9 net: sched: introduce terse dump flag
Add new TCA_DUMP_FLAGS attribute and use it in cls API to request terse
filter output from classifiers with TCA_DUMP_FLAGS_TERSE flag. This option
is intended to be used to improve performance of TC filter dump when
userland only needs to obtain stats and not the whole classifier/action
data. Extend struct tcf_proto_ops with new terse_dump() callback that must
be defined by supporting classifier implementations.

Support of the options in specific classifiers and actions is
implemented in following patches in the series.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:23:11 -07:00
Linus Torvalds
1742bcd0cb Merge tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Things look good and calming down; the only change to ALSA core is the
  fix for racy rawmidi buffer accesses spotted by syzkaller, and the
  rest are all small device-specific quirks for HD-audio and USB-audio
  devices"

* tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Limit int mic boost for Thinkpad T530
  ALSA: hda/realtek - Add COEF workaround for ASUS ZenBook UX431DA
  ALSA: hda/realtek: Enable headset mic of ASUS UX581LV with ALC295
  ALSA: hda/realtek - Enable headset mic of ASUS UX550GE with ALC295
  ALSA: hda/realtek - Enable headset mic of ASUS GL503VM with ALC295
  ALSA: hda/realtek: Add quirk for Samsung Notebook
  ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
  ALSA: usb-audio: add mapping for ASRock TRX40 Creator
  ALSA: hda/realtek - Fix S3 pop noise on Dell Wyse
  Revert "ALSA: hda/realtek: Fix pop noise on ALC225"
  ALSA: firewire-lib: fix 'function sizeof not defined' error of tracepoints format
  ALSA: usb-audio: Add control message quirk delay for Kingston HyperX headset
2020-05-15 10:06:49 -07:00
Linus Torvalds
e7cea79058 Merge tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "As mentioned last week an i915 PR came in late, but I left it, so the
  i915 bits of this cover 2 weeks, which is why it's likely a bit larger
  than usual.

  Otherwise it's mostly amdgpu fixes, one tegra fix, one meson fix.

  i915:
   - Handle idling during i915_gem_evict_something busy loops (Chris)
   - Mark current submissions with a weak-dependency (Chris)
   - Propagate error from completed fences (Chris)
   - Fixes on execlist to avoid GPU hang situation (Chris)
   - Fixes couple deadlocks (Chris)
   - Timeslice preemption fixes (Chris)
   - Fix Display Port interrupt handling on Tiger Lake (Imre)
   - Reduce debug noise around Frame Buffer Compression (Peter)
   - Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan)
   - Avoid dereferencing a dead context (Chris)

  tegra:
   - tegra120/4 smmu fixes

  amdgpu:
   - Clockgating fixes
   - Fix fbdev with scatter/gather display
   - S4 fix for navi
   - Soft recovery for gfx10
   - Freesync fixes
   - Atomic check cursor fix
   - Add a gfxoff quirk
   - MST fix

  amdkfd:
   - Fix GEM reference counting

  meson:
   - error code propogation fix"

* tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm: (29 commits)
  drm/i915: Handle idling during i915_gem_evict_something busy loops
  drm/meson: pm resume add return errno branch
  drm/amd/amdgpu: Update update_config() logic
  drm/amd/amdgpu: add raven1 part to the gfxoff quirk list
  drm/i915: Mark concurrent submissions with a weak-dependency
  drm/i915: Propagate error from completed fences
  drm/i915/gvt: Fix kernel oops for 3-level ppgtt guest
  drm/i915/gvt: Init DPLL/DDI vreg for virtual display instead of inheritance.
  drm/amd/display: add basic atomic check for cursor plane
  drm/amd/display: Fix vblank and pageflip event handling for FreeSync
  drm/amdgpu: implement soft_recovery for gfx10
  drm/amdgpu: enable hibernate support on Navi1X
  drm/amdgpu: Use GEM obj reference for KFD BOs
  drm/amdgpu: force fbdev into vram
  drm/amd/powerplay: perform PG ungate prior to CG ungate
  drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate
  drm/amdgpu: disable MGCG/MGLS also on gfx CG ungate
  drm/i915/execlists: Track inflight CCID
  drm/i915/execlists: Avoid reusing the same logical CCID
  drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane
  ...
2020-05-15 09:59:49 -07:00
Sami Tolvanen
628d06a48f scs: Add page accounting for shadow call stack allocations
This change adds accounting for the memory allocated for shadow stacks.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-15 16:35:49 +01:00
Sami Tolvanen
d08b9f0ca6 scs: Add support for Clang's Shadow Call Stack (SCS)
This change adds generic support for Clang's Shadow Call Stack,
which uses a shadow stack to protect return addresses from being
overwritten by an attacker. Details are available here:

  https://clang.llvm.org/docs/ShadowCallStack.html

Note that security guarantees in the kernel differ from the ones
documented for user space. The kernel must store addresses of
shadow stacks in memory, which means an attacker capable reading
and writing arbitrary memory may be able to locate them and hijack
control flow by modifying the stacks.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
[will: Numerous cosmetic changes]
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-15 16:35:45 +01:00
Alexei Starovoitov
2c78ee898d bpf: Implement CAP_BPF
Implement permissions as stated in uapi/linux/capability.h
In order to do that the verifier allow_ptr_leaks flag is split
into four flags and they are set as:
  env->allow_ptr_leaks = bpf_allow_ptr_leaks();
  env->bypass_spec_v1 = bpf_bypass_spec_v1();
  env->bypass_spec_v4 = bpf_bypass_spec_v4();
  env->bpf_capable = bpf_capable();

The first three currently equivalent to perfmon_capable(), since leaking kernel
pointers and reading kernel memory via side channel attacks is roughly
equivalent to reading kernel memory with cap_perfmon.

'bpf_capable' enables bounded loops, precision tracking, bpf to bpf calls and
other verifier features. 'allow_ptr_leaks' enable ptr leaks, ptr conversions,
subtraction of pointers. 'bypass_spec_v1' disables speculative analysis in the
verifier, run time mitigations in bpf array, and enables indirect variable
access in bpf programs. 'bypass_spec_v4' disables emission of sanitation code
by the verifier.

That means that the networking BPF program loaded with CAP_BPF + CAP_NET_ADMIN
will have speculative checks done by the verifier and other spectre mitigation
applied. Such networking BPF program will not be able to leak kernel pointers
and will not be able to access arbitrary kernel memory.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200513230355.7858-3-alexei.starovoitov@gmail.com
2020-05-15 17:29:41 +02:00
Alexei Starovoitov
a17b53c4a4 bpf, capability: Introduce CAP_BPF
Split BPF operations that are allowed under CAP_SYS_ADMIN into
combination of CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN.
For backward compatibility include them in CAP_SYS_ADMIN as well.

The end result provides simple safety model for applications that use BPF:
- to load tracing program types
  BPF_PROG_TYPE_{KPROBE, TRACEPOINT, PERF_EVENT, RAW_TRACEPOINT, etc}
  use CAP_BPF and CAP_PERFMON
- to load networking program types
  BPF_PROG_TYPE_{SCHED_CLS, XDP, SK_SKB, etc}
  use CAP_BPF and CAP_NET_ADMIN

There are few exceptions from this rule:
- bpf_trace_printk() is allowed in networking programs, but it's using
  tracing mechanism, hence this helper needs additional CAP_PERFMON
  if networking program is using this helper.
- BPF_F_ZERO_SEED flag for hash/lru map is allowed under CAP_SYS_ADMIN only
  to discourage production use.
- BPF HW offload is allowed under CAP_SYS_ADMIN.
- bpf_probe_write_user() is allowed under CAP_SYS_ADMIN only.

CAPs are not checked at attach/detach time with two exceptions:
- loading BPF_PROG_TYPE_CGROUP_SKB is allowed for unprivileged users,
  hence CAP_NET_ADMIN is required at attach time.
- flow_dissector detach doesn't check prog FD at detach,
  hence CAP_NET_ADMIN is required at detach time.

CAP_SYS_ADMIN is required to iterate BPF objects (progs, maps, links) via get_next_id
command and convert them to file descriptor via GET_FD_BY_ID command.
This restriction guarantees that mutliple tasks with CAP_BPF are not able to
affect each other. That leads to clean isolation of tasks. For example:
task A with CAP_BPF and CAP_NET_ADMIN loads and attaches a firewall via bpf_link.
task B with the same capabilities cannot detach that firewall unless
task A explicitly passed link FD to task B via scm_rights or bpffs.
CAP_SYS_ADMIN can still detach/unload everything.

Two networking user apps with CAP_SYS_ADMIN and CAP_NET_ADMIN can
accidentely mess with each other programs and maps.
Two networking user apps with CAP_NET_ADMIN and CAP_BPF cannot affect each other.

CAP_NET_ADMIN + CAP_BPF allows networking programs access only packet data.
Such networking progs cannot access arbitrary kernel memory or leak pointers.

bpftool, bpftrace, bcc tools binaries should NOT be installed with
CAP_BPF and CAP_PERFMON, since unpriv users will be able to read kernel secrets.
But users with these two permissions will be able to use these tracing tools.

CAP_PERFMON is least secure, since it allows kprobes and kernel memory access.
CAP_NET_ADMIN can stop network traffic via iproute2.
CAP_BPF is the safest from security point of view and harmless on its own.

Having CAP_BPF and/or CAP_NET_ADMIN is not enough to write into arbitrary map
and if that map is used by firewall-like bpf prog.
CAP_BPF allows many bpf prog_load commands in parallel. The verifier
may consume large amount of memory and significantly slow down the system.

Existing unprivileged BPF operations are not affected.
In particular unprivileged users are allowed to load socket_filter and cg_skb
program types and to create array, hash, prog_array, map-in-map map types.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200513230355.7858-2-alexei.starovoitov@gmail.com
2020-05-15 17:29:41 +02:00
Saravana Kannan
716a7a2596 driver core: fw_devlink: Add support for batching fwnode parsing
The amount of time spent parsing fwnodes of devices can become really
high if the devices are added in an non-ideal order. Worst case can be
O(N^2) when N devices are added. But this can be optimized to O(N) by
adding all the devices and then parsing all their fwnodes in one batch.

This commit adds fw_devlink_pause() and fw_devlink_resume() to allow
doing this.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20200515053500.215929-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15 16:34:52 +02:00
Gabriel Krisman Bertazi
087615bf3a dm mpath: pass IO start time to path selector
The HST path selector needs this information to perform path
prediction. For request-based mpath, struct request's io_start_time_ns
is used, while for bio-based, use the start_time stored in dm_io.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-05-15 10:29:36 -04:00
Mikulas Patocka
6fbeb0048e dm bufio: implement discard
Add functions dm_bufio_issue_discard and dm_bufio_discard_buffers.
dm_bufio_issue_discard sends discard request to the underlying device.
dm_bufio_discard_buffers frees buffers in the range and then calls
dm_bufio_issue_discard.

Also, factor out block_to_sector for reuse in dm_bufio_issue_discard.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-05-15 10:29:35 -04:00
Greg Kroah-Hartman
cef077e6aa Merge tag 'iio-for-5.8b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:

Second set of new device support, cleanups and features for IIO in the 5.8 cycle

Usual mixed back but with a few subsystem wide or device type
wide cleanups.

New device support

* adis16475
  - New driver supporting adis16470, adis16475, adis16477, adis16465,
    adis16467, adis16500, adis16505 and adis16507.
    Includes some rework of the adis library to simplify using it
    for this new driver.
* ak8974
  - Add support for Alps hscdt008a. ID only. Related patches add support
    for scale.
* atlas-sensor
  - Add support for RTD-SM OEM temperature sensor.
* cm32181
  - Add support for CM3218 including support for SMBUS alert via
    ACPI resources.
* ltc2632
  - Add support for ltc2634-12/10/8 DACS including handling per
    device type numbers of channels.

Major Features

* cm32181
  - ACPI bindings including parsing CPM0 and CPM1 custom ACPI tables.
    Includes minor tidy ups and fixes.
* vcnl4000
  - Add event support
  - Add buffered data capture support
  - Add control of sampling frequency

Cleanups and minor fixes.

* core
  - Trivial rework of iio_device_alloc to use an early return and
    improve readability.
  - Precursors to addition of multiple buffer support. So far
    minor refactoring.
* subsystem wide
  - Use get_unaligned_be24 slightly improve readability over open
    coding it.
* adis drivers
  - Use iio_get_debugfs_dentry access function.
* bh1780, cm32181, cm3232, gp2ap02a00f, opt3001, st_uvis25, vl6180,
  dmard06, kxsd9
  - Drop use of of_match_ptr to allow ACPI based probing via PRP0001.
    Part of clear out of this to avoid cut and paste into new drivers.
* ad5592r, ad5593r
  - Fix typos
* ad5933
  - Use managed interfaces to automate error handling and remove.
* ak8974
  - Fix wrong number of 'real bits' for buffered data.
  - Refactor to pull measurement code out as separate function.
    bmp280
  - Fix lack of clamp on range during data capture.
* at91-sama5d2_adc
  - Handle unfinished conversions correctly.
  - Allow use of triggers other than it's own.
  - Reorganize buffer setup and tear down as part of long running
    subsystem wide rework.
* ccs811
  - Add DT binding docs and match table.
  - Support external reset and wakeup pins.
* hid-sensors
  - Reorganize buffer setup and tear down as part of long running
    subsystem wide rework.
* ltr501
  - Constify some structs.
* vcnl4000
  - Fix an endian issue by using explicit byte swapped i2c accessors.

* tag 'iio-for-5.8b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (74 commits)
  iio: light: ltr501: Constify structs
  staging: iio: ad5933: attach life-cycle of kfifo buffer to parent device and use managed calls throughout
  iio: bmp280: fix compensation of humidity
  iio: light: cm32181: Fix integartion time typo
  iio: light: cm32181: Add support for parsing CPM0 and CPM1 ACPI tables
  iio: light: cm32181: Make lux_per_bit and lux_per_bit_base_it runtime settings
  iio: light: cm32181: Use units of 1/100000th for calibscale and lux_per_bit
  iio: light: cm32181: Change reg_init to use a bitmap of which registers to init
  iio: light: cm32181: Handle CM3218 ACPI devices with 2 I2C resources
  iio: light: cm32181: Clean up the probe function a bit
  iio: light: cm32181: Add support for the CM3218
  iio: light: cm32181: Add some extra register defines
  iio: light: cm32181: Add support for ACPI enumeration
  iio: light: cm32181: Switch to new style i2c-driver probe function
  iio: hid-sensors: move triggered buffer setup into hid_sensor_setup_trigger
  iio: vcnl4000: Add buffer support for VCNL4010/20.
  iio: vcnl4000: Add sampling frequency support for VCNL4010/20.
  iio: vcnl4000: Add event support for VCNL4010/20.
  iio: vcnl4000: Factorize data reading and writing.
  iio: vcnl4000: Fix i2c swapped word reading.
  ...
2020-05-15 16:03:28 +02:00
Vinod Koul
ff4c65ca48 usb: hci: add hc_driver as argument for usb_hcd_pci_probe
usb_hcd_pci_probe expects users to call this with driver_data set as
hc_driver, that limits the possibility of using the driver_data for
driver data.

Add hc_driver as argument to usb_hcd_pci_probe and modify the callers
ehci/ohci/xhci/uhci to pass hc_driver as argument and freeup the
driver_data used

Tested xhci driver on Dragon-board RB3, compile tested ehci, ohci and
uhci.

[For all but the xHCI parts]
[For the xhci part]

Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20200514122039.300417-2-vkoul@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15 15:44:34 +02:00
Emil Velikov
7fffe31d3e tty/sysrq: constify the the sysrq_key_op(s)
All the users threat them as immutable - annotate them as such.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Link: https://lore.kernel.org/r/20200513214351.2138580-3-emil.l.velikov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15 14:53:19 +02:00
Emil Velikov
23cbedf812 tty/sysrq: constify the sysrq API
The user is not supposed to thinker with the underlying sysrq_key_op.
Make that explicit by adding a handful of const notations.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Link: https://lore.kernel.org/r/20200513214351.2138580-2-emil.l.velikov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15 14:53:19 +02:00
Emil Velikov
0f1c9688a1 tty/sysrq: alpha: export and use __sysrq_get_key_op()
Export a pointer to the sysrq_get_key_op(). This way we can cleanly
unregister it, instead of the current solutions of modifuing it inplace.

Since __sysrq_get_key_op() is no longer used externally, let's make it
a static function.

This patch will allow us to limit access to each and every sysrq op and
constify the sysrq handling.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-kernel@vger.kernel.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Link: https://lore.kernel.org/r/20200513214351.2138580-1-emil.l.velikov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15 14:53:18 +02:00
Lukas Wunner
c150c0f362 serial: Allow uart_get_rs485_mode() to return errno
We're about to amend uart_get_rs485_mode() to support a GPIO pin for
rs485 bus termination.  Retrieving the GPIO descriptor may fail, so
allow uart_get_rs485_mode() to return an errno and change all callers
to check for failure.

The GPIO descriptor is going to be stored in struct uart_port.  Pass
that struct to uart_get_rs485_mode() in lieu of a struct device and
struct serial_rs485, both of which are directly accessible from struct
uart_port.

A few drivers call uart_get_rs485_mode() before setting the struct
device pointer in struct uart_port.  Shuffle those calls around where
necessary.

[Heiko Stuebner did the ar933x_uart.c portion, hence his Signed-off-by.]

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/271e814af4b0db3bffbbb74abf2b46b75add4516.1589285873.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15 14:47:05 +02:00
Sai Praneeth Prakhya
69cf449166 iommu: Remove functions that support private domain
After moving iommu_group setup to iommu core code [1][2] and removing
private domain support in vt-d [3], there are no users for functions such
as iommu_request_dm_for_dev(), iommu_request_dma_domain_for_dev() and
request_default_domain_for_dev(). So, remove these functions.

[1] commit dce8d6964e ("iommu/amd: Convert to probe/release_device()
    call-backs")
[2] commit e5d1841f18 ("iommu/vt-d: Convert to probe/release_device()
    call-backs")
[3] commit 327d5b2fee ("iommu/vt-d: Allow 32bit devices to uses DMA
    domain")

Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200513224721.20504-1-sai.praneeth.prakhya@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-15 12:00:36 +02:00
Borislav Petkov
a9a3ed1eff x86: Fix early boot crash on gcc-10, third try
... or the odyssey of trying to disable the stack protector for the
function which generates the stack canary value.

The whole story started with Sergei reporting a boot crash with a kernel
built with gcc-10:

  Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
  Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
  Call Trace:
    dump_stack
    panic
    ? start_secondary
    __stack_chk_fail
    start_secondary
    secondary_startup_64
  -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary

This happens because gcc-10 tail-call optimizes the last function call
in start_secondary() - cpu_startup_entry() - and thus emits a stack
canary check which fails because the canary value changes after the
boot_init_stack_canary() call.

To fix that, the initial attempt was to mark the one function which
generates the stack canary with:

  __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)

however, using the optimize attribute doesn't work cumulatively
as the attribute does not add to but rather replaces previously
supplied optimization options - roughly all -fxxx options.

The key one among them being -fno-omit-frame-pointer and thus leading to
not present frame pointer - frame pointer which the kernel needs.

The next attempt to prevent compilers from tail-call optimizing
the last function call cpu_startup_entry(), shy of carving out
start_secondary() into a separate compilation unit and building it with
-fno-stack-protector, was to add an empty asm("").

This current solution was short and sweet, and reportedly, is supported
by both compilers but we didn't get very far this time: future (LTO?)
optimization passes could potentially eliminate this, which leads us
to the third attempt: having an actual memory barrier there which the
compiler cannot ignore or move around etc.

That should hold for a long time, but hey we said that about the other
two solutions too so...

Reported-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Kalle Valo <kvalo@codeaurora.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org
2020-05-15 11:48:01 +02:00
Gustavo A. R. Silva
8695e0b1b9 i2c: mux: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-05-15 11:23:49 +02:00
Georgi Djakov
b35da2e86f Merge branch 'icc-get-by-index' into icc-next
This is an immutable branch shared with the OPP tree. It contains also
the patches to convert the interconnect framework from tristate to bool
after Greg agreed with that. This will make the integration between
the OPP layer and interconnect much easier.

* icc-get-by-index:
  interconnect: Add of_icc_get_by_index() helper function
  interconnect: Disallow interconnect core to be built as a module
  interconnect: Remove unused module exit code from core

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
2020-05-15 10:46:18 +03:00
Jesper Dangaard Brouer
c8741e2bfe xdp: Allow bpf_xdp_adjust_tail() to grow packet size
Finally, after all drivers have a frame size, allow BPF-helper
bpf_xdp_adjust_tail() to grow or extend packet size at frame tail.

Remember that helper/macro xdp_data_hard_end have reserved some
tailroom.  Thus, this helper makes sure that the BPF-prog don't have
access to this tailroom area.

V2: Remove one chicken check and use WARN_ONCE for other

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158945348530.97035.12577148209134239291.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
2a637c5b1a xdp: For Intel AF_XDP drivers add XDP frame_sz
Intel drivers implement native AF_XDP zerocopy in separate C-files,
that have its own invocation of bpf_prog_run_xdp(). The setup of
xdp_buff is also handled in separately from normal code path.

This patch update XDP frame_sz for AF_XDP zerocopy drivers i40e, ice
and ixgbe, as the code changes needed are very similar.  Introduce a
helper function xsk_umem_xdp_frame_sz() for calculating frame size.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/158945347511.97035.8536753731329475655.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
34cc0b338a xdp: Xdp_frame add member frame_sz and handle in convert_to_xdp_frame
Use hole in struct xdp_frame, when adding member frame_sz, which keeps
same sizeof struct (32 bytes)

Drivers ixgbe and sfc had bug cases where the necessary/expected
tailroom was not reserved. This can lead to some hard to catch memory
corruption issues. Having the drivers frame_sz this can be detected when
packet length/end via xdp->data_end exceed the xdp_data_hard_end
pointer, which accounts for the reserved the tailroom.

When detecting this driver issue, simply fail the conversion with NULL,
which results in feedback to driver (failing xdp_do_redirect()) causing
driver to drop packet. Given the lack of consistent XDP stats, this can
be hard to troubleshoot. And given this is a driver bug, we want to
generate some more noise in form of a WARN stack dump (to ID the driver
code that inlined convert_to_xdp_frame).

Inlining the WARN macro is problematic, because it adds an asm
instruction (on Intel CPUs ud2) what influence instruction cache
prefetching. Thus, introduce xdp_warn and macro XDP_WARN, to avoid this
and at the same time make identifying the function and line of this
inlined function easier.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158945337313.97035.10015729316710496600.stgit@firesoul
2020-05-14 21:21:54 -07:00
Jesper Dangaard Brouer
f95f0f95cf xdp: Add frame size to xdp_buff
XDP have evolved to support several frame sizes, but xdp_buff was not
updated with this information. The frame size (frame_sz) member of
xdp_buff is introduced to know the real size of the memory the frame is
delivered in.

When introducing this also make it clear that some tailroom is
reserved/required when creating SKBs using build_skb().

It would also have been an option to introduce a pointer to
data_hard_end (with reserved offset). The advantage with frame_sz is
that (like rxq) drivers only need to setup/assign this value once per
NAPI cycle. Due to XDP-generic (and some drivers) it's not possible to
store frame_sz inside xdp_rxq_info, because it's varies per packet as it
can be based/depend on packet length.

V2: nitpick: deduct -> deduce

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158945334261.97035.555255657490688547.stgit@firesoul
2020-05-14 21:21:54 -07:00
David S. Miller
d00f26b623 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-05-14

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Merged tag 'perf-for-bpf-2020-05-06' from tip tree that includes CAP_PERFMON.

2) support for narrow loads in bpf_sock_addr progs and additional
   helpers in cg-skb progs, from Andrey.

3) bpf benchmark runner, from Andrii.

4) arm and riscv JIT optimizations, from Luke.

5) bpf iterator infrastructure, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 20:31:21 -07:00