Commit Graph

1072983 Commits

Author SHA1 Message Date
Krzysztof Kozlowski
dded08927c nfc: llcp: fix NULL error pointer dereference on sendmsg() after failed bind()
Syzbot detected a NULL pointer dereference of nfc_llcp_sock->dev pointer
(which is a 'struct nfc_dev *') with calls to llcp_sock_sendmsg() after
a failed llcp_sock_bind(). The message being sent is a SOCK_DGRAM.

KASAN report:

  BUG: KASAN: null-ptr-deref in nfc_alloc_send_skb+0x2d/0xc0
  Read of size 4 at addr 00000000000005c8 by task llcp_sock_nfc_a/899

  CPU: 5 PID: 899 Comm: llcp_sock_nfc_a Not tainted 5.16.0-rc6-next-20211224-00001-gc6437fbf18b0 #125
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x45/0x59
   ? nfc_alloc_send_skb+0x2d/0xc0
   __kasan_report.cold+0x117/0x11c
   ? mark_lock+0x480/0x4f0
   ? nfc_alloc_send_skb+0x2d/0xc0
   kasan_report+0x38/0x50
   nfc_alloc_send_skb+0x2d/0xc0
   nfc_llcp_send_ui_frame+0x18c/0x2a0
   ? nfc_llcp_send_i_frame+0x230/0x230
   ? __local_bh_enable_ip+0x86/0xe0
   ? llcp_sock_connect+0x470/0x470
   ? llcp_sock_connect+0x470/0x470
   sock_sendmsg+0x8e/0xa0
   ____sys_sendmsg+0x253/0x3f0
   ...

The issue was visible only with multiple simultaneous calls to bind() and
sendmsg(), which resulted in most of the bind() calls to fail.  The
bind() was failing on checking if there is available WKS/SDP/SAP
(respective bit in 'struct nfc_llcp_local' fields).  When there was no
available WKS/SDP/SAP, the bind returned error but the sendmsg() to such
socket was able to trigger mentioned NULL pointer dereference of
nfc_llcp_sock->dev.

The code looks simply racy and currently it protects several paths
against race with checks for (!nfc_llcp_sock->local) which is NULL-ified
in error paths of bind().  The llcp_sock_sendmsg() did not have such
check but called function nfc_llcp_send_ui_frame() had, although not
protected with lock_sock().

Therefore the race could look like (same socket is used all the time):
  CPU0                                     CPU1
  ====                                     ====
  llcp_sock_bind()
  - lock_sock()
    - success
  - release_sock()
  - return 0
                                           llcp_sock_sendmsg()
                                           - lock_sock()
                                           - release_sock()
  llcp_sock_bind(), same socket
  - lock_sock()
    - error
                                           - nfc_llcp_send_ui_frame()
                                             - if (!llcp_sock->local)
    - llcp_sock->local = NULL
    - nfc_put_device(dev)
                                             - dereference llcp_sock->dev
  - release_sock()
  - return -ERRNO

The nfc_llcp_send_ui_frame() checked llcp_sock->local outside of the
lock, which is racy and ineffective check.  Instead, its caller
llcp_sock_sendmsg(), should perform the check inside lock_sock().

Reported-and-tested-by: syzbot+7f23bcddf626e0593a39@syzkaller.appspotmail.com
Fixes: b874dec21d ("NFC: Implement LLCP connection less Tx path")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 14:11:30 +00:00
David S. Miller
8c8963b27e Merge branch 'axienet-fixes'
Robert Hancock says:

====================
Xilinx axienet fixes

Various fixes for the Xilinx AXI Ethernet driver.

Changed since v2:
-added Reviewed-by tags, added some explanation to commit
messages, no code changes

Changed since v1:
-corrected a Fixes tag to point to mainline commit
-split up reset changes into 3 patches
-added ratelimit on netdev_warn in TX busy case
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:15 +00:00
Robert Hancock
2d19c3fd80 net: axienet: increase default TX ring size to 128
With previous changes to make the driver handle the TX ring size more
correctly, the default TX ring size of 64 appears to significantly
bottleneck TX performance to around 600 Mbps on a 1 Gbps link on ZynqMP.
Increasing this to 128 seems to bring performance up to near line rate and
shouldn't cause excess bufferbloat (this driver doesn't yet support modern
byte-based queue management).

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
bb193e3db8 net: axienet: fix for TX busy handling
Network driver documentation indicates we should be avoiding returning
NETDEV_TX_BUSY from ndo_start_xmit in normal cases, since it requires
the packets to be requeued. Instead the queue should be stopped after
a packet is added to the TX ring when there may not be enough room for an
additional one. Also, when TX ring entries are completed, we should only
wake the queue if we know there is room for another full maximally
fragmented packet.

Print a warning if there is insufficient space at the start of start_xmit,
since this should no longer happen.

Combined with increasing the default TX ring size (in a subsequent
patch), this appears to recover the TX performance lost by previous changes
to actually manage the TX ring state properly.

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
aba57a823d net: axienet: fix number of TX ring slots for available check
The check for the number of available TX ring slots was off by 1 since a
slot is required for the skb header as well as each fragment. This could
result in overwriting a TX ring slot that was still in use.

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
996defd7f8 net: axienet: Fix TX ring slot available check
The check for whether a TX ring slot was available was incorrect,
since a slot which had been loaded with transmit data but the device had
not started transmitting would be treated as available, potentially
causing non-transmitted slots to be overwritten. The control field in
the descriptor should be checked, rather than the status field (which may
only be updated when the device completes the entry).

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
70f5817ded net: axienet: limit minimum TX ring size
The driver will not work properly if the TX ring size is set to below
MAX_SKB_FRAGS + 1 since it needs to hold at least one full maximally
fragmented packet in the TX ring. Limit setting the ring size to below
this value.

Fixes: 8b09ca823f ("net: axienet: Make RX/TX ring sizes configurable")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
95978df6fa net: axienet: add missing memory barriers
This driver was missing some required memory barriers:

Use dma_rmb to ensure we see all updates to the descriptor after we see
that an entry has been completed.

Use wmb and rmb to avoid stale descriptor status between the TX path and
TX complete IRQ path.

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
04cc2da396 net: axienet: reset core on initialization prior to MDIO access
In some cases where the Xilinx Ethernet core was used in 1000Base-X or
SGMII modes, which use the internal PCS/PMA PHY, and the MGT
transceiver clock source for the PCS was not running at the time the
FPGA logic was loaded, the core would come up in a state where the
PCS could not be found on the MDIO bus. To fix this, the Ethernet core
(including the PCS) should be reset after enabling the clocks, prior to
attempting to access the PCS using of_mdio_find_device.

Fixes: 1a02556086 (net: axienet: Properly handle PCS/PMA PHY for 1000BaseX mode)
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
b400c2f4f4 net: axienet: Wait for PhyRstCmplt after core reset
When resetting the device, wait for the PhyRstCmplt bit to be set
in the interrupt status register before continuing initialization, to
ensure that the core is actually ready. When using an external PHY, this
also ensures we do not start trying to access the PHY while it is still
in reset. The PHY reset is initiated by the core reset which is
triggered just above, but remains asserted for 5ms after the core is
reset according to the documentation.

The MgtRdy bit could also be waited for, but unfortunately when using
7-series devices, the bit does not appear to work as documented (it
seems to behave as some sort of link state indication and not just an
indication the transceiver is ready) so it can't really be relied on for
this purpose.

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Robert Hancock
2e5644b1ba net: axienet: increase reset timeout
The previous timeout of 1ms was too short to handle some cases where the
core is reset just after the input clocks were started, which will
be introduced in an upcoming patch. Increase the timeout to 50ms. Also
simplify the reset timeout checking to use read_poll_timeout.

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 11:29:14 +00:00
Ard Biesheuvel
9f80ccda53 ARM: 9180/1: Thumb2: align ALT_UP() sections in modules sufficiently
When building for Thumb2, the .alt.smp.init sections that are emitted by
the ALT_UP() patching code may not be 32-bit aligned, even though the
fixup_smp_on_up() routine expects that. This results in alignment faults
at module load time, which need to be fixed up by the fault handler.

So let's align those sections explicitly, and prevent this from occurring.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-01-19 11:10:54 +00:00
Ard Biesheuvel
15420269b0 ARM: 9179/1: uaccess: avoid alignment faults in copy_[from|to]_kernel_nofault
The helpers that are used to implement copy_from_kernel_nofault() and
copy_to_kernel_nofault() cast a void* to a pointer to a wider type,
which may result in alignment faults on ARM if the compiler decides to
use double-word or multiple-word load/store instructions.

Only configurations that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
are affected, given that commit 2423de2e6f ("ARM: 9115/1: mm/maccess:
fix unaligned copy_{from,to}_kernel_nofault") ensures that dst and src
are sufficiently aligned otherwise.

So use the unaligned accessors for accessing dst and src in cases where
they may be misaligned.

Cc: <stable@vger.kernel.org> # depends on 2423de2e6f
Fixes: 2df4c9a741 ("ARM: 9112/1: uaccess: add __{get,put}_kernel_nofault")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-01-19 11:10:53 +00:00
sparkhuang
8b59b0a53c ARM: 9170/1: fix panic when kasan and kprobe are enabled
arm32 uses software to simulate the instruction replaced
by kprobe. some instructions may be simulated by constructing
assembly functions. therefore, before executing instruction
simulation, it is necessary to construct assembly function
execution environment in C language through binding registers.
after kasan is enabled, the register binding relationship will
be destroyed, resulting in instruction simulation errors and
causing kernel panic.

the kprobe emulate instruction function is distributed in three
files: actions-common.c actions-arm.c actions-thumb.c, so disable
KASAN when compiling these files.

for example, use kprobe insert on cap_capable+20 after kasan
enabled, the cap_capable assembly code is as follows:
<cap_capable>:
e92d47f0	push	{r4, r5, r6, r7, r8, r9, sl, lr}
e1a05000	mov	r5, r0
e280006c	add	r0, r0, #108    ; 0x6c
e1a04001	mov	r4, r1
e1a06002	mov	r6, r2
e59fa090	ldr	sl, [pc, #144]  ;
ebfc7bf8	bl	c03aa4b4 <__asan_load4>
e595706c	ldr	r7, [r5, #108]  ; 0x6c
e2859014	add	r9, r5, #20
......
The emulate_ldr assembly code after enabling kasan is as follows:
c06f1384 <emulate_ldr>:
e92d47f0	push	{r4, r5, r6, r7, r8, r9, sl, lr}
e282803c	add	r8, r2, #60     ; 0x3c
e1a05000	mov	r5, r0
e7e37855	ubfx	r7, r5, #16, #4
e1a00008	mov	r0, r8
e1a09001	mov	r9, r1
e1a04002	mov	r4, r2
ebf35462	bl	c03c6530 <__asan_load4>
e357000f	cmp	r7, #15
e7e36655	ubfx	r6, r5, #12, #4
e205a00f	and	sl, r5, #15
0a000001	beq	c06f13bc <emulate_ldr+0x38>
e0840107	add	r0, r4, r7, lsl #2
ebf3545c	bl	c03c6530 <__asan_load4>
e084010a	add	r0, r4, sl, lsl #2
ebf3545a	bl	c03c6530 <__asan_load4>
e2890010	add	r0, r9, #16
ebf35458	bl	c03c6530 <__asan_load4>
e5990010	ldr	r0, [r9, #16]
e12fff30	blx	r0
e356000f	cm	r6, #15
1a000014	bne	c06f1430 <emulate_ldr+0xac>
e1a06000	mov	r6, r0
e2840040	add	r0, r4, #64     ; 0x40
......

when running in emulate_ldr to simulate the ldr instruction, panic
occurred, and the log is as follows:
Unable to handle kernel NULL pointer dereference at virtual address
00000090
pgd = ecb46400
[00000090] *pgd=2e0fa003, *pmd=00000000
Internal error: Oops: 206 [#1] SMP ARM
PC is at cap_capable+0x14/0xb0
LR is at emulate_ldr+0x50/0xc0
psr: 600d0293 sp : ecd63af8  ip : 00000004  fp : c0a7c30c
r10: 00000000  r9 : c30897f4  r8 : ecd63cd4
r7 : 0000000f  r6 : 0000000a  r5 : e59fa090  r4 : ecd63c98
r3 : c06ae294  r2 : 00000000  r1 : b7611300  r0 : bf4ec008
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 32c5387d  Table: 2d546400  DAC: 55555555
Process bash (pid: 1643, stack limit = 0xecd60190)
(cap_capable) from (kprobe_handler+0x218/0x340)
(kprobe_handler) from (kprobe_trap_handler+0x24/0x48)
(kprobe_trap_handler) from (do_undefinstr+0x13c/0x364)
(do_undefinstr) from (__und_svc_finish+0x0/0x30)
(__und_svc_finish) from (cap_capable+0x18/0xb0)
(cap_capable) from (cap_vm_enough_memory+0x38/0x48)
(cap_vm_enough_memory) from
(security_vm_enough_memory_mm+0x48/0x6c)
(security_vm_enough_memory_mm) from
(copy_process.constprop.5+0x16b4/0x25c8)
(copy_process.constprop.5) from (_do_fork+0xe8/0x55c)
(_do_fork) from (SyS_clone+0x1c/0x24)
(SyS_clone) from (__sys_trace_return+0x0/0x10)
Code: 0050a0e1 6c0080e2 0140a0e1 0260a0e1 (f801f0e7)

Fixes: 35aa1df432 ("ARM kprobes: instruction single-stepping support")
Fixes: 421015713b ("ARM: 9017/2: Enable KASan for ARM")
Signed-off-by: huangshaobo <huangshaobo6@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-01-19 11:10:53 +00:00
Linus Torvalds
1d1df41c5a Merge tag 'f2fs-for-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've tried to address some performance issues in
  f2fs_checkpoint and direct IO flows. Also, there was a work to enhance
  the page cache management used for compression. Other than them, we've
  done typical work including sysfs, code clean-ups, tracepoint, sanity
  check, in addition to bug fixes on corner cases.

  Enhancements:
   - use iomap for direct IO
   - try to avoid lock contention to improve f2fs_ckpt speed
   - avoid unnecessary memory allocation in compression flow
   - POSIX_FADV_DONTNEED drops the page cache containing compression
     pages
   - add some sysfs entries (gc_urgent_high_remaining, pending_discard)

  Bug fixes:
   - try not to expose unwritten blocks to user by DIO (this was added
     to avoid merge conflict; another patch is coming to address other
     missing case)
   - relax minor error condition for file pinning feature used in
     Android OTA
   - fix potential deadlock case in compression flow
   - should not truncate any block on pinned file

  In addition, we've done some code clean-ups and tracepoint/sanity
  check improvement"

* tag 'f2fs-for-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits)
  f2fs: do not allow partial truncation on pinned file
  f2fs: remove redunant invalidate compress pages
  f2fs: Simplify bool conversion
  f2fs: don't drop compressed page cache in .{invalidate,release}page
  f2fs: fix to reserve space for IO align feature
  f2fs: fix to check available space of CP area correctly in update_ckpt_flags()
  f2fs: support fault injection to f2fs_trylock_op()
  f2fs: clean up __find_inline_xattr() with __find_xattr()
  f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr()
  f2fs: do not bother checkpoint by f2fs_get_node_info
  f2fs: avoid down_write on nat_tree_lock during checkpoint
  f2fs: compress: fix potential deadlock of compress file
  f2fs: avoid EINVAL by SBI_NEED_FSCK when pinning a file
  f2fs: add gc_urgent_high_remaining sysfs node
  f2fs: fix to do sanity check in is_alive()
  f2fs: fix to avoid panic in is_alive() if metadata is inconsistent
  f2fs: fix to do sanity check on inode type during garbage collection
  f2fs: avoid duplicate call of mark_inode_dirty
  f2fs: show number of pending discard commands
  f2fs: support POSIX_FADV_DONTNEED drop compressed page cache
  ...
2022-01-19 11:50:20 +02:00
Linus Torvalds
e9f5cbc0c8 Merge tag 'trace-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "tracing/scripts: Possible uninitialized variable

  The 0day bot discovered a possible uninitialized path in the scripts
  that sort the mcount sections at build time. Just needed to initialize
  that variable"

* tag 'trace-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  script/sorttable: Fix some initialization problems
2022-01-19 11:44:34 +02:00
Linus Torvalds
f1b744f65e Merge tag 'riscv-for-linus-5.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:

 - Support for the DA9063 as used on the HiFive Unmatched.

 - Support for relative extables, which puts us in line with other
   architectures and save some space in vmlinux.

 - A handful of kexec fixes/improvements, including the ability to run
   crash kernels from PCI-addressable memory on the HiFive Unmatched.

 - Support for the SBI SRST extension, which allows systems that do not
   have an explicit driver in Linux to reboot.

 - A handful of fixes and cleanups, including to the defconfigs and
   device trees.

* tag 'riscv-for-linus-5.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
  RISC-V: Use SBI SRST extension when available
  riscv: mm: fix wrong phys_ram_base value for RV64
  RISC-V: Use common riscv_cpuid_to_hartid_mask() for both SMP=y and SMP=n
  riscv: head: remove useless __PAGE_ALIGNED_BSS and .balign
  riscv: errata: alternative: mark vendor_patch_func __initdata
  riscv: head: make secondary_start_common() static
  riscv: remove cpu_stop()
  riscv: try to allocate crashkern region from 32bit addressible memory
  riscv: use hart id instead of cpu id on machine_kexec
  riscv: Don't use va_pa_offset on kdump
  riscv: dts: sifive: fu540-c000: Fix PLIC node
  riscv: dts: sifive: fu540-c000: Drop bogus soc node compatible values
  riscv: dts: sifive: Group tuples in register properties
  riscv: dts: sifive: Group tuples in interrupt properties
  riscv: dts: microchip: mpfs: Group tuples in interrupt properties
  riscv: dts: microchip: mpfs: Fix clock controller node
  riscv: dts: microchip: mpfs: Fix reference clock node
  riscv: dts: microchip: mpfs: Fix PLIC node
  riscv: dts: microchip: mpfs: Drop empty chosen node
  riscv: dts: canaan: Group tuples in interrupt properties
  ...
2022-01-19 11:38:21 +02:00
Linus Torvalds
fd6f57bfda Merge tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - Add new kconfig target 'make mod2noconfig', which will be useful to
   speed up the build and test iteration.

 - Raise the minimum supported version of LLVM to 11.0.0

 - Refactor certs/Makefile

 - Change the format of include/config/auto.conf to stop double-quoting
   string type CONFIG options.

 - Fix ARCH=sh builds in dash

 - Separate compression macros for general purposes (cmd_bzip2 etc.) and
   the ones for decompressors (cmd_bzip2_with_size etc.)

 - Misc Makefile cleanups

* tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
  kbuild: add cmd_file_size
  arch: decompressor: remove useless vmlinux.bin.all-y
  kbuild: rename cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}
  kbuild: drop $(size_append) from cmd_zstd
  sh: rename suffix-y to suffix_y
  doc: kbuild: fix default in `imply` table
  microblaze: use built-in function to get CPU_{MAJOR,MINOR,REV}
  certs: move scripts/extract-cert to certs/
  kbuild: do not quote string values in include/config/auto.conf
  kbuild: do not include include/config/auto.conf from shell scripts
  certs: simplify $(srctree)/ handling and remove config_filename macro
  kbuild: stop using config_filename in scripts/Makefile.modsign
  certs: remove misleading comments about GCC PR
  certs: refactor file cleaning
  certs: remove unneeded -I$(srctree) option for system_certificates.o
  certs: unify duplicated cmd_extract_certs and improve the log
  certs: use $< and $@ to simplify the key generation rule
  kbuild: remove headers_check stub
  kbuild: move headers_check.pl to usr/include/
  certs: use if_changed to re-generate the key when the key type is changed
  ...
2022-01-19 11:15:19 +02:00
Padmanabha Srinivasaiah
0a3d12ab50 drm/vc4: Fix deadlock on DSI device attach error
DSI device attach to DSI host will be done with host device's lock
held.

Un-registering host in "device attach" error path (ex: probe retry)
will result in deadlock with below call trace and non operational
DSI display.

Startup Call trace:
[   35.043036]  rt_mutex_slowlock.constprop.21+0x184/0x1b8
[   35.043048]  mutex_lock_nested+0x7c/0xc8
[   35.043060]  device_del+0x4c/0x3e8
[   35.043075]  device_unregister+0x20/0x40
[   35.043082]  mipi_dsi_remove_device_fn+0x18/0x28
[   35.043093]  device_for_each_child+0x68/0xb0
[   35.043105]  mipi_dsi_host_unregister+0x40/0x90
[   35.043115]  vc4_dsi_host_attach+0xf0/0x120 [vc4]
[   35.043199]  mipi_dsi_attach+0x30/0x48
[   35.043209]  tc358762_probe+0x128/0x164 [tc358762]
[   35.043225]  mipi_dsi_drv_probe+0x28/0x38
[   35.043234]  really_probe+0xc0/0x318
[   35.043244]  __driver_probe_device+0x80/0xe8
[   35.043254]  driver_probe_device+0xb8/0x118
[   35.043263]  __device_attach_driver+0x98/0xe8
[   35.043273]  bus_for_each_drv+0x84/0xd8
[   35.043281]  __device_attach+0xf0/0x150
[   35.043290]  device_initial_probe+0x1c/0x28
[   35.043300]  bus_probe_device+0xa4/0xb0
[   35.043308]  deferred_probe_work_func+0xa0/0xe0
[   35.043318]  process_one_work+0x254/0x700
[   35.043330]  worker_thread+0x4c/0x448
[   35.043339]  kthread+0x19c/0x1a8
[   35.043348]  ret_from_fork+0x10/0x20

Shutdown Call trace:
[  365.565417] Call trace:
[  365.565423]  __switch_to+0x148/0x200
[  365.565452]  __schedule+0x340/0x9c8
[  365.565467]  schedule+0x48/0x110
[  365.565479]  schedule_timeout+0x3b0/0x448
[  365.565496]  wait_for_completion+0xac/0x138
[  365.565509]  __flush_work+0x218/0x4e0
[  365.565523]  flush_work+0x1c/0x28
[  365.565536]  wait_for_device_probe+0x68/0x158
[  365.565550]  device_shutdown+0x24/0x348
[  365.565561]  kernel_restart_prepare+0x40/0x50
[  365.565578]  kernel_restart+0x20/0x70
[  365.565591]  __do_sys_reboot+0x10c/0x220
[  365.565605]  __arm64_sys_reboot+0x2c/0x38
[  365.565619]  invoke_syscall+0x4c/0x110
[  365.565634]  el0_svc_common.constprop.3+0xfc/0x120
[  365.565648]  do_el0_svc+0x2c/0x90
[  365.565661]  el0_svc+0x4c/0xf0
[  365.565671]  el0t_64_sync_handler+0x90/0xb8
[  365.565682]  el0t_64_sync+0x180/0x184

Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220118005127.29015-1-treasure4paddy@gmail.com
2022-01-19 10:08:48 +01:00
Linus Torvalds
0ed9059756 Merge branch 'random-5.17-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:

 - Some Kconfig changes resulted in BIG_KEYS being unselectable, which
   Justin sent a patch to fix.

 - Geert pointed out that moving to BLAKE2s bloated vmlinux on little
   machines, like m68k, so we now compensate for this.

 - Numerous style and house cleaning fixes, meant to have a cleaner base
   for future changes.

* 'random-5.17-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: simplify arithmetic function flow in account()
  random: selectively clang-format where it makes sense
  random: access input_pool_data directly rather than through pointer
  random: cleanup fractional entropy shift constants
  random: prepend remaining pool constants with POOL_
  random: de-duplicate INPUT_POOL constants
  random: remove unused OUTPUT_POOL constants
  random: rather than entropy_store abstraction, use global
  random: remove unused extract_entropy() reserved argument
  random: remove incomplete last_data logic
  random: cleanup integer types
  random: cleanup poolinfo abstraction
  random: fix typo in comments
  lib/crypto: sha1: re-roll loops to reduce code size
  lib/crypto: blake2s: move hmac construction into wireguard
  lib/crypto: add prompts back to crypto libraries
2022-01-19 10:39:11 +02:00
Linus Torvalds
39b419eaf0 Merge tag 'hwlock-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull hwspinlock updates from Bjorn Andersson:
 "This contains a change to the stm32 hwspinlock driver to ensure that
  the hardware is operational even without CONFIG_PM"

* tag 'hwlock-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  hwspinlock: stm32: enable clock at probe
2022-01-19 10:29:20 +02:00
Kalle Valo
a1222ca068 MAINTAINERS: remove extra wireless section
There's an unneeded and almost empty wireless section in MAINTAINERS, seems to
be leftovers from commit 0e324cf640 ("MAINTAINERS: changes for wireless"). I
don't see any need for that so let's remove it.

Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220117181958.3509-2-kvalo@kernel.org
2022-01-19 10:05:07 +02:00
Kalle Valo
51b667a32d MAINTAINERS: add common wireless and wireless-next trees
For easier maintenance we have decided to create common wireless and
wireless-next trees for all wireless patches. Old mac80211 and wireless-drivers
trees will not be used anymore.

While at it, add a wiki link to wireless drivers section and a patchwork link
to 802.11, mac80211 and rfkill sections. Also use https in patchwork links.

Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220117181958.3509-1-kvalo@kernel.org
2022-01-19 10:05:07 +02:00
Jakub Kicinski
99845220d3 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2022-01-19

We've added 12 non-merge commits during the last 8 day(s) which contain
a total of 12 files changed, 262 insertions(+), 64 deletions(-).

The main changes are:

1) Various verifier fixes mainly around register offset handling when
   passed to helper functions, from Daniel Borkmann.

2) Fix XDP BPF link handling to assert program type,
   from Toke Høiland-Jørgensen.

3) Fix regression in mount parameter handling for BPF fs,
   from Yafang Shao.

4) Fix incorrect integer literal when marking scratched stack slots
   in verifier, from Christy Lee.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, selftests: Add ringbuf memory type confusion test
  bpf, selftests: Add various ringbuf tests with invalid offset
  bpf: Fix ringbuf memory type confusion when passing to helpers
  bpf: Fix out of bounds access for ringbuf helpers
  bpf: Generally fix helper register offset check
  bpf: Mark PTR_TO_FUNC register initially with zero offset
  bpf: Generalize check_ctx_reg for reuse with other types
  bpf: Fix incorrect integer literal used for marking scratched stack.
  bpf/selftests: Add check for updating XDP bpf_link with wrong program type
  bpf/selftests: convert xdp_link test to ASSERT_* macros
  xdp: check prog type before updating BPF link
  bpf: Fix mount source show for bpffs
====================

Link: https://lore.kernel.org/r/20220119011825.9082-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-18 19:28:29 -08:00
Jens Axboe
ccbf726171 io_uring: perform poll removal even if async work removal is successful
An active work can have poll armed, hence it's not enough to just do
the async work removal and return the value if it's different from "not
found". Rather than make poll removal special, just fall through to do
the remaining type lookups and removals.

Reported-by: Florian Fischer <florian.fl.fischer@fau.de>
Link: https://lore.kernel.org/io-uring/20220118151337.fac6cthvbnu7icoc@pasture/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-18 19:28:43 -07:00
Jens Axboe
361aee450c io-wq: add intermediate work step between pending list and active work
We have a gap where a worker removes an item from the work list and to
when it gets added as the workers active work. In this state, the work
item cannot be found by cancelations. This is a small window, but it does
exist.

Add a temporary pointer to a work item that isn't on the pending work
list anymore, but also not the active work. This is needed as we need
to drop the wqe lock in between grabbing the work item and marking it
as active, to ensure that signal based cancelations are properly
ordered.

Reported-by: Florian Fischer <florian.fl.fischer@fau.de>
Link: https://lore.kernel.org/io-uring/20220118151337.fac6cthvbnu7icoc@pasture/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-18 19:28:12 -07:00
Jens Axboe
efdf518459 io-wq: perform both unstarted and started work cancelations in one go
Rather than split these into two separate lookups and matches, combine
them into one loop. This will become important when we can guarantee
that we don't have a window where a pending work item isn't discoverable
in either state.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-18 19:27:59 -07:00
Jens Axboe
36e4c58bf0 io-wq: invoke work cancelation with wqe->lock held
io_wqe_cancel_pending_work() grabs it internally, grab it upfront
instead. For the running work cancelation, grab the lock around it as
well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-18 19:27:59 -07:00
Jens Axboe
081b582046 io-wq: make io_worker lock a raw spinlock
In preparation to nesting it under the wqe lock (which is raw due to
being acquired from the scheduler side), change the io_worker lock from
a normal spinlock to a raw spinlock.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-18 19:27:59 -07:00
Jens Axboe
ea6e7ceeda io-wq: remove useless 'work' argument to __io_worker_busy()
We don't use 'work' anymore in the busy logic, remove the dead argument.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-18 19:10:11 -07:00
Daniel Borkmann
37c8d4807d bpf, selftests: Add ringbuf memory type confusion test
Add two tests, one which asserts that ring buffer memory can be passed to
other helpers for populating its entry area, and another one where verifier
rejects different type of memory passed to bpf_ringbuf_submit().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19 01:27:03 +01:00
Daniel Borkmann
722e4db3ae bpf, selftests: Add various ringbuf tests with invalid offset
Assert that the verifier is rejecting invalid offsets on the ringbuf entries:

  # ./test_verifier | grep ring
  #947/u ringbuf: invalid reservation offset 1 OK
  #947/p ringbuf: invalid reservation offset 1 OK
  #948/u ringbuf: invalid reservation offset 2 OK
  #948/p ringbuf: invalid reservation offset 2 OK

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19 01:21:49 +01:00
Daniel Borkmann
a672b2e36a bpf: Fix ringbuf memory type confusion when passing to helpers
The bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument, and thus both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.

While the non-NULL memory from bpf_ringbuf_reserve() can be passed to other
helpers, the two sinks (bpf_ringbuf_submit(), bpf_ringbuf_discard()) right now
only enforce a register type of PTR_TO_MEM.

This can lead to potential type confusion since it would allow other PTR_TO_MEM
memory to be passed into the two sinks which did not come from bpf_ringbuf_reserve().

Add a new MEM_ALLOC composable type attribute for PTR_TO_MEM, and enforce that:

 - bpf_ringbuf_reserve() returns NULL or PTR_TO_MEM | MEM_ALLOC
 - bpf_ringbuf_submit() and bpf_ringbuf_discard() only take PTR_TO_MEM | MEM_ALLOC
   but not plain PTR_TO_MEM arguments via ARG_PTR_TO_ALLOC_MEM
 - however, other helpers might treat PTR_TO_MEM | MEM_ALLOC as plain PTR_TO_MEM
   to populate the memory area when they use ARG_PTR_TO_{UNINIT_,}MEM in their
   func proto description

Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19 01:21:46 +01:00
Daniel Borkmann
64620e0a1e bpf: Fix out of bounds access for ringbuf helpers
Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument. They both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.

Meaning, after a NULL check in the code, the verifier will promote the register
type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
latter could have an offset.

The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
or bpf_ringbuf_discard() but with the original offset given it will then read
out the struct bpf_ringbuf_hdr mapping.

The verifier missed to enforce a zero offset, so that out of bounds access
can be triggered which could be used to escalate privileges if unprivileged
BPF was enabled (disabled by default in kernel).

Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: <tr3e.wang@gmail.com> (SecCoder Security Lab)
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19 01:21:39 +01:00
Daniel Borkmann
6788ab2350 bpf: Generally fix helper register offset check
Right now the assertion on check_ptr_off_reg() is only enforced for register
types PTR_TO_CTX (and open coded also for PTR_TO_BTF_ID), however, this is
insufficient since many other PTR_TO_* register types such as PTR_TO_FUNC do
not handle/expect register offsets when passed to helper functions.

Given this can slip-through easily when adding new types, make this an explicit
allow-list and reject all other current and future types by default if this is
encountered.

Also, extend check_ptr_off_reg() to handle PTR_TO_BTF_ID as well instead of
duplicating it. For PTR_TO_BTF_ID, reg->off is used for BTF to match expected
BTF ids if struct offset is used. This part still needs to be allowed, but the
dynamic off from the tnum must be rejected.

Fixes: 69c087ba62 ("bpf: Add bpf_for_each_map_elem() helper")
Fixes: eaa6bcb71e ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19 01:21:34 +01:00
Daniel Borkmann
d400a6cf1c bpf: Mark PTR_TO_FUNC register initially with zero offset
Similar as with other pointer types where we use ldimm64, clear the register
content to zero first, and then populate the PTR_TO_FUNC type and subprogno
number. Currently this is not done, and leads to reuse of stale register
tracking data.

Given for special ldimm64 cases we always clear the register offset, make it
common for all cases, so it won't be forgotten in future.

Fixes: 69c087ba62 ("bpf: Add bpf_for_each_map_elem() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19 01:21:29 +01:00
Daniel Borkmann
be80a1d3f9 bpf: Generalize check_ctx_reg for reuse with other types
Generalize the check_ctx_reg() helper function into a more generic named one
so that it can be reused for other register types as well to check whether
their offset is non-zero. No functional change.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19 01:21:24 +01:00
Christian König
4722f46389 drm/radeon: fix error handling in radeon_driver_open_kms
The return value was never initialized so the cleanup code executed when
it isn't even necessary.

Just add proper error handling.

Fixes: ab50cb9df8 ("drm/radeon/radeon_kms: Fix a NULL pointer dereference in radeon_driver_open_kms()")
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18 18:00:58 -05:00
Jingwen Chen
9a458402fb drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV
[Why]
This fixes 892deb4826 ("drm/amdgpu: Separate vf2pf work item init from virt data exchange").
we should read pf2vf data based at mman.fw_vram_usage_va after gmc
sw_init. commit 892deb4826 breaks this logic.

[How]
calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to
set the right base in the right sequence.

v2:
call amdgpu_virt_init_data_exchange after gmc sw_init to make data
exchange workqueue run

v3:
clean up the code logic

v4:
add some comment and make the code more readable

Fixes: 892deb4826 ("drm/amdgpu: Separate vf2pf work item init from virt data exchange")
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18 18:00:58 -05:00
Guchun Chen
520d9cd267 drm/amdgpu: apply vcn harvest quirk
This is a following patch to apply the workaround only on
those boards with a bad harvest table in ip discovery.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18 18:00:58 -05:00
Namjae Jeon
ac090d9c90 ksmbd: fix guest connection failure with nautilus
MS-SMB2 describe session sign like the following.
Session.SigningRequired MUST be set to TRUE under the following conditions:
 - If the SMB2_NEGOTIATE_SIGNING_REQUIRED bit is set in the SecurityMode
   field of the client request.
 - If the SMB2_SESSION_FLAG_IS_GUEST bit is not set in the SessionFlags
   field and Session.IsAnonymous is FALSE and either Connection.ShouldSign
   or global RequireMessageSigning is TRUE.

When trying guest account connection using nautilus, The login failure
happened on session setup. ksmbd does not allow this connection
when the user is a guest and the connection sign is set. Just do not set
session sign instead of error response as described in the specification.
And this change improves the guest connection in Nautilus.

Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-18 16:53:20 -06:00
Dan Carpenter
b207602fb0 ksmbd: uninitialized variable in create_socket()
The "ksmbd_socket" variable is not initialized on this error path.

Cc: stable@vger.kernel.org
Fixes: 0626e6641f ("cifsd: add server handler for central processing and tranport layers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-18 16:53:14 -06:00
Hyunchul Lee
2fd5dcb1c8 ksmbd: smbd: fix missing client's memory region invalidation
if the Channel of a SMB2 WRITE request is
SMB2_CHANNEL_RDMA_V1_INVALIDTE, a client
does not invalidate its memory regions but
ksmbd must do it by sending a SMB2 WRITE response
with IB_WR_SEND_WITH_INV.

But if errors occur while processing a SMB2
READ/WRITE request, ksmbd sends a response
with IB_WR_SEND. So a client could use memory
regions already in use.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-18 16:53:08 -06:00
Steve French
e4e2787bef smb3: add new defines from protocol specification
In the October updates to MS-SMB2 two additional FSCTLs
were described.  Add the missing defines for these,
as well as fix a typo in an earlier define.

Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-18 16:50:47 -06:00
J. Bruce Fields
6e7f90d163 lockd: fix server crash on reboot of client holding lock
I thought I was iterating over the array when actually the iteration is
over the values contained in the array?

Ugh, keep it simple.

Symptoms were a null deference in vfs_lock_file() when an NFSv3 client
that previously held a lock came back up and sent a notify.

Reported-by: Jonathan Woithe <jwoithe@just42.net>
Fixes: 7f024fcd5c ("Keep read and write fds with each nlm_file")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-18 17:07:48 -05:00
Lucas De Marchi
9c494ca4d3 x86/gpu: Reserve stolen memory for first integrated Intel GPU
"Stolen memory" is memory set aside for use by an Intel integrated GPU.
The intel_graphics_quirks() early quirk reserves this memory when it is
called for a GPU that appears in the intel_early_ids[] table of integrated
GPUs.

Previously intel_graphics_quirks() was marked as QFLAG_APPLY_ONCE, so it
was called only for the first Intel GPU found.  If a discrete GPU happened
to be enumerated first, intel_graphics_quirks() was called for it but not
for any integrated GPU found later.  Therefore, stolen memory for such an
integrated GPU was never reserved.

For example, this problem occurs in this Alderlake-P (integrated) + DG2
(discrete) topology where the DG2 is found first, but stolen memory is
associated with the integrated GPU:

  - 00:01.0 Bridge
    `- 03:00.0 DG2 discrete GPU
  - 00:02.0 Integrated GPU (with stolen memory)

Remove the QFLAG_APPLY_ONCE flag and call intel_graphics_quirks() for every
Intel GPU.  Reserve stolen memory for the first GPU that appears in
intel_early_ids[].

[bhelgaas: commit log, add code comment, squash in
https://lore.kernel.org/r/20220118190558.2ququ4vdfjuahicm@ldmartin-desk2]
Link: https://lore.kernel.org/r/20220114002843.2083382-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2022-01-18 14:19:06 -06:00
Darrick J. Wong
a8e422af69 xfs: remove unused xfs_ioctl32.h declarations
Remove these unused ia32 compat declarations; all the bits involved have
either been withdrawn or hoisted to the VFS.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2022-01-18 10:18:36 -08:00
Yinan Liu
35140d399d script/sorttable: Fix some initialization problems
elf_mcount_loc and mcount_sort_thread definitions are not
initialized immediately within the function, which can cause
the judgment logic to use uninitialized values when the
initialization logic of subsequent code fails.

Link: https://lkml.kernel.org/r/20211212113358.34208-2-yinan@linux.alibaba.com
Link: https://lkml.kernel.org/r/20220118065241.42364-1-yinan@linux.alibaba.com

Fixes: 72b3942a17 ("scripts: ftrace - move the sort-processing in ftrace_init")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Yinan Liu <yinan@linux.alibaba.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2022-01-18 10:17:18 -05:00
Eric Dumazet
2836615aa2 netns: add schedule point in ops_exit_list()
When under stress, cleanup_net() can have to dismantle
netns in big numbers. ops_exit_list() currently calls
many helpers [1] that have no schedule point, and we can
end up with soft lockups, particularly on hosts
with many cpus.

Even for moderate amount of netns processed by cleanup_net()
this patch avoids latency spikes.

[1] Some of these helpers like fib_sync_up() and fib_sync_down_dev()
are very slow because net/ipv4/fib_semantics.c uses host-wide hash tables,
and ifindex is used as the only input of two hash functions.
    ifindexes tend to be the same for all netns (lo.ifindex==1 per instance)
    This will be fixed in a separate patch.

Fixes: 72ad937abd ("net: Add support for batching network namespace cleanups")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-18 13:40:58 +00:00
Christoph Hellwig
fd9f4e62a3 block: assign bi_bdev for cloned bios in blk_rq_prep_clone
bio_clone_fast() sets the cloned bio to have the same ->bi_bdev as the
source bio. This means that when request-based dm called setup_clone(),
the cloned bio had its ->bi_bdev pointing to the dm device. After Commit
0b6e522cdc ("blk-mq: use ->bi_bdev for I/O accounting")
__blk_account_io_start() started using the request's ->bio->bi_bdev for
I/O accounting, if it was set. This caused IO going to the underlying
devices to use the dm device for their I/O accounting.

Set up the proper ->bi_bdev in blk_rq_prep_clone based on the whole
device bdev for the queue the request is cloned onto.

Fixes: 0b6e522cdc ("blk-mq: use ->bi_bdev for I/O accounting")
Reported-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[hch: the commit message is mostly from a different patch from Benjamin]
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
Link: https://lore.kernel.org/r/20220118070444.1241739-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-18 06:34:05 -07:00