Commit Graph

801180 Commits

Author SHA1 Message Date
Stefan Chulski
86d05ee593 net: mvpp2: TCAM entry enable should be written after SRAM data
[ Upstream commit 43f4a20a12 ]

Last TCAM data contains TCAM enable bit.
It should be written after SRAM data before entry enabled.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1612172139-28343-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-10 09:21:06 +01:00
Xie He
8b8e0d0b71 net: lapb: Copy the skb before sending a packet
[ Upstream commit 88c7a9fd9b ]

When sending a packet, we will prepend it with an LAPB header.
This modifies the shared parts of a cloned skb, so we should copy the
skb rather than just clone it, before we prepend the header.

In "Documentation/networking/driver.rst" (the 2nd point), it states
that drivers shouldn't modify the shared parts of a cloned skb when
transmitting.

The "dev_queue_xmit_nit" function in "net/core/dev.c", which is called
when an skb is being sent, clones the skb and sents the clone to
AF_PACKET sockets. Because the LAPB drivers first remove a 1-byte
pseudo-header before handing over the skb to us, if we don't copy the
skb before prepending the LAPB header, the first byte of the packets
received on AF_PACKET sockets can be corrupted.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Link: https://lore.kernel.org/r/20210201055706.415842-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-10 09:21:06 +01:00
Zyta Szpak
37605bb23b arm64: dts: ls1046a: fix dcfg address range
[ Upstream commit aa880c6f3e ]

Dcfg was overlapping with clockgen address space which resulted
in failure in memory allocation for dcfg. According regs description
dcfg size should not be bigger than 4KB.

Signed-off-by: Zyta Szpak <zr@semihalf.com>
Fixes: 8126d88162 ("arm64: dts: add QorIQ LS1046A SoC support")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-10 09:21:06 +01:00
David Howells
d154251980 rxrpc: Fix deadlock around release of dst cached on udp tunnel
[ Upstream commit 5399d52233 ]

AF_RXRPC sockets use UDP ports in encap mode.  This causes socket and dst
from an incoming packet to get stolen and attached to the UDP socket from
whence it is leaked when that socket is closed.

When a network namespace is removed, the wait for dst records to be cleaned
up happens before the cleanup of the rxrpc and UDP socket, meaning that the
wait never finishes.

Fix this by moving the rxrpc (and, by dependence, the afs) private
per-network namespace registrations to the device group rather than subsys
group.  This allows cached rxrpc local endpoints to be cleared and their
UDP sockets closed before we try waiting for the dst records.

The symptom is that lines looking like the following:

	unregister_netdevice: waiting for lo to become free

get emitted at regular intervals after running something like the
referenced syzbot test.

Thanks to Vadim for tracking this down and work out the fix.

Reported-by: syzbot+df400f2f24a1677cd7e0@syzkaller.appspotmail.com
Reported-by: Vadim Fedorenko <vfedorenko@novek.ru>
Fixes: 5271953cad ("rxrpc: Use the UDP encap_rcv hook")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vadim Fedorenko <vfedorenko@novek.ru>
Link: https://lore.kernel.org/r/161196443016.3868642.5577440140646403533.stgit@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-10 09:21:06 +01:00
Alexey Dobriyan
e923e82dfa Input: i8042 - unbreak Pegatron C15B
[ Upstream commit a3a9060eca ]

g++ reports

	drivers/input/serio/i8042-x86ia64io.h:225:3: error: ‘.matches’ designator used multiple times in the same initializer list

C99 semantics is that last duplicated initialiser wins,
so DMI entry gets overwritten.

Fixes: a48491c65b ("Input: i8042 - add ByteSpeed touchpad to noloop table")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Link: https://lore.kernel.org/r/20201228072335.GA27766@localhost.localdomain
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-10 09:21:06 +01:00
Arnd Bergmann
2a6e9df111 elfcore: fix building with clang
commit 6e7b64b9dd upstream.

kernel/elfcore.c only contains weak symbols, which triggers a bug with
clang in combination with recordmcount:

  Cannot find symbol for section 2: .text.
  kernel/elfcore.o: failed

Move the empty stubs into linux/elfcore.h as inline functions.  As only
two architectures use these, just use the architecture specific Kconfig
symbols to key off the declaration.

Link: https://lkml.kernel.org/r/20201204165742.3815221-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Barret Rhoden <brho@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-10 09:21:06 +01:00
Christoph Schemmel
ae9aff4a09 USB: serial: option: Adding support for Cinterion MV31
commit e478d6029d upstream.

Adding support for Cinterion device MV31 for enumeration with
PID 0x00B3 and 0x00B7.

usb-devices output for 0x00B3
T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  6 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b3 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00B3 USB Mobile Broadband
S:  SerialNumber=b3246eed
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=cdc_wdm
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

usb-devices output for 0x00B7
T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  5 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b7 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00B3 USB Mobile Broadband
S:  SerialNumber=b3246eed
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

Signed-off-by: Christoph Schemmel <christoph.schemmel@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-10 09:21:06 +01:00
Chenxin Jin
e2e3794367 USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
commit 43377df704 upstream.

Teraoka AD2000 uses the CP210x driver, but the chip VID/PID is
customized with 0988/0578. We need the driver to support the new
VID/PID.

Signed-off-by: Chenxin Jin <bg4akv@hotmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-10 09:21:05 +01:00
Pho Tran
f8609939e6 USB: serial: cp210x: add pid/vid for WSDA-200-USB
commit 3c4f6ecd93 upstream.

Information pid/vid of WSDA-200-USB, Lord corporation company:
vid: 199b
pid: ba30

Signed-off-by: Pho Tran <pho.tran@silabs.com>
[ johan: amend comment with product name ]
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-10 09:21:05 +01:00
Greg Kroah-Hartman
b6032ab67a Linux 4.19.174
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20210205140649.825180779@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07 14:48:38 +01:00
Peter Zijlstra
1746d1dcae workqueue: Restrict affinity change to rescuer
[ Upstream commit 640f17c824 ]

create_worker() will already set the right affinity using
kthread_bind_mask(), this means only the rescuer will need to change
it's affinity.

Howveer, while in cpu-hot-unplug a regular task is not allowed to run
on online&&!active as it would be pushed away quite agressively. We
need KTHREAD_IS_PER_CPU to survive in that environment.

Therefore set the affinity after getting that magic flag.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103506.826629830@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:38 +01:00
Peter Zijlstra
fbad32181a kthread: Extract KTHREAD_IS_PER_CPU
[ Upstream commit ac687e6e8c ]

There is a need to distinguish geniune per-cpu kthreads from kthreads
that happen to have a single CPU affinity.

Geniune per-cpu kthreads are kthreads that are CPU affine for
correctness, these will obviously have PF_KTHREAD set, but must also
have PF_NO_SETAFFINITY set, lest userspace modify their affinity and
ruins things.

However, these two things are not sufficient, PF_NO_SETAFFINITY is
also set on other tasks that have their affinities controlled through
other means, like for instance workqueues.

Therefore another bit is needed; it turns out kthread_create_per_cpu()
already has such a bit: KTHREAD_IS_PER_CPU, which is used to make
kthread_park()/kthread_unpark() work correctly.

Expose this flag and remove the implicit setting of it from
kthread_create_on_cpu(); the io_uring usage of it seems dubious at
best.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103506.557620262@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:38 +01:00
Josh Poimboeuf
52b4c58bac objtool: Don't fail on missing symbol table
[ Upstream commit 1d489151e9 ]

Thanks to a recent binutils change which doesn't generate unused
symbols, it's now possible for thunk_64.o be completely empty without
CONFIG_PREEMPTION: no text, no data, no symbols.

We could edit the Makefile to only build that file when
CONFIG_PREEMPTION is enabled, but that will likely create confusion
if/when the thunks end up getting used by some other code again.

Just ignore it and move on.

Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1254
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:38 +01:00
Michael Ellerman
1036500139 selftests/powerpc: Only test lwm/stmw on big endian
[ Upstream commit dd3a44c06f ]

Newer binutils (>= 2.36) refuse to assemble lmw/stmw when building in
little endian mode. That breaks compilation of our alignment handler
test:

  /tmp/cco4l14N.s: Assembler messages:
  /tmp/cco4l14N.s:1440: Error: `lmw' invalid when little-endian
  /tmp/cco4l14N.s:1814: Error: `stmw' invalid when little-endian
  make[2]: *** [../../lib.mk:139: /output/kselftest/powerpc/alignment/alignment_handler] Error 1

These tests do pass on little endian machines, as the kernel will
still emulate those instructions even when running little
endian (which is arguably a kernel bug).

But we don't really need to test that case, so ifdef those
instructions out to get the alignment test building again.

Reported-by: Libor Pechacek <lpechacek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Link: https://lore.kernel.org/r/20210119041800.3093047-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:38 +01:00
Brian King
a7a479e168 scsi: ibmvfc: Set default timeout to avoid crash during migration
[ Upstream commit 764907293e ]

While testing live partition mobility, we have observed occasional crashes
of the Linux partition. What we've seen is that during the live migration,
for specific configurations with large amounts of memory, slow network
links, and workloads that are changing memory a lot, the partition can end
up being suspended for 30 seconds or longer. This resulted in the following
scenario:

CPU 0                          CPU 1
-------------------------------  ----------------------------------
scsi_queue_rq                    migration_store
 -> blk_mq_start_request          -> rtas_ibm_suspend_me
  -> blk_add_timer                 -> on_each_cpu(rtas_percpu_suspend_me
              _______________________________________V
             |
             V
    -> IPI from CPU 1
     -> rtas_percpu_suspend_me
                                     -> __rtas_suspend_last_cpu

-- Linux partition suspended for > 30 seconds --
                                      -> for_each_online_cpu(cpu)
                                           plpar_hcall_norets(H_PROD
 -> scsi_dispatch_cmd
                                      -> scsi_times_out
                                       -> scsi_abort_command
                                        -> queue_delayed_work
  -> ibmvfc_queuecommand_lck
   -> ibmvfc_send_event
    -> ibmvfc_send_crq
     - returns H_CLOSED
   <- returns SCSI_MLQUEUE_HOST_BUSY
-> __blk_mq_requeue_request

                                      -> scmd_eh_abort_handler
                                       -> scsi_try_to_abort_cmd
                                         - returns SUCCESS
                                       -> scsi_queue_insert

Normally, the SCMD_STATE_COMPLETE bit would protect against the command
completion and the timeout, but that doesn't work here, since we don't
check that at all in the SCSI_MLQUEUE_HOST_BUSY path.

In this case we end up calling scsi_queue_insert on a request that has
already been queued, or possibly even freed, and we crash.

The patch below simply increases the default I/O timeout to avoid this race
condition. This is also the timeout value that nearly all IBM SAN storage
recommends setting as the default value.

Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:38 +01:00
Felix Fietkau
b70798906c mac80211: fix fast-rx encryption check
[ Upstream commit 622d3b4e39 ]

When using WEP, the default unicast key needs to be selected, instead of
the STA PTK.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20201218184718.93650-5-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:37 +01:00
Javed Hasan
d29232ebc8 scsi: libfc: Avoid invoking response handler twice if ep is already completed
[ Upstream commit b2b0f16fa6 ]

A race condition exists between the response handler getting called because
of exchange_mgr_reset() (which clears out all the active XIDs) and the
response we get via an interrupt.

Sequence of events:

	 rport ba0200: Port timeout, state PLOGI
	 rport ba0200: Port entered PLOGI state from PLOGI state
	 xid 1052: Exchange timer armed : 20000 msecs      xid timer armed here
	 rport ba0200: Received LOGO request while in state PLOGI
	 rport ba0200: Delete port
	 rport ba0200: work event 3
	 rport ba0200: lld callback ev 3
	 bnx2fc: rport_event_hdlr: event = 3, port_id = 0xba0200
	 bnx2fc: ba0200 - rport not created Yet!!
	 /* Here we reset any outstanding exchanges before
	 freeing rport using the exch_mgr_reset() */
	 xid 1052: Exchange timer canceled
	 /* Here we got two responses for one xid */
	 xid 1052: invoking resp(), esb 20000000 state 3
	 xid 1052: invoking resp(), esb 20000000 state 3
	 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2
	 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2

Skip the response if the exchange is already completed.

Link: https://lore.kernel.org/r/20201215194731.2326-1-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:37 +01:00
Martin Wilck
2b27b3f8e5 scsi: scsi_transport_srp: Don't block target in failfast state
[ Upstream commit 72eeb7c715 ]

If the port is in SRP_RPORT_FAIL_FAST state when srp_reconnect_rport() is
entered, a transition to SDEV_BLOCK would be illegal, and a kernel WARNING
would be triggered. Skip scsi_target_block() in this case.

Link: https://lore.kernel.org/r/20210111142541.21534-1-mwilck@suse.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:37 +01:00
Peter Zijlstra
3b9aacf27e x86: __always_inline __{rd,wr}msr()
[ Upstream commit 66a425011c ]

When the compiler choses to not inline the trivial MSR helpers:

  vmlinux.o: warning: objtool: __sev_es_nmi_complete()+0xce: call to __wrmsr.constprop.14() leaves .noinstr.text section

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/X/bf3gV+BW7kGEsB@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:37 +01:00
Arnold Gozum
96c5db6cf4 platform/x86: intel-vbtn: Support for tablet mode on Dell Inspiron 7352
[ Upstream commit fcd38f178b ]

The Dell Inspiron 7352 is a 2-in-1 model that has chassis-type "Notebook".
Add this model to the dmi_switches_allow_list.

Signed-off-by: Arnold Gozum <arngozum@gmail.com>
Link: https://lore.kernel.org/r/20201226205307.249659-1-arngozum@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:37 +01:00
Hans de Goede
ac518835f2 platform/x86: touchscreen_dmi: Add swap-x-y quirk for Goodix touchscreen on Estar Beauty HD tablet
[ Upstream commit 46c54cf270 ]

The Estar Beauty HD (MID 7316R) tablet uses a Goodix touchscreen,
with the X and Y coordinates swapped compared to the LCD panel.

Add a touchscreen_dmi entry for this adding a "touchscreen-swapped-x-y"
device-property to the i2c-client instantiated for this device before
the driver binds.

This is the first entry of a Goodix touchscreen to touchscreen_dmi.c,
so far DMI quirks for Goodix touchscreen's have been added directly
to drivers/input/touchscreen/goodix.c. Currently there are 3
DMI tables in goodix.c:
1. rotated_screen[] for devices where the touchscreen is rotated
   180 degrees vs the LCD panel
2. inverted_x_screen[] for devices where the X axis is inverted
3. nine_bytes_report[] for devices which use a non standard touch
   report size

Arguably only 3. really needs to be inside the driver and the other
2 cases are better handled through the generic touchscreen DMI quirk
mechanism from touchscreen_dmi.c, which allows adding device-props to
any i2c-client. Esp. now that goodix.c is using the generic
touchscreen_properties code.

Alternative to the approach from this patch we could add a 4th
dmi_system_id table for devices with swapped-x-y axis to goodix.c,
but that seems undesirable.

Cc: Bastien Nocera <hadess@hadess.net>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201224135158.10976-1-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:37 +01:00
Tony Lindgren
836e842b67 phy: cpcap-usb: Fix warning for missing regulator_disable
[ Upstream commit 764257d906 ]

On deferred probe, we will get the following splat:

cpcap-usb-phy cpcap-usb-phy.0: could not initialize VBUS or ID IIO: -517
WARNING: CPU: 0 PID: 21 at drivers/regulator/core.c:2123 regulator_put+0x68/0x78
...
(regulator_put) from [<c068ebf0>] (release_nodes+0x1b4/0x1fc)
(release_nodes) from [<c068a9a4>] (really_probe+0x104/0x4a0)
(really_probe) from [<c068b034>] (driver_probe_device+0x58/0xb4)

Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20201230102105.11826-1-tony@atomide.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 14:48:37 +01:00
Eric Dumazet
69874c3152 net_sched: gen_estimator: support large ewma log
commit dd5e073381 upstream

syzbot report reminded us that very big ewma_log were supported in the past,
even if they made litle sense.

tc qdisc replace dev xxx root est 1sec 131072sec ...

While fixing the bug, also add boundary checks for ewma_log, in line
with range supported by iproute2.

UBSAN: shift-out-of-bounds in net/core/gen_estimator.c:83:38
shift exponent -1 is negative
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 est_timer.cold+0xbb/0x12d net/core/gen_estimator.c:83
 call_timer_fn+0x1a5/0x710 kernel/time/timer.c:1417
 expire_timers kernel/time/timer.c:1462 [inline]
 __run_timers.part.0+0x692/0xa80 kernel/time/timer.c:1731
 __run_timers kernel/time/timer.c:1712 [inline]
 run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1744
 __do_softirq+0x2bc/0xa77 kernel/softirq.c:343
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:226 [inline]
 __irq_exit_rcu+0x17f/0x200 kernel/softirq.c:420
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:432
 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1096
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:628
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:169 [inline]
RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline]
RIP: 0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:516

Fixes: 1c0d32fde5 ("net_sched: gen_estimator: complete rewrite of rate estimators")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20210114181929.1717985-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[sudip: adjust context]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07 14:48:37 +01:00
Christian Brauner
5cbe06fe63 sysctl: handle overflow in proc_get_long
commit 7f2923c4f7 upstream.

proc_get_long() is a funny function.  It uses simple_strtoul() and for a
good reason.  proc_get_long() wants to always succeed the parse and
return the maybe incorrect value and the trailing characters to check
against a pre-defined list of acceptable trailing values.  However,
simple_strtoul() explicitly ignores overflows which can cause funny
things like the following to happen:

  echo 18446744073709551616 > /proc/sys/fs/file-max
  cat /proc/sys/fs/file-max
  0

(Which will cause your system to silently die behind your back.)

On the other hand kstrtoul() does do overflow detection but does not
return the trailing characters, and also fails the parse when anything
other than '\n' is a trailing character whereas proc_get_long() wants to
be more lenient.

Now, before adding another kstrtoul() function let's simply add a static
parse strtoul_lenient() which:
 - fails on overflow with -ERANGE
 - returns the trailing characters to the caller

The reason why we should fail on ERANGE is that we already do a partial
fail on overflow right now.  Namely, when the TMPBUFLEN is exceeded.  So
we already reject values such as 184467440737095516160 (21 chars) but
accept values such as 18446744073709551616 (20 chars) but both are
overflows.  So we should just always reject 64bit overflows and not
special-case this based on the number of chars.

Link: http://lkml.kernel.org/r/20190107222700.15954-2-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Joerg Vehlow <lkml@jv-coder.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07 14:48:37 +01:00
Rafael J. Wysocki
26ec09650f ACPI: thermal: Do not call acpi_thermal_check() directly
commit 81b704d3e4 upstream.

Calling acpi_thermal_check() from acpi_thermal_notify() directly
is problematic if _TMP triggers Notify () on the thermal zone for
which it has been evaluated (which happens on some systems), because
it causes a new acpi_thermal_notify() invocation to be queued up
every time and if that takes place too often, an indefinite number of
pending work items may accumulate in kacpi_notify_wq over time.

Besides, it is not really useful to queue up a new invocation of
acpi_thermal_check() if one of them is pending already.

For these reasons, rework acpi_thermal_notify() to queue up a thermal
check instead of calling acpi_thermal_check() directly and only allow
one thermal check to be pending at a time.  Moreover, only allow one
acpi_thermal_check_fn() instance at a time to run
thermal_zone_device_update() for one thermal zone and make it return
early if it sees other instances running for the same thermal zone.

While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(),
as it is only called from there after the other changes made here.

[This issue appears to have been exposed by commit 6d25be5782
 ("sched/core, workqueues: Distangle worker accounting from rq
 lock"), but it is unclear why it was not visible earlier.]

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877
Reported-by: Stephen Berman <stephen.berman@gmx.net>
Diagnosed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Stephen Berman <stephen.berman@gmx.net>
Cc: All applicable <stable@vger.kernel.org>
[bigeasy: Backported to v5.4.y]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07 14:48:37 +01:00
Lijun Pan
55288f9a4c ibmvnic: Ensure that CRQ entry read are correctly ordered
commit e41aec79e6 upstream.

Ensure that received Command-Response Queue (CRQ) entries are
properly read in order by the driver. dma_rmb barrier has
been added before accessing the CRQ descriptor to ensure
the entire descriptor is read before processing.

Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20210128013442.88319-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07 14:48:36 +01:00
Pan Bian
cf6bb58f80 net: dsa: bcm_sf2: put device node before return
commit cf3c46631e upstream.

Put the device node dn before return error code on failure path.

Fixes: 461cd1b03e ("net: dsa: bcm_sf2: Register our slave MDIO bus")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Link: https://lore.kernel.org/r/20210121123343.26330-1-bianpan2016@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07 14:48:36 +01:00
Greg Kroah-Hartman
a0b26b9f53 Linux 4.19.173
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210202132942.915040339@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:27 +01:00
Pengcheng Yang
dc7bc439a8 tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
commit 62d9f1a694 upstream.

Upon receiving a cumulative ACK that changes the congestion state from
Disorder to Open, the TLP timer is not set. If the sender is app-limited,
it can only wait for the RTO timer to expire and retransmit.

The reason for this is that the TLP timer is set before the congestion
state changes in tcp_ack(), so we delay the time point of calling
tcp_set_xmit_timer() until after tcp_fastretrans_alert() returns and
remove the FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer
is set.

This commit has two additional benefits:
1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
avoid spurious RTO caused by RTO timer early expires.
2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
timer is set.

Fixes: df92c8394e ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")
Link: https://lore.kernel.org/netdev/1611311242-6675-1-git-send-email-yangpc@wangsu.com
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/1611464834-23030-1-git-send-email-yangpc@wangsu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:27 +01:00
Ivan Vecera
ed130d6329 team: protect features update by RCU to avoid deadlock
commit f0947d0d21 upstream.

Function __team_compute_features() is protected by team->lock
mutex when it is called from team_compute_features() used when
features of an underlying device is changed. This causes
a deadlock when NETDEV_FEAT_CHANGE notifier for underlying device
is fired due to change propagated from team driver (e.g. MTU
change). It's because callbacks like team_change_mtu() or
team_vlan_rx_{add,del}_vid() protect their port list traversal
by team->lock mutex.

Example (r8169 case where this driver disables TSO for certain MTU
values):
...
[ 6391.348202]  __mutex_lock.isra.6+0x2d0/0x4a0
[ 6391.358602]  team_device_event+0x9d/0x160 [team]
[ 6391.363756]  notifier_call_chain+0x47/0x70
[ 6391.368329]  netdev_update_features+0x56/0x60
[ 6391.373207]  rtl8169_change_mtu+0x14/0x50 [r8169]
[ 6391.378457]  dev_set_mtu_ext+0xe1/0x1d0
[ 6391.387022]  dev_set_mtu+0x52/0x90
[ 6391.390820]  team_change_mtu+0x64/0xf0 [team]
[ 6391.395683]  dev_set_mtu_ext+0xe1/0x1d0
[ 6391.399963]  do_setlink+0x231/0xf50
...

In fact team_compute_features() called from team_device_event()
does not need to be protected by team->lock mutex and rcu_read_lock()
is sufficient there for port list traversal.

Fixes: 3d249d4ca7 ("net: introduce ethernet teaming device")
Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20210125074416.4056484-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:27 +01:00
Pan Bian
cff34a4337 NFC: fix possible resource leak
commit d8f923c3ab upstream.

Put the device to avoid resource leak on path that the polling flag is
invalid.

Fixes: a831b91320 ("NFC: Do not return EBUSY when stopping a poll that's already stopped")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Link: https://lore.kernel.org/r/20210121153745.122184-1-bianpan2016@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:27 +01:00
Pan Bian
044c4a5b5b NFC: fix resource leak when target index is invalid
commit 3a30537cee upstream.

Goto to the label put_dev instead of the label error to fix potential
resource leak on path that the target index is invalid.

Fixes: c4fbb6515a ("NFC: The core part should generate the target index")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Link: https://lore.kernel.org/r/20210121152748.98409-1-bianpan2016@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:26 +01:00
Takeshi Misawa
a231312093 rxrpc: Fix memory leak in rxrpc_lookup_local
commit b8323f7288 upstream.

Commit 9ebeddef58 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record")
Then release ref in __rxrpc_put_peer and rxrpc_put_peer_locked.

	struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp)
	-               peer->local = local;
	+               peer->local = rxrpc_get_local(local);

rxrpc_discard_prealloc also need ref release in discarding.

syzbot report:
BUG: memory leak
unreferenced object 0xffff8881080ddc00 (size 256):
  comm "syz-executor339", pid 8462, jiffies 4294942238 (age 12.350s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 0a 00 00 00 00 c0 00 08 81 88 ff ff  ................
  backtrace:
    [<000000002b6e495f>] kmalloc include/linux/slab.h:552 [inline]
    [<000000002b6e495f>] kzalloc include/linux/slab.h:682 [inline]
    [<000000002b6e495f>] rxrpc_alloc_local net/rxrpc/local_object.c:79 [inline]
    [<000000002b6e495f>] rxrpc_lookup_local+0x1c1/0x760 net/rxrpc/local_object.c:244
    [<000000006b43a77b>] rxrpc_bind+0x174/0x240 net/rxrpc/af_rxrpc.c:149
    [<00000000fd447a55>] afs_open_socket+0xdb/0x200 fs/afs/rxrpc.c:64
    [<000000007fd8867c>] afs_net_init+0x2b4/0x340 fs/afs/main.c:126
    [<0000000063d80ec1>] ops_init+0x4e/0x190 net/core/net_namespace.c:152
    [<00000000073c5efa>] setup_net+0xde/0x2d0 net/core/net_namespace.c:342
    [<00000000a6744d5b>] copy_net_ns+0x19f/0x3e0 net/core/net_namespace.c:483
    [<0000000017d3aec3>] create_new_namespaces+0x199/0x4f0 kernel/nsproxy.c:110
    [<00000000186271ef>] unshare_nsproxy_namespaces+0x9b/0x120 kernel/nsproxy.c:226
    [<000000002de7bac4>] ksys_unshare+0x2fe/0x5c0 kernel/fork.c:2957
    [<00000000349b12ba>] __do_sys_unshare kernel/fork.c:3025 [inline]
    [<00000000349b12ba>] __se_sys_unshare kernel/fork.c:3023 [inline]
    [<00000000349b12ba>] __x64_sys_unshare+0x12/0x20 kernel/fork.c:3023
    [<000000006d178ef7>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    [<00000000637076d4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 9ebeddef58 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record")
Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com>
Reported-and-tested-by: syzbot+305326672fed51b205f7@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/161183091692.3506637.3206605651502458810.stgit@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:26 +01:00
Bartosz Golaszewski
115d2addec iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built
commit 9def3b1a07 upstream.

Since commit c40aaaac10 ("iommu/vt-d: Gracefully handle DMAR units
with no supported address widths") dmar.c needs struct iommu_device to
be selected. We can drop this dependency by not dereferencing struct
iommu_device if IOMMU_API is not selected and by reusing the information
stored in iommu->drhd->ignored instead.

This fixes the following build error when IOMMU_API is not selected:

drivers/iommu/dmar.c: In function ‘free_iommu’:
drivers/iommu/dmar.c:1139:41: error: ‘struct iommu_device’ has no member named ‘ops’
 1139 |  if (intel_iommu_enabled && iommu->iommu.ops) {
                                                ^

Fixes: c40aaaac10 ("iommu/vt-d: Gracefully handle DMAR units with no supported address widths")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20201013073055.11262-1-brgl@bgdev.pl
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[ - context change due to moving drivers/iommu/dmar.c to
    drivers/iommu/intel/dmar.c
  - set the drhr in the iommu like in upstream commit b1012ca8dc
    ("iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu") ]
Signed-off-by: Filippo Sironi <sironi@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:26 +01:00
David Woodhouse
34f0e90856 iommu/vt-d: Gracefully handle DMAR units with no supported address widths
commit c40aaaac10 upstream.

Instead of bailing out completely, such a unit can still be used for
interrupt remapping.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/linux-iommu/549928db2de6532117f36c9c810373c14cf76f51.camel@infradead.org/
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[ context change due to moving drivers/iommu/dmar.c to
  drivers/iommu/intel/dmar.c ]
Signed-off-by: Filippo Sironi <sironi@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:26 +01:00
Dan Carpenter
184666c9a3 can: dev: prevent potential information leak in can_fill_info()
[ Upstream commit b552766c87 ]

The "bec" struct isn't necessarily always initialized. For example, the
mcp251xfd_get_berr_counter() function doesn't initialize anything if the
interface is down.

Fixes: 52c793f240 ("can: netlink support for bus-error reporting and counters")
Link: https://lore.kernel.org/r/YAkaRdRJncsJO8Ve@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:26 +01:00
Roi Dayan
8cbe9e3567 net/mlx5: Fix memory leak on flow table creation error flow
[ Upstream commit 487c6ef81e ]

When we create the ft object we also init rhltable in ft->fgs_hash.
So in error flow before kfree of ft we need to destroy that rhltable.

Fixes: 693c6883bb ("net/mlx5: Add hash table for flow groups in flow table")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:26 +01:00
Johannes Berg
b26b5e0861 mac80211: pause TX while changing interface type
[ Upstream commit 054c9939b4 ]

syzbot reported a crash that happened when changing the interface
type around a lot, and while it might have been easy to fix just
the symptom there, a little deeper investigation found that really
the reason is that we allowed packets to be transmitted while in
the middle of changing the interface type.

Disallow TX by stopping the queues while changing the type.

Fixes: 34d4bc4d41 ("mac80211: support runtime interface type changes")
Reported-by: syzbot+d7a3b15976bf7de2238a@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20210122171115.b321f98f4d4f.I6997841933c17b093535c31d29355be3c0c39628@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:26 +01:00
Johannes Berg
2f48393f90 iwlwifi: pcie: reschedule in long-running memory reads
[ Upstream commit 3d372c4edf ]

If we spin for a long time in memory reads that (for some reason in
hardware) take a long time, then we'll eventually get messages such
as

  watchdog: BUG: soft lockup - CPU#2 stuck for 24s! [kworker/2:2:272]

This is because the reading really does take a very long time, and
we don't schedule, so we're hogging the CPU with this task, at least
if CONFIG_PREEMPT is not set, e.g. with CONFIG_PREEMPT_VOLUNTARY=y.

Previously I misinterpreted the situation and thought that this was
only going to happen if we had interrupts disabled, and then fixed
this (which is good anyway, however), but that didn't always help;
looking at it again now I realized that the spin unlock will only
reschedule if CONFIG_PREEMPT is used.

In order to avoid this issue, change the code to cond_resched() if
we've been spinning for too long here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: 04516706bb ("iwlwifi: pcie: limit memory read spin time")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210115130253.217a9d6a6a12.If964cb582ab0aaa94e81c4ff3b279eaafda0fd3f@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:26 +01:00
Johannes Berg
413f388045 iwlwifi: pcie: use jiffies for memory read spin time limit
[ Upstream commit 6701317476 ]

There's no reason to use ktime_get() since we don't need any better
precision than jiffies, and since we no longer disable interrupts
around this code (when grabbing NIC access), jiffies will work fine.
Use jiffies instead of ktime_get().

This cleanup is preparation for the following patch "iwlwifi: pcie: reschedule
in long-running memory reads". The code gets simpler with the weird clock use
etc. removed before we add cond_resched().

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210115130253.621c948b1fad.I3ee9f4bc4e74a0c9125d42fb7c35cd80df4698a1@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:26 +01:00
Trond Myklebust
8a9ccffb3c pNFS/NFSv4: Fix a layout segment leak in pnfs_layout_process()
[ Upstream commit 814b849713 ]

If the server returns a new stateid that does not match the one in our
cache, then pnfs_layout_process() will leak the layout segments returned
by pnfs_mark_layout_stateid_invalid().

Fixes: 9888d837f3 ("pNFS: Force a retry of LAYOUTGET if the stateid doesn't match our cache")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:26 +01:00
Kamal Heib
f518b796f4 RDMA/cxgb4: Fix the reported max_recv_sge value
[ Upstream commit a372173bf3 ]

The max_recv_sge value is wrongly reported when calling query_qp, This is
happening due to a typo when assigning the max_recv_sge value, the value
of sq_max_sges was assigned instead of rq_max_sges.

Fixes: 3e5c02c9ef ("iw_cxgb4: Support query_qp() verb")
Link: https://lore.kernel.org/r/20210114191423.423529-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:25 +01:00
Eyal Birger
5292ab5172 xfrm: fix disable_xfrm sysctl when used on xfrm interfaces
[ Upstream commit 9f8550e4bd ]

The disable_xfrm flag signals that xfrm should not be performed during
routing towards a device before reaching device xmit.

For xfrm interfaces this is usually desired as they perform the outbound
policy lookup as part of their xmit using their if_id.

Before this change enabling this flag on xfrm interfaces prevented them
from xmitting as xfrm_lookup_with_ifid() would not perform a policy lookup
in case the original dst had the DST_NOXFRM flag.

This optimization is incorrect when the lookup is done by the xfrm
interface xmit logic.

Fix by performing policy lookup when invoked by xfrmi as if_id != 0.

Similarly it's unlikely for the 'no policy exists on net' check to yield
any performance benefits when invoked from xfrmi.

Fixes: f203b76d78 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:25 +01:00
Shmulik Ladkani
6790d6f3cb xfrm: Fix oops in xfrm_replay_advance_bmp
[ Upstream commit 56ce7c25ae ]

When setting xfrm replay_window to values higher than 32, a rare
page-fault occurs in xfrm_replay_advance_bmp:

  BUG: unable to handle page fault for address: ffff8af350ad7920
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD ad001067 P4D ad001067 PUD 0
  Oops: 0002 [#1] SMP PTI
  CPU: 3 PID: 30 Comm: ksoftirqd/3 Kdump: loaded Not tainted 5.4.52-050452-generic #202007160732
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  RIP: 0010:xfrm_replay_advance_bmp+0xbb/0x130
  RSP: 0018:ffffa1304013ba40 EFLAGS: 00010206
  RAX: 000000000000010d RBX: 0000000000000002 RCX: 00000000ffffff4b
  RDX: 0000000000000018 RSI: 00000000004c234c RDI: 00000000ffb3dbff
  RBP: ffffa1304013ba50 R08: ffff8af330ad7920 R09: 0000000007fffffa
  R10: 0000000000000800 R11: 0000000000000010 R12: ffff8af29d6258c0
  R13: ffff8af28b95c700 R14: 0000000000000000 R15: ffff8af29d6258fc
  FS:  0000000000000000(0000) GS:ffff8af339ac0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff8af350ad7920 CR3: 0000000015ee4000 CR4: 00000000001406e0
  Call Trace:
   xfrm_input+0x4e5/0xa10
   xfrm4_rcv_encap+0xb5/0xe0
   xfrm4_udp_encap_rcv+0x140/0x1c0

Analysis revealed offending code is when accessing:

	replay_esn->bmp[nr] |= (1U << bitnr);

with 'nr' being 0x07fffffa.

This happened in an SMP system when reordering of packets was present;
A packet arrived with a "too old" sequence number (outside the window,
i.e 'diff > replay_window'), and therefore the following calculation:

			bitnr = replay_esn->replay_window - (diff - pos);

yields a negative result, but since bitnr is u32 we get a large unsigned
quantity (in crash dump above: 0xffffff4b seen in ecx).

This was supposed to be protected by xfrm_input()'s former call to:

		if (x->repl->check(x, skb, seq)) {

However, the state's spinlock x->lock is *released* after '->check()'
is performed, and gets re-acquired before '->advance()' - which gives a
chance for a different core to update the xfrm state, e.g. by advancing
'replay_esn->seq' when it encounters more packets - leading to a
'diff > replay_window' situation when original core continues to
xfrm_replay_advance_bmp().

An attempt to fix this issue was suggested in commit bcf66bf54a
("xfrm: Perform a replay check after return from async codepaths"),
by calling 'x->repl->recheck()' after lock is re-acquired, but fix
applied only to asyncronous crypto algorithms.

Augment the fix, by *always* calling 'recheck()' - irrespective if we're
using async crypto.

Fixes: 0ebea8ef35 ("[IPSEC]: Move state lock into x->type->input")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:23:25 +01:00
Pablo Neira Ayuso
95b1e6e1ff netfilter: nft_dynset: add timeout extension to template
commit 0c5b7a501e upstream.

Otherwise, the newly create element shows no timeout when listing the
ruleset. If the set definition does not specify a default timeout, then
the set element only shows the expiration time, but not the timeout.
This is a problem when restoring a stateful ruleset listing since it
skips the timeout policy entirely.

Fixes: 22fe54d5fe ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:25 +01:00
Max Krummenacher
ccdabbf516 ARM: imx: build suspend-imx6.S with arm instruction set
commit a88afa46b8 upstream.

When the kernel is configured to use the Thumb-2 instruction set
"suspend-to-memory" fails to resume. Observed on a Colibri iMX6ULL
(i.MX 6ULL) and Apalis iMX6 (i.MX 6Q).

It looks like the CPU resumes unconditionally in ARM instruction mode
and then chokes on the presented Thumb-2 code it should execute.

Fix this by using the arm instruction set for all code in
suspend-imx6.S.

Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>
Fixes: df595746fa ("ARM: imx: add suspend in ocram support for i.mx6q")
Acked-by: Oleksandr Suvorov <oleksandr.suvorov@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:25 +01:00
Roger Pau Monne
a4af550350 xen-blkfront: allow discard-* nodes to be optional
commit 0549cd67b0 upstream.

This is inline with the specification described in blkif.h:

 * discard-granularity: should be set to the physical block size if
   node is not present.
 * discard-alignment, discard-secure: should be set to 0 if node not
   present.

This was detected as QEMU would only create the discard-granularity
node but not discard-alignment, and thus the setup done in
blkfront_setup_discard would fail.

Fix blkfront_setup_discard to not fail on missing nodes, and also fix
blkif_set_queue_limits to set the discard granularity to the physical
block size if none is specified in xenbus.

Fixes: ed30bf317c ('xen-blkfront: Handle discard requests.')
Reported-by: Arthur Borsboom <arthurborsboom@gmail.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-By: Arthur Borsboom <arthurborsboom@gmail.com>
Link: https://lore.kernel.org/r/20210119105727.95173-1-roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:25 +01:00
Lorenzo Bianconi
6974d9cf9d mt7601u: fix rx buffer refcounting
commit d24c790577 upstream.

Fix the following crash due to erroneous page refcounting:

[   32.445919] BUG: Bad page state in process swapper/1  pfn:11f65a
[   32.447409] page:00000000938f0632 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x11f65a
[   32.449605] flags: 0x8000000000000000()
[   32.450421] raw: 8000000000000000 ffffffff825b0148 ffffea00045ae988 0000000000000000
[   32.451795] raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
[   32.452999] page dumped because: nonzero mapcount
[   32.453888] Modules linked in:
[   32.454492] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.11.0-rc2+ #1976
[   32.455695] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[   32.457157] Call Trace:
[   32.457636]  <IRQ>
[   32.457993]  dump_stack+0x77/0x97
[   32.458576]  bad_page.cold+0x65/0x96
[   32.459198]  get_page_from_freelist+0x46a/0x11f0
[   32.460008]  __alloc_pages_nodemask+0x10a/0x2b0
[   32.460794]  mt7601u_rx_tasklet+0x651/0x720
[   32.461505]  tasklet_action_common.constprop.0+0x6b/0xd0
[   32.462343]  __do_softirq+0x152/0x46c
[   32.462928]  asm_call_irq_on_stack+0x12/0x20
[   32.463610]  </IRQ>
[   32.463953]  do_softirq_own_stack+0x5b/0x70
[   32.464582]  irq_exit_rcu+0x9f/0xe0
[   32.465028]  common_interrupt+0xae/0x1a0
[   32.465536]  asm_common_interrupt+0x1e/0x40
[   32.466071] RIP: 0010:default_idle+0x18/0x20
[   32.468981] RSP: 0018:ffffc90000077f00 EFLAGS: 00000246
[   32.469648] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[   32.470550] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81aac3dd
[   32.471463] RBP: ffff88810022ab00 R08: 0000000000000001 R09: 0000000000000001
[   32.472335] R10: 0000000000000046 R11: 0000000000005aa0 R12: 0000000000000000
[   32.473235] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   32.474139]  ? default_idle_call+0x4d/0x200
[   32.474681]  default_idle_call+0x74/0x200
[   32.475192]  do_idle+0x1d5/0x250
[   32.475612]  cpu_startup_entry+0x19/0x20
[   32.476114]  secondary_startup_64_no_verify+0xb0/0xbb
[   32.476765] Disabling lock debugging due to kernel taint

Fixes: c869f77d6a ("add mt7601u driver")
Co-developed-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/62b2380c8c2091834cfad05e1059b55f945bd114.1610643952.git.lorenzo@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:25 +01:00
Lorenzo Bianconi
845f178572 mt7601u: fix kernel crash unplugging the device
commit 0acb20a543 upstream.

The following crash log can occur unplugging the usb dongle since,
after the urb poison in mt7601u_free_tx_queue(), usb_submit_urb() will
always fail resulting in a skb kfree while the skb has been already
queued.

Fix the issue enqueuing the skb only if usb_submit_urb() succeed.

Hardware name: Hewlett-Packard 500-539ng/2B2C, BIOS 80.06 04/01/2015
Workqueue: usb_hub_wq hub_event
RIP: 0010:skb_trim+0x2c/0x30
RSP: 0000:ffffb4c88005bba8 EFLAGS: 00010206
RAX: 000000004ad483ee RBX: ffff9a236625dee0 RCX: 000000000000662f
RDX: 000000000000000c RSI: 0000000000000000 RDI: ffff9a2343179300
RBP: ffff9a2343179300 R08: 0000000000000001 R09: 0000000000000000
R10: ffff9a23748f7840 R11: 0000000000000001 R12: ffff9a236625e4d4
R13: ffff9a236625dee0 R14: 0000000000001080 R15: 0000000000000008
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd410a34ef8 CR3: 00000001416ee001 CR4: 00000000001706f0
Call Trace:
 mt7601u_tx_status+0x3e/0xa0 [mt7601u]
 mt7601u_dma_cleanup+0xca/0x110 [mt7601u]
 mt7601u_cleanup+0x22/0x30 [mt7601u]
 mt7601u_disconnect+0x22/0x60 [mt7601u]
 usb_unbind_interface+0x8a/0x270
 ? kernfs_find_ns+0x35/0xd0
 __device_release_driver+0x17a/0x230
 device_release_driver+0x24/0x30
 bus_remove_device+0xdb/0x140
 device_del+0x18b/0x430
 ? kobject_put+0x98/0x1d0
 usb_disable_device+0xc6/0x1f0
 usb_disconnect.cold+0x7e/0x20a
 hub_event+0xbf3/0x1870
 process_one_work+0x1b6/0x350
 worker_thread+0x53/0x3e0
 ? process_one_work+0x350/0x350
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30

Fixes: 23377c200b ("mt7601u: fix possible memory leak when the device is disconnected")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/3b85219f669a63a8ced1f43686de05915a580489.1610919247.git.lorenzo@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:25 +01:00
Andrea Righi
62b47c35c2 leds: trigger: fix potential deadlock with libata
commit 27af8e2c90 upstream.

We have the following potential deadlock condition:

 ========================================================
 WARNING: possible irq lock inversion dependency detected
 5.10.0-rc2+ #25 Not tainted
 --------------------------------------------------------
 swapper/3/0 just changed the state of lock:
 ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
 but this lock took another, HARDIRQ-READ-unsafe lock in the past:
  (&trig->leddev_list_lock){.+.?}-{2:2}

 and interrupts could create inverse lock ordering between them.

 other info that might help us debug this:
  Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&trig->leddev_list_lock);
                                local_irq_disable();
                                lock(&host->lock);
                                lock(&trig->leddev_list_lock);
   <Interrupt>
     lock(&host->lock);

  *** DEADLOCK ***

 no locks held by swapper/3/0.

 the shortest dependencies between 2nd lock and 1st lock:
  -> (&trig->leddev_list_lock){.+.?}-{2:2} ops: 46 {
     HARDIRQ-ON-R at:
                       lock_acquire+0x15f/0x420
                       _raw_read_lock+0x42/0x90
                       led_trigger_event+0x2b/0x70
                       rfkill_global_led_trigger_worker+0x94/0xb0
                       process_one_work+0x240/0x560
                       worker_thread+0x58/0x3d0
                       kthread+0x151/0x170
                       ret_from_fork+0x1f/0x30
     IN-SOFTIRQ-R at:
                       lock_acquire+0x15f/0x420
                       _raw_read_lock+0x42/0x90
                       led_trigger_event+0x2b/0x70
                       kbd_bh+0x9e/0xc0
                       tasklet_action_common.constprop.0+0xe9/0x100
                       tasklet_action+0x22/0x30
                       __do_softirq+0xcc/0x46d
                       run_ksoftirqd+0x3f/0x70
                       smpboot_thread_fn+0x116/0x1f0
                       kthread+0x151/0x170
                       ret_from_fork+0x1f/0x30
     SOFTIRQ-ON-R at:
                       lock_acquire+0x15f/0x420
                       _raw_read_lock+0x42/0x90
                       led_trigger_event+0x2b/0x70
                       rfkill_global_led_trigger_worker+0x94/0xb0
                       process_one_work+0x240/0x560
                       worker_thread+0x58/0x3d0
                       kthread+0x151/0x170
                       ret_from_fork+0x1f/0x30
     INITIAL READ USE at:
                           lock_acquire+0x15f/0x420
                           _raw_read_lock+0x42/0x90
                           led_trigger_event+0x2b/0x70
                           rfkill_global_led_trigger_worker+0x94/0xb0
                           process_one_work+0x240/0x560
                           worker_thread+0x58/0x3d0
                           kthread+0x151/0x170
                           ret_from_fork+0x1f/0x30
   }
   ... key      at: [<ffffffff83da4c00>] __key.0+0x0/0x10
   ... acquired at:
    _raw_read_lock+0x42/0x90
    led_trigger_blink_oneshot+0x3b/0x90
    ledtrig_disk_activity+0x3c/0xa0
    ata_qc_complete+0x26/0x450
    ata_do_link_abort+0xa3/0xe0
    ata_port_freeze+0x2e/0x40
    ata_hsm_qc_complete+0x94/0xa0
    ata_sff_hsm_move+0x177/0x7a0
    ata_sff_pio_task+0xc7/0x1b0
    process_one_work+0x240/0x560
    worker_thread+0x58/0x3d0
    kthread+0x151/0x170
    ret_from_fork+0x1f/0x30

 -> (&host->lock){-...}-{2:2} ops: 69 {
    IN-HARDIRQ-W at:
                     lock_acquire+0x15f/0x420
                     _raw_spin_lock_irqsave+0x52/0xa0
                     ata_bmdma_interrupt+0x27/0x200
                     __handle_irq_event_percpu+0xd5/0x2b0
                     handle_irq_event+0x57/0xb0
                     handle_edge_irq+0x8c/0x230
                     asm_call_irq_on_stack+0xf/0x20
                     common_interrupt+0x100/0x1c0
                     asm_common_interrupt+0x1e/0x40
                     native_safe_halt+0xe/0x10
                     arch_cpu_idle+0x15/0x20
                     default_idle_call+0x59/0x1c0
                     do_idle+0x22c/0x2c0
                     cpu_startup_entry+0x20/0x30
                     start_secondary+0x11d/0x150
                     secondary_startup_64_no_verify+0xa6/0xab
    INITIAL USE at:
                    lock_acquire+0x15f/0x420
                    _raw_spin_lock_irqsave+0x52/0xa0
                    ata_dev_init+0x54/0xe0
                    ata_link_init+0x8b/0xd0
                    ata_port_alloc+0x1f1/0x210
                    ata_host_alloc+0xf1/0x130
                    ata_host_alloc_pinfo+0x14/0xb0
                    ata_pci_sff_prepare_host+0x41/0xa0
                    ata_pci_bmdma_prepare_host+0x14/0x30
                    piix_init_one+0x21f/0x600
                    local_pci_probe+0x48/0x80
                    pci_device_probe+0x105/0x1c0
                    really_probe+0x221/0x490
                    driver_probe_device+0xe9/0x160
                    device_driver_attach+0xb2/0xc0
                    __driver_attach+0x91/0x150
                    bus_for_each_dev+0x81/0xc0
                    driver_attach+0x1e/0x20
                    bus_add_driver+0x138/0x1f0
                    driver_register+0x91/0xf0
                    __pci_register_driver+0x73/0x80
                    piix_init+0x1e/0x2e
                    do_one_initcall+0x5f/0x2d0
                    kernel_init_freeable+0x26f/0x2cf
                    kernel_init+0xe/0x113
                    ret_from_fork+0x1f/0x30
  }
  ... key      at: [<ffffffff83d9fdc0>] __key.6+0x0/0x10
  ... acquired at:
    __lock_acquire+0x9da/0x2370
    lock_acquire+0x15f/0x420
    _raw_spin_lock_irqsave+0x52/0xa0
    ata_bmdma_interrupt+0x27/0x200
    __handle_irq_event_percpu+0xd5/0x2b0
    handle_irq_event+0x57/0xb0
    handle_edge_irq+0x8c/0x230
    asm_call_irq_on_stack+0xf/0x20
    common_interrupt+0x100/0x1c0
    asm_common_interrupt+0x1e/0x40
    native_safe_halt+0xe/0x10
    arch_cpu_idle+0x15/0x20
    default_idle_call+0x59/0x1c0
    do_idle+0x22c/0x2c0
    cpu_startup_entry+0x20/0x30
    start_secondary+0x11d/0x150
    secondary_startup_64_no_verify+0xa6/0xab

This lockdep splat is reported after:
commit e918188611 ("locking: More accurate annotations for read_lock()")

To clarify:
 - read-locks are recursive only in interrupt context (when
   in_interrupt() returns true)
 - after acquiring host->lock in CPU1, another cpu (i.e. CPU2) may call
   write_lock(&trig->leddev_list_lock) that would be blocked by CPU0
   that holds trig->leddev_list_lock in read-mode
 - when CPU1 (ata_ac_complete()) tries to read-lock
   trig->leddev_list_lock, it would be blocked by the write-lock waiter
   on CPU2 (because we are not in interrupt context, so the read-lock is
   not recursive)
 - at this point if an interrupt happens on CPU0 and
   ata_bmdma_interrupt() is executed it will try to acquire host->lock,
   that is held by CPU1, that is currently blocked by CPU2, so:

   * CPU0 blocked by CPU1
   * CPU1 blocked by CPU2
   * CPU2 blocked by CPU0

     *** DEADLOCK ***

The deadlock scenario is better represented by the following schema
(thanks to Boqun Feng <boqun.feng@gmail.com> for the schema and the
detailed explanation of the deadlock condition):

 CPU 0:                          CPU 1:                        CPU 2:
 -----                           -----                         -----
 led_trigger_event():
   read_lock(&trig->leddev_list_lock);
 				<workqueue>
 				ata_hsm_qc_complete():
 				  spin_lock_irqsave(&host->lock);
 								write_lock(&trig->leddev_list_lock);
 				  ata_port_freeze():
 				    ata_do_link_abort():
 				      ata_qc_complete():
 					ledtrig_disk_activity():
 					  led_trigger_blink_oneshot():
 					    read_lock(&trig->leddev_list_lock);
 					    // ^ not in in_interrupt() context, so could get blocked by CPU 2
 <interrupt>
   ata_bmdma_interrupt():
     spin_lock_irqsave(&host->lock);

Fix by using read_lock_irqsave/irqrestore() in led_trigger_event(), so
that no interrupt can happen in between, preventing the deadlock
condition.

Apply the same change to led_trigger_blink_setup() as well, since the
same deadlock scenario can also happen in power_supply_update_bat_leds()
-> led_trigger_blink() -> led_trigger_blink_setup() (workqueue context),
and potentially prevent other similar usages.

Link: https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/
Fixes: eb25cb9956 ("leds: convert IDE trigger to common disk trigger")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:23:25 +01:00