commit f9619d5e51 upstream.
Depending on the number of online CPUs in the original kernel, it is
likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
in the affinity mappings provided by irq_create_affinity_masks() are
thus not started by irq_startup(), as per-design with managed IRQs.
This can be a problem with multi-queue block devices driven by blk-mq :
such a non-started IRQ is very likely paired with the single queue
enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
causes the device to remain silent and likely hangs the guest at
some point.
This is a regression caused by commit 9ea69a55b3 ("powerpc/pseries:
Pass MSI affinity to irq_create_mapping()"). Note that this only happens
with the XIVE interrupt controller because XICS has a workaround to bypass
affinity, which is activated during kdump with the "noirqdistrib" kernel
parameter.
The issue comes from a combination of factors:
- discrepancy between the number of queues detected by the multi-queue
block driver, that was used to create the MSI vectors, and the single
queue mode enforced later on by blk-mq because of kdump (i.e. keeping
all queues fixes the issue)
- CPU#0 offline (i.e. kdump always succeed with CPU#0)
Given that I couldn't reproduce on x86, which seems to always have CPU#0
online even during kdump, I'm not sure where this should be fixed. Hence
going for another approach : fine-grained affinity is for performance
and we don't really care about that during kdump. Simply revert to the
previous working behavior of ignoring affinity masks in this case only.
Fixes: 9ea69a55b3 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ae5fbd210 upstream.
Running "perf mem record" in powerpc platforms with selinux enabled
resulted in soft lockup's. Below call-trace was seen in the logs:
CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
NIP: c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000
REGS: c000007fffab7d60 TRAP: 0100 Not tainted (5.11.0-rc7+)
...
NIP _raw_spin_lock_irqsave+0x94/0x120
LR _raw_spin_lock_irqsave+0x90/0x120
Call Trace:
0xc00000000fd47260 (unreliable)
skb_queue_tail+0x3c/0x90
audit_log_end+0x6c/0x180
common_lsm_audit+0xb0/0xe0
slow_avc_audit+0xa4/0x110
avc_has_perm+0x1c4/0x260
selinux_perf_event_open+0x74/0xd0
security_perf_event_open+0x68/0xc0
record_and_restart+0x6e8/0x7f0
perf_event_interrupt+0x22c/0x560
performance_monitor_exception0x4c/0x60
performance_monitor_common_virt+0x1c8/0x1d0
interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120
NIP: c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0
REGS: c00000000fd47860 TRAP: 0f00 Not tainted (5.11.0-rc7+)
...
NIP _raw_spin_lock_irqsave+0x38/0x120
LR skb_queue_tail+0x3c/0x90
interrupt: f00
0x38 (unreliable)
0xc00000000aae6200
audit_log_end+0x6c/0x180
audit_log_exit+0x344/0xf80
__audit_syscall_exit+0x2c0/0x320
do_syscall_trace_leave+0x148/0x200
syscall_exit_prepare+0x324/0x390
system_call_common+0xfc/0x27c
The above trace shows that while the CPU was handling a performance
monitor exception, there was a call to security_perf_event_open()
function. In powerpc core-book3s, this function is called from
perf_allow_kernel() check during recording of data address in the
sample via perf_get_data_addr().
Commit da97e18458 ("perf_event: Add support for LSM and SELinux
checks") introduced security enhancements to perf. As part of this
commit, the new security hook for perf_event_open() was added in all
places where perf paranoid check was previously used. In powerpc
core-book3s code, originally had paranoid checks in
perf_get_data_addr() and power_pmu_bhrb_read(). So
perf_paranoid_kernel() checks were replaced with perf_allow_kernel()
in these PMU helper functions as well.
The intention of paranoid checks in core-book3s was to verify
privilege access before capturing some of the sample data. Along with
paranoid checks, perf_allow_kernel() also does a
security_perf_event_open(). Since these functions are accessed while
recording a sample, we end up calling selinux_perf_event_open() in PMI
context. Some of the security functions use spinlock like
sidtab_sid2str_put(). If a perf interrupt hits under a spin lock and
if we end up in calling selinux hook functions in PMI handler, this
could cause a dead lock.
Since the purpose of this security hook is to control access to
perf_event_open(), it is not right to call this in interrupt context.
The paranoid checks in powerpc core-book3s were done at interrupt time
which is also not correct.
Reference commits:
Commit cd1231d703 ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()")
Commit bb19af8160 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer")
We only allow creation of events that have already passed the
privilege checks in perf_event_open(). So these paranoid checks are
not needed at event time. As a fix, patch uses
'event->attr.exclude_kernel' check to prevent exposing kernel address
for userspace only sampling.
Fixes: cd1231d703 ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()")
Cc: stable@vger.kernel.org # v4.17+
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1614247839-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c33cb0020e upstream.
Apparently, <linux/netfilter/nfnetlink_cthelper.h> and
<linux/netfilter/nfnetlink_acct.h> could not be included into the same
compilation unit because of a cut-and-paste typo in the former header.
Fixes: 12f7a50533 ("netfilter: add user-space connection tracking helper infrastructure")
Cc: <stable@vger.kernel.org> # v3.6
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9e46f6c6c9 ]
This problem was reported on a SVM guest while executing kexec.
Kexec fails to load the new kernel when the PCID feature is enabled.
When kexec starts loading the new kernel, it starts the process by
resetting the vCPU's and then bringing each vCPU online one by one.
The vCPU reset is supposed to reset all the register states before the
vCPUs are brought online. However, the CR4 register is not reset during
this process. If this register is already setup during the last boot,
all the flags can remain intact. The X86_CR4_PCIDE bit can only be
enabled in long mode. So, it must be enabled much later in SMP
initialization. Having the X86_CR4_PCIDE bit set during SMP boot can
cause a boot failures.
Fix the issue by resetting the CR4 register in init_vmcb().
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <161471109108.30811.6392805173629704166.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fee03efc69 ]
This commit adds mixer quirks for the Pioneer DJM-900NXS2 mixer. This
device has 6 capture channels, 5 of them allow setting the signal
source. This adds controls for these, similar to the DJM-250Mk2.
However, playpack channels are not controllable via software like on the
250Mk2, as they can only be set manually on the mixing console.
Read-only controls showing the currently selected playback channels are
omitted.
Signed-off-by: Fabian Lesniak <fabian@lesniak-it.de>
Link: https://lore.kernel.org/r/20210205215116.258724-2-fabian@lesniak-it.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a07df82c79 ]
This allows for N different devices to use the pioneer mixer quirk for
setting capture/record type and recording level. The impementation has
not changed much with the exception of an additional mask on
private_value to allow storing of a device index:
DEVICE MASK 0xff000000
GROUP_MASK 0x00ff0000
VALUE_MASK 0x0000ffff
This could be improved by changing the arrays of wValues for each
channel to contain named definitions (e.g. SND_DJM_CAP_LINE). It would
improve readability and perhaps would allow using the same array for
multiple channels. The channel number can be specified on the control
next to the wIndex.
Feedback is very much appreciated as I'm not the most proficient C
programmer but am learning as I go.
Signed-off-by: Olivia Mackintosh <livvy@base.nu>
Link: https://lore.kernel.org/r/20210205184256.10201-2-livvy@base.nu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fc6a31b007 ]
The ITE8568 EC on the Voyo Winpad A15 presents itself as an I2C-HID
attached keyboard and mouse (which seems to never send any events).
This needs the I2C_HID_QUIRK_NO_IRQ_AFTER_RESET quirk, otherwise we get
the following errors:
[ 3688.770850] i2c_hid i2c-ITE8568:00: failed to reset device.
[ 3694.915865] i2c_hid i2c-ITE8568:00: failed to reset device.
[ 3701.059717] i2c_hid i2c-ITE8568:00: failed to reset device.
[ 3707.205944] i2c_hid i2c-ITE8568:00: failed to reset device.
[ 3708.227940] i2c_hid i2c-ITE8568:00: can't add hid device: -61
[ 3708.236518] i2c_hid: probe of i2c-ITE8568:00 failed with error -61
Which leads to a significant boot delay.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f03c30cb8 ]
The PC_DBG_ECO_CNTL register on the Adreno A5xx family gets
programmed to some different values on a per-model basis.
At least, this is what we intend to do here;
Unfortunately, though, this register is being overwritten with a
static magic number, right after applying the GPU-specific
configuration (including the GPU-specific quirks) and that is
effectively nullifying the efforts.
Let's remove the redundant and wrong write to the PC_DBG_ECO_CNTL
register in order to retain the wanted configuration for the
target GPU.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2b2bfc8aa5 ]
Some SoCs require a single scatterlist entry for smaller than page size,
i.e. 4KB. When dispatching commands with more than one scatterlist entry
under 4KB in size the following behavior is observed:
A command to read a block range is dispatched with two scatterlist entries
that are named AAA and BBB. After dispatching, the host builds two PRDT
entries and during transmission, device sends just one DATA IN because
device doesn't care about host DMA. The host then transfers the combined
amount of data from start address of the area named AAA. As a consequence,
the area that follows AAA in memory would be corrupted.
|<------------->|
+-------+------------ +-------+
+ AAA + (corrupted) ... + BBB +
+-------+------------ +-------+
To avoid this we need to enforce page size alignment for sg entries.
Link: https://lore.kernel.org/r/56dddef94f60bd9466fd77e69f64bbbd657ed2a1.1611026909.git.kwmad.kim@samsung.com
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1d0d2eb89 ]
The UniPro specification states that attribute IDs of the following
parameters are vendor-specific so some SoCs could have no regions at the
defined addresses:
- DME_LocalFC0ProtectionTimeOutVal
- DME_LocalTC0ReplayTimeOutVal
- DME_LocalAFC0ReqTimeOutVal
In addition, the following parameters should be set considering the
compatibility between host and device.
- PA_PWRMODEUSERDATA0
- PA_PWRMODEUSERDATA1
- PA_PWRMODEUSERDATA2
- PA_PWRMODEUSERDATA3
- PA_PWRMODEUSERDATA4
- PA_PWRMODEUSERDATA5
Introduce a quirk to allow vendor drivers to override the UniPro defaults.
Link: https://lore.kernel.org/r/1fedd3dea0ccc980913a5995a10510d86a5b01b9.1608513782.git.kwmad.kim@samsung.com
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4683d758f4 ]
Commit 7a873e4555 ("KVM: selftests: Verify supported CR4 bits can be set
before KVM_SET_CPUID2") reveals that KVM allows to set X86_CR4_PCIDE even
when PCID support is missing:
==== Test Assertion Failure ====
x86_64/set_sregs_test.c:41: rc
pid=6956 tid=6956 - Invalid argument
1 0x000000000040177d: test_cr4_feature_bit at set_sregs_test.c:41
2 0x00000000004014fc: main at set_sregs_test.c:119
3 0x00007f2d9346d041: ?? ??:0
4 0x000000000040164d: _start at ??:?
KVM allowed unsupported CR4 bit (0x20000)
Add X86_FEATURE_PCID feature check to __cr4_reserved_bits() to make
kvm_is_valid_cr4() fail.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210201142843.108190-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 448373d9db ]
Some platforms (e.g. TI) will not have any platform data which will
lead to NULL pointer dereference if we don't check for NULL pdata.
Fixes: 7cea965775 ("usb: cdns3: add quirk for enable runtime pm by default")
Reported-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Acked-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7cea965775 ]
Some vendors (eg: NXP) may want to enable runtime pm by default for
power saving, add one quirk for it.
Reviewed-by: Jun Li <jun.li@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed22764847 ]
cdns3 has some special PM sequence between xhci_bus_suspend and
xhci_suspend, add quirk to implement it.
Reviewed-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 25417185e9 ]
The GIGABYTE GB-BXBT-2807 is a mini-PC which uses off the shelf
components, like an Intel GPU which is meant for mobile systems.
As such, it, by default, has a backlight controller exposed.
Unfortunately, the backlight controller only confuses userspace, which
sees the existence of a backlight device node and has the unrealistic
belief that there is actually a backlight there!
Add a DMI quirk to force the backlight off on this system.
Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net>
Reviewed-by: Chris Chiu <chiu@endlessos.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dbf0b3a7b7 ]
On AMD Family 15h (Models 30h-3fh), I/O Memory Management Unit
RiSC engine sometimes stalls, requiring a reset.
As result, MythTV and w-scan won't scan channels on the AMD Kaveri
APU with the Hauppauge QuadHD TV tuner card.
For the solution I added the Input/Output Memory Management Unit's PCI
Identity of 0x1423 to the broken_dev_id[] array, which is used by
a quirks logic meant to fix similar problems with other AMD
chipsets.
Signed-off-by: Daniel Lee Kruse <daniel.lee.kruse@protonmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1008230f2a ]
Mayflash/Dragonrise seems to have yet another device ID for one of their
Gamecube controller adapters. Previous to this commit, the adapter
registered only one /dev/input/js* device, and all controller inputs (from
any controller) were mapped to this device. This patch defines the 1846
USB device ID and enables the HID_QUIRK_MULTI_INPUT quirk for it, which
fixes that (with the patch, four /dev/input/js* devices are created, one
for each of the four controller ports).
Signed-off-by: Ethan Warth <redyoshi49q@gmail.com>
Tested-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c54cb6c62 ]
Add support for SW_TABLET_MODE on the Acer Switch 10 (SW5-012) and the
acer Switch 10 (S1003) models.
There is no way to detect if this is supported, so this uses DMI based
quirks setting force_caps to ACER_CAP_KBD_DOCK (these devices have no
other acer-wmi based functionality).
The new SW_TABLET_MODE functionality can be tested on devices which
are not in the DMI table by passing acer_wmi.force_caps=0x40 on the
kernel commandline.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201019185628.264473-6-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 82cb8a5c39 ]
Not all devices supporting WMID_GUID3 support the wmid3_set_function_mode()
call, leading to errors like these:
[ 60.138358] acer_wmi: Enabling RF Button failed: 0x1 - 0xff
[ 60.140036] acer_wmi: Enabling Launch Manager failed: 0x1 - 0xff
Add an ACER_CAP_SET_FUNCTION_MODE capability flag, so that these calls
can be disabled through the new force_caps mechanism.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201019185628.264473-5-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 39aa009bb6 ]
Add a new force_caps module parameter to allow overriding the drivers
builtin capability detection mechanism.
This can be used to for example:
-Disable rfkill functionality on devices where there is an AA OEM DMI
record advertising non functional rfkill switches
-Force loading of the driver on devices with a missing AA OEM DMI record
Note that force_caps is -1 when unset, this allows forcing the
capability field to 0, which results in acer-wmi only providing WMI
hotkey handling while disabling all other (led, rfkill, backlight)
functionality.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201019185628.264473-4-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9feb0763e4 ]
Cleanup accelerometer device handling:
-Drop acer_wmi_accel_destroy instead directly call input_unregister_device.
-The information tracked by the CAP_ACCEL flag mirrors acer_wmi_accel_dev
being NULL. Drop the CAP flag, this is a preparation change for allowing
users to override the capability flags. Dropping the flag stops users
from causing a NULL pointer dereference by forcing the capability.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201019185628.264473-3-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cfeeea60af ]
We need to enable no-reset-on-init quirk for GPMC if the config
option for CONFIG_OMAP_GPMC_DEBUG is set. Otherwise the GPMC
driver code is unable to show the bootloader configured timings.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4add4d988f ]
If a reset is performed, but even the reset fails for some reasons (e.g.,
on Surface devices, the fw reset requires another quirks),
cancel_work_sync() hangs in mwifiex_cleanup_pcie().
# firmware went into a bad state
[...]
[ 1608.281690] mwifiex_pcie 0000:03:00.0: info: shutdown mwifiex...
[ 1608.282724] mwifiex_pcie 0000:03:00.0: rx_pending=0, tx_pending=1, cmd_pending=0
[ 1608.292400] mwifiex_pcie 0000:03:00.0: PREP_CMD: card is removed
[ 1608.292405] mwifiex_pcie 0000:03:00.0: PREP_CMD: card is removed
# reset performed after firmware went into a bad state
[ 1609.394320] mwifiex_pcie 0000:03:00.0: WLAN FW already running! Skip FW dnld
[ 1609.394335] mwifiex_pcie 0000:03:00.0: WLAN FW is active
# but even the reset failed
[ 1619.499049] mwifiex_pcie 0000:03:00.0: mwifiex_cmd_timeout_func: Timeout cmd id = 0xfa, act = 0xe000
[ 1619.499094] mwifiex_pcie 0000:03:00.0: num_data_h2c_failure = 0
[ 1619.499103] mwifiex_pcie 0000:03:00.0: num_cmd_h2c_failure = 0
[ 1619.499110] mwifiex_pcie 0000:03:00.0: is_cmd_timedout = 1
[ 1619.499117] mwifiex_pcie 0000:03:00.0: num_tx_timeout = 0
[ 1619.499124] mwifiex_pcie 0000:03:00.0: last_cmd_index = 0
[ 1619.499133] mwifiex_pcie 0000:03:00.0: last_cmd_id: fa 00 07 01 07 01 07 01 07 01
[ 1619.499140] mwifiex_pcie 0000:03:00.0: last_cmd_act: 00 e0 00 00 00 00 00 00 00 00
[ 1619.499147] mwifiex_pcie 0000:03:00.0: last_cmd_resp_index = 3
[ 1619.499155] mwifiex_pcie 0000:03:00.0: last_cmd_resp_id: 07 81 07 81 07 81 07 81 07 81
[ 1619.499162] mwifiex_pcie 0000:03:00.0: last_event_index = 2
[ 1619.499169] mwifiex_pcie 0000:03:00.0: last_event: 58 00 58 00 58 00 58 00 58 00
[ 1619.499177] mwifiex_pcie 0000:03:00.0: data_sent=0 cmd_sent=1
[ 1619.499185] mwifiex_pcie 0000:03:00.0: ps_mode=0 ps_state=0
[ 1619.499215] mwifiex_pcie 0000:03:00.0: info: _mwifiex_fw_dpc: unregister device
# mwifiex_pcie_work hang happening
[ 1823.233923] INFO: task kworker/3:1:44 blocked for more than 122 seconds.
[ 1823.233932] Tainted: G WC OE 5.10.0-rc1-1-mainline #1
[ 1823.233935] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1823.233940] task:kworker/3:1 state:D stack: 0 pid: 44 ppid: 2 flags:0x00004000
[ 1823.233960] Workqueue: events mwifiex_pcie_work [mwifiex_pcie]
[ 1823.233965] Call Trace:
[ 1823.233981] __schedule+0x292/0x820
[ 1823.233990] schedule+0x45/0xe0
[ 1823.233995] schedule_timeout+0x11c/0x160
[ 1823.234003] wait_for_completion+0x9e/0x100
[ 1823.234012] __flush_work.isra.0+0x156/0x210
[ 1823.234018] ? flush_workqueue_prep_pwqs+0x130/0x130
[ 1823.234026] __cancel_work_timer+0x11e/0x1a0
[ 1823.234035] mwifiex_cleanup_pcie+0x28/0xd0 [mwifiex_pcie]
[ 1823.234049] mwifiex_free_adapter+0x24/0xe0 [mwifiex]
[ 1823.234060] _mwifiex_fw_dpc+0x294/0x560 [mwifiex]
[ 1823.234074] mwifiex_reinit_sw+0x15d/0x300 [mwifiex]
[ 1823.234080] mwifiex_pcie_reset_done+0x50/0x80 [mwifiex_pcie]
[ 1823.234087] pci_try_reset_function+0x5c/0x90
[ 1823.234094] process_one_work+0x1d6/0x3a0
[ 1823.234100] worker_thread+0x4d/0x3d0
[ 1823.234107] ? rescuer_thread+0x410/0x410
[ 1823.234112] kthread+0x142/0x160
[ 1823.234117] ? __kthread_bind_mask+0x60/0x60
[ 1823.234124] ret_from_fork+0x22/0x30
[...]
This is a deadlock caused by calling cancel_work_sync() in
mwifiex_cleanup_pcie():
- Device resets are done via mwifiex_pcie_card_reset()
- which schedules card->work to call mwifiex_pcie_card_reset_work()
- which calls pci_try_reset_function().
- This leads to mwifiex_pcie_reset_done() be called on the same workqueue,
which in turn calls
- mwifiex_reinit_sw() and that calls
- _mwifiex_fw_dpc().
The problem is now that _mwifiex_fw_dpc() calls mwifiex_free_adapter()
in case firmware initialization fails. That ends up calling
mwifiex_cleanup_pcie().
Note that all those calls are still running on the workqueue. So when
mwifiex_cleanup_pcie() now calls cancel_work_sync(), it's really waiting
on itself to complete, causing a deadlock.
This commit fixes the deadlock by skipping cancel_work_sync() on a reset
failure path.
After this commit, when reset fails, the following output is
expected to be shown:
kernel: mwifiex_pcie 0000:03:00.0: info: _mwifiex_fw_dpc: unregister device
kernel: mwifiex: Failed to bring up adapter: -5
kernel: mwifiex_pcie 0000:03:00.0: reinit failed: -5
To reproduce this issue, for example, try putting the root port of wifi
into D3 (replace "00:1d.3" with your setup).
# put into D3 (root port)
sudo setpci -v -s 00:1d.3 CAP_PM+4.b=0b
Cc: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201028142346.18355-1-kitakar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 547801380e ]
WCN3991 supports connectable advertisements so we need to add the valid
le states quirk so the 'central-peripheral' role is exposed in
userspace.
Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 4d14c5cde5 upstream
Calling btrfs_qgroup_reserve_meta_prealloc from
btrfs_delayed_inode_reserve_metadata can result in flushing delalloc
while holding a transaction and delayed node locks. This is deadlock
prone. In the past multiple commits:
* ae5e070eac ("btrfs: qgroup: don't try to wait flushing if we're
already holding a transaction")
* 6f23277a49 ("btrfs: qgroup: don't commit transaction when we already
hold the handle")
Tried to solve various aspects of this but this was always a
whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock
scenario involving btrfs_delayed_node::mutex. Namely, one thread
can call btrfs_dirty_inode as a result of reading a file and modifying
its atime:
PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test"
#0 __schedule at ffffffffa529e07d
#1 schedule at ffffffffa529e4ff
#2 schedule_timeout at ffffffffa52a1bdd
#3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held
#4 start_delalloc_inodes at ffffffffc0380db5
#5 btrfs_start_delalloc_snapshot at ffffffffc0393836
#6 try_flush_qgroup at ffffffffc03f04b2
#7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes.
#8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex
#9 btrfs_update_inode at ffffffffc0385ba8
#10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED
#11 touch_atime at ffffffffa4cf0000
#12 generic_file_read_iter at ffffffffa4c1f123
#13 new_sync_read at ffffffffa4ccdc8a
#14 vfs_read at ffffffffa4cd0849
#15 ksys_read at ffffffffa4cd0bd1
#16 do_syscall_64 at ffffffffa4a052eb
#17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c
This will cause an asynchronous work to flush the delalloc inodes to
happen which can try to acquire the same delayed_node mutex:
PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30"
#0 __schedule at ffffffffa529e07d
#1 schedule at ffffffffa529e4ff
#2 schedule_preempt_disabled at ffffffffa529e80a
#3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up.
#4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex
#5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding
#6 cow_file_range_inline.constprop.78 at ffffffffc0386be7
#7 cow_file_range at ffffffffc03879c1
#8 btrfs_run_delalloc_range at ffffffffc038894c
#9 writepage_delalloc at ffffffffc03a3c8f
#10 __extent_writepage at ffffffffc03a4c01
#11 extent_write_cache_pages at ffffffffc03a500b
#12 extent_writepages at ffffffffc03a6de2
#13 do_writepages at ffffffffa4c277eb
#14 __filemap_fdatawrite_range at ffffffffa4c1e5bb
#15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes
#16 normal_work_helper at ffffffffc03b706c
#17 process_one_work at ffffffffa4aba4e4
#18 worker_thread at ffffffffa4aba6fd
#19 kthread at ffffffffa4ac0a3d
#20 ret_from_fork at ffffffffa54001ff
To fully address those cases the complete fix is to never issue any
flushing while holding the transaction or the delayed node lock. This
patch achieves it by calling qgroup_reserve_meta directly which will
either succeed without flushing or will fail and return -EDQUOT. In the
latter case that return value is going to be propagated to
btrfs_dirty_inode which will fallback to start a new transaction. That's
fine as the majority of time we expect the inode will have
BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly
copying the in-memory state.
Fixes: c53e965360 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[sudip: adjust context]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 778e45d772 upstream
The kernel test robot reported multiple linkage problems like this:
hppa64-linux-ld: init/main.o(.init.text+0x56c): cannot reach printk
init/main.o: in function `unknown_bootoption':
(.init.text+0x56c): relocation truncated to fit: R_PARISC_PCREL22F against
symbol `printk' defined in .text.unlikely section in kernel/printk/printk.o
There are two ways to solve it:
a) Enable the -mlong-call compiler option (CONFIG_MLONGCALLS),
b) Add long branch stub support in 64-bit linker.
While b) is the long-term solution, this patch works around the issue by
automatically enabling the CONFIG_MLONGCALLS option when
CONFIG_COMPILE_TEST is set, which indicates that a non-production kernel
(e.g. 0-day kernel) is built.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 00e35f2b0e ("parisc: Enable -mlong-calls gcc option by default when !CONFIG_MODULES")
Cc: stable@vger.kernel.org # v5.6+
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc22c1c058 upstream
My 2TB SKC2000 showed the exact same symptoms that were provided
in 538e4a8c57 ("nvme-pci: avoid the deepest sleep state on
Kingston A2000 SSDs"), i.e. a complete NVME lockup that needed
cold boot to get it back.
According to some sources, the A2000 is simply a rebadged
SKC2000 with a slightly optimized firmware.
Adding the SKC2000 PCI ID to the quirk list with the same workaround
as the A2000 made my laptop survive a 5 hours long Yocto bootstrap
buildfest which reliably triggered the SSD lockup previously.
Signed-off-by: Zoltán Böszörményi <zboszor@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[sudip: adjust context]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>