[ Upstream commit f787216f33 ]
The CPG/MSSR Clock Domain driver does not implement the
generic_pm_domain.power_{on,off}() callbacks, as the domain itself
cannot be powered down. Hence the domain should be marked as always-on
by setting the GENPD_FLAG_ALWAYS_ON flag, to prevent the core PM Domain
code from considering it for power-off, and doing unnessary processing.
Note that this only affects RZ/A2 SoCs. On R-Car Gen2 and Gen3 SoCs,
the R-Car SYSC driver handles Clock Domain creation, and offloads only
device attachment/detachment to the CPG/MSSR driver.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a459a184c9 ]
The CPG/MSTP Clock Domain driver does not implement the
generic_pm_domain.power_{on,off}() callbacks, as the domain itself
cannot be powered down. Hence the domain should be marked as always-on
by setting the GENPD_FLAG_ALWAYS_ON flag, to prevent the core PM Domain
code from considering it for power-off, and doing unnessary processing.
This also gets rid of a boot warning when the Clock Domain contains an
IRQ-safe device, e.g. on RZ/A1:
sh_mtu2 fcff0000.timer: PM domain cpg_clocks will not be powered off
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d21b8adbd4 ]
When cold-booting Asus X434DA, GPIO 7 is found to be already configured
as an interrupt, and the GPIO level is found to be in a state that
causes the interrupt to fire.
As soon as pinctrl-amd probes, this interrupt fires and invokes
amd_gpio_irq_handler(). The IRQ is acked, but no GPIO-IRQ handler was
invoked, so the GPIO level being unchanged just causes another interrupt
to fire again immediately after.
This results in an interrupt storm causing this platform to hang
during boot, right after pinctrl-amd is probed.
Detect this situation and disable the GPIO interrupt when this happens.
This enables the affected platform to boot as normal. GPIO 7 actually is
the I2C touchpad interrupt line, and later on, i2c-multitouch loads and
re-enables this interrupt when it is ready to handle it.
Instead of this approach, I considered disabling all GPIO interrupts at
probe time, however that seems a little risky, and I also confirmed that
Windows does not seem to have this behaviour: the same 41 GPIO IRQs are
enabled under both Linux and Windows, which is a far larger collection
than the GPIOs referenced by the DSDT on this platform.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Link: https://lore.kernel.org/r/20190814090540.7152-1-drake@endlessm.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a1af2afbd2 ]
Some, mostly Fermi, vbioses appear to have zero max voltage. That causes Nouveau to not parse voltage entries, thus users not being able to set higher clocks.
When changing this value Nvidia driver still appeared to ignore it, and I wasn't able to find out why, thus the code is ignoring the value if it is zero.
CC: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Mark Menzynski <mmenzyns@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 92c8026854 ]
vfio_pci_enable() saves the device's initial configuration information
with the intent that it is restored in vfio_pci_disable(). However,
the commit referenced in Fixes: below replaced the call to
__pci_reset_function_locked(), which is not wrapped in a state save
and restore, with pci_try_reset_function(), which overwrites the
restored device state with the current state before applying it to the
device. Reinstate use of __pci_reset_function_locked() to return to
the desired behavior.
Fixes: 890ed578df ("vfio-pci: Use pci "try" reset interface")
Signed-off-by: hexin <hexin15@baidu.com>
Signed-off-by: Liu Qi <liuqi16@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa06e3d60e ]
The EEH_DEV_NO_HANDLER flag is used by the EEH system to prevent the
use of driver callbacks in drivers that have been bound part way
through the recovery process. This is necessary to prevent later stage
handlers from being called when the earlier stage handlers haven't,
which can be confusing for drivers.
However, the flag is set for all devices that are added after boot
time and only cleared at the end of the EEH recovery process. This
results in hot plugged devices erroneously having the flag set during
the first recovery after they are added (causing their driver's
handlers to be incorrectly ignored).
To remedy this, clear the flag at the beginning of recovery
processing. The flag is still cleared at the end of recovery
processing, although it is no longer really necessary.
Also clear the flag during eeh_handle_special_event(), for the same
reasons.
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b8ca5629d27de74c957d4f4b250177d1b6fc4bbd.1565930772.git.sbobroff@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ccfb5bd71d ]
After a partition migration, pseries_devicetree_update() processes
changes to the device tree communicated from the platform to
Linux. This is a relatively heavyweight operation, with multiple
device tree searches, memory allocations, and conversations with
partition firmware.
There's a few levels of nested loops which are bounded only by
decisions made by the platform, outside of Linux's control, and indeed
we have seen RCU stalls on large systems while executing this call
graph. Use cond_resched() in these loops so that the cpu is yielded
when needed.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-4-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 38a0d0cdb4 ]
We see warnings such as:
kernel/futex.c: In function 'do_futex':
kernel/futex.c:1676:17: warning: 'oldval' may be used uninitialized in this function [-Wmaybe-uninitialized]
return oldval == cmparg;
^
kernel/futex.c:1651:6: note: 'oldval' was declared here
int oldval, ret;
^
This is because arch_futex_atomic_op_inuser() only sets *oval if ret
is 0 and GCC doesn't see that it will only use it when ret is 0.
Anyway, the non-zero ret path is an error path that won't suffer from
setting *oval, and as *oval is a local var in futex_atomic_op_inuser()
it will have no impact.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: reword change log slightly]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/86b72f0c134367b214910b27b9a6dd3321af93bb.1565774657.git.christophe.leroy@c-s.fr
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6717c01dd ]
The LPAR migration implementation and userspace-initiated cpu hotplug
can interleave their executions like so:
1. Set cpu 7 offline via sysfs.
2. Begin a partition migration, whose implementation requires the OS
to ensure all present cpus are online; cpu 7 is onlined:
rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up
This sets cpu 7 online in all respects except for the cpu's
corresponding struct device; dev->offline remains true.
3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is
already online and returns success. The driver core (device_online)
sets dev->offline = false.
4. The migration completes and restores cpu 7 to offline state:
rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down
This leaves cpu7 in a state where the driver core considers the cpu
device online, but in all other respects it is offline and
unused. Attempts to online the cpu via sysfs appear to succeed but the
driver core actually does not pass the request to the lower-level
cpuhp support code. This makes the cpu unusable until the cpu device
is manually set offline and then online again via sysfs.
Instead of directly calling cpu_up/cpu_down, the migration code should
use the higher-level device core APIs to maintain consistent state and
serialize operations.
Fixes: 120496ac2d ("powerpc: Bring all threads online prior to migration/hibernation")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-2-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3e0dbd7f7 ]
Currently, the xmon 'dx' command calls OPAL to dump the XIVE state in
the OPAL logs and also outputs some of the fields of the internal XIVE
structures in Linux. The OPAL calls can only be done on baremetal
(PowerNV) and they crash a pseries machine. Fix by checking the
hypervisor feature of the CPU.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190814154754.23682-2-clg@kaod.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf9ec1fc6d ]
A future patch is going to change semantics of clk_register() so that
clk_hw::init is guaranteed to be NULL after a clk is registered. Avoid
referencing this member here so that we don't run into NULL pointer
exceptions.
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lkml.kernel.org/r/20190731193517.237136-2-sboyd@kernel.org
[sboyd@kernel.org: Move name to after checking for error or NULL hw]
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c37c792dec ]
We allocate only the first level of multilevel TCE tables for KVM
already (alloc_userspace_copy==true), and the rest is allocated on demand.
This is not enabled though for bare metal.
This removes the KVM limitation (implicit, via the alloc_userspace_copy
parameter) and always allocates just the first level. The on-demand
allocation of missing levels is already implemented.
As from now on DMA map might happen with disabled interrupts, this
allocates TCEs with GFP_ATOMIC; otherwise lockdep reports errors 1].
In practice just a single page is allocated there so chances for failure
are quite low.
To save time when creating a new clean table, this skips non-allocated
indirect TCE entries in pnv_tce_free just like we already do in
the VFIO IOMMU TCE driver.
This changes the default level number from 1 to 2 to reduce the amount
of memory required for the default 32bit DMA window at the boot time.
The default window size is up to 2GB which requires 4MB of TCEs which is
unlikely to be used entirely or at all as most devices these days are
64bit capable so by switching to 2 levels by default we save 4032KB of
RAM per a device.
While at this, add __GFP_NOWARN to alloc_pages_node() as the userspace
can trigger this path via VFIO, see the failure and try creating a table
again with different parameters which might succeed.
[1]:
===
BUG: sleeping function called from invalid context at mm/page_alloc.c:4596
in_atomic(): 1, irqs_disabled(): 1, pid: 1038, name: scsi_eh_1
2 locks held by scsi_eh_1/1038:
#0: 000000005efd659a (&host->eh_mutex){+.+.}, at: ata_eh_acquire+0x34/0x80
#1: 0000000006cf56a6 (&(&host->lock)->rlock){....}, at: ata_exec_internal_sg+0xb0/0x5c0
irq event stamp: 500
hardirqs last enabled at (499): [<c000000000cb8a74>] _raw_spin_unlock_irqrestore+0x94/0xd0
hardirqs last disabled at (500): [<c000000000cb85c4>] _raw_spin_lock_irqsave+0x44/0x120
softirqs last enabled at (0): [<c000000000101120>] copy_process.isra.4.part.5+0x640/0x1a80
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 73 PID: 1038 Comm: scsi_eh_1 Not tainted 5.2.0-rc6-le_nv2_aikATfstn1-p1 #634
Call Trace:
[c000003d064cef50] [c000000000c8e6c4] dump_stack+0xe8/0x164 (unreliable)
[c000003d064cefa0] [c00000000014ed78] ___might_sleep+0x2f8/0x310
[c000003d064cf020] [c0000000003ca084] __alloc_pages_nodemask+0x2a4/0x1560
[c000003d064cf220] [c0000000000c2530] pnv_alloc_tce_level.isra.0+0x90/0x130
[c000003d064cf290] [c0000000000c2888] pnv_tce+0x128/0x3b0
[c000003d064cf360] [c0000000000c2c00] pnv_tce_build+0xb0/0xf0
[c000003d064cf3c0] [c0000000000bbd9c] pnv_ioda2_tce_build+0x3c/0xb0
[c000003d064cf400] [c00000000004cfe0] ppc_iommu_map_sg+0x210/0x550
[c000003d064cf510] [c00000000004b7a4] dma_iommu_map_sg+0x74/0xb0
[c000003d064cf530] [c000000000863944] ata_qc_issue+0x134/0x470
[c000003d064cf5b0] [c000000000863ec4] ata_exec_internal_sg+0x244/0x5c0
[c000003d064cf700] [c0000000008642d0] ata_exec_internal+0x90/0xe0
[c000003d064cf780] [c0000000008650ac] ata_dev_read_id+0x2ec/0x640
[c000003d064cf8d0] [c000000000878e28] ata_eh_recover+0x948/0x16d0
[c000003d064cfa10] [c00000000087d760] sata_pmp_error_handler+0x480/0xbf0
[c000003d064cfbc0] [c000000000884624] ahci_error_handler+0x74/0xe0
[c000003d064cfbf0] [c000000000879fa8] ata_scsi_port_error_handler+0x2d8/0x7c0
[c000003d064cfca0] [c00000000087a544] ata_scsi_error+0xb4/0x100
[c000003d064cfd00] [c000000000802450] scsi_error_handler+0x120/0x510
[c000003d064cfdb0] [c000000000140c48] kthread+0x1b8/0x1c0
[c000003d064cfe20] [c00000000000bd8c] ret_from_kernel_thread+0x5c/0x70
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
irq event stamp: 2305
========================================================
hardirqs last enabled at (2305): [<c00000000000e4c8>] fast_exc_return_irq+0x28/0x34
hardirqs last disabled at (2303): [<c000000000cb9fd0>] __do_softirq+0x4a0/0x654
WARNING: possible irq lock inversion dependency detected
5.2.0-rc6-le_nv2_aikATfstn1-p1 #634 Tainted: G W
softirqs last enabled at (2304): [<c000000000cba054>] __do_softirq+0x524/0x654
softirqs last disabled at (2297): [<c00000000010f278>] irq_exit+0x128/0x180
--------------------------------------------------------
swapper/0/0 just changed the state of lock:
0000000006cf56a6 (&(&host->lock)->rlock){-...}, at: ahci_single_level_irq_intr+0xac/0x120
but this lock took another, HARDIRQ-unsafe lock in the past:
(fs_reclaim){+.+.}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
local_irq_disable();
lock(&(&host->lock)->rlock);
lock(fs_reclaim);
<Interrupt>
lock(&(&host->lock)->rlock);
*** DEADLOCK ***
no locks held by swapper/0/0.
the shortest dependencies between 2nd lock and 1st lock:
-> (fs_reclaim){+.+.} ops: 167579 {
HARDIRQ-ON-W at:
lock_acquire+0xf8/0x2a0
fs_reclaim_acquire.part.23+0x44/0x60
kmem_cache_alloc_node_trace+0x80/0x590
alloc_desc+0x64/0x270
__irq_alloc_descs+0x2e4/0x3a0
irq_domain_alloc_descs+0xb0/0x150
irq_create_mapping+0x168/0x2c0
xics_smp_probe+0x2c/0x98
pnv_smp_probe+0x40/0x9c
smp_prepare_cpus+0x524/0x6c4
kernel_init_freeable+0x1b4/0x650
kernel_init+0x2c/0x148
ret_from_kernel_thread+0x5c/0x70
SOFTIRQ-ON-W at:
lock_acquire+0xf8/0x2a0
fs_reclaim_acquire.part.23+0x44/0x60
kmem_cache_alloc_node_trace+0x80/0x590
alloc_desc+0x64/0x270
__irq_alloc_descs+0x2e4/0x3a0
irq_domain_alloc_descs+0xb0/0x150
irq_create_mapping+0x168/0x2c0
xics_smp_probe+0x2c/0x98
pnv_smp_probe+0x40/0x9c
smp_prepare_cpus+0x524/0x6c4
kernel_init_freeable+0x1b4/0x650
kernel_init+0x2c/0x148
ret_from_kernel_thread+0x5c/0x70
INITIAL USE at:
lock_acquire+0xf8/0x2a0
fs_reclaim_acquire.part.23+0x44/0x60
kmem_cache_alloc_node_trace+0x80/0x590
alloc_desc+0x64/0x270
__irq_alloc_descs+0x2e4/0x3a0
irq_domain_alloc_descs+0xb0/0x150
irq_create_mapping+0x168/0x2c0
xics_smp_probe+0x2c/0x98
pnv_smp_probe+0x40/0x9c
smp_prepare_cpus+0x524/0x6c4
kernel_init_freeable+0x1b4/0x650
kernel_init+0x2c/0x148
ret_from_kernel_thread+0x5c/0x70
}
===
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190718051139.74787-4-aik@ozlabs.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5382701c3 ]
[Why]
The vm config will be clear to 0 when system enter S4. It will
cause hubbub didn't know how to fetch data when system resume.
The flip always pending because earliest_inuse_address and
request_address are different.
[How]
Reprogram VM config when system resume
Signed-off-by: Lewis Huang <Lewis.Huang@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1cbcfc9751 ]
[Why]
When endpoint is at the boundary of a region, such as at 2^0=1
we find that the last segment has a sharp slope and some points
are clipped at the top.
[How]
If end point is 1, which is exactly at the 2^0 region boundary, we
need to program an additional region beyond this point.
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 720099603d ]
The MMC2 clock slices are currently not defined in V3s CCU driver, which
makes MMC2 not working.
Fix this issue.
Fixes: d0f11d14b0 ("clk: sunxi-ng: add support for V3s CCU")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 340ff31ab0 ]
ipmi_thread() uses back-to-back schedule() to poll for command
completion which, on some machines, can push up CPU consumption and
heavily tax the scheduler locks leading to noticeable overall
performance degradation.
This was originally added so firmware updates through IPMI would
complete in a timely manner. But we can't kill the scheduler
locks for that one use case.
Instead, only run schedule() continuously in maintenance mode,
where firmware updates should run.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0df3e42167 ]
When building with -Wsometimes-uninitialized, clang warns:
drivers/pci/hotplug/rpaphp_core.c:243:14: warning: variable 'fndit' is
used uninitialized whenever 'for' loop exits because its condition is
false [-Wsometimes-uninitialized]
for (j = 0; j < entries; j++) {
^~~~~~~~~~~
drivers/pci/hotplug/rpaphp_core.c:256:6: note: uninitialized use occurs
here
if (fndit)
^~~~~
drivers/pci/hotplug/rpaphp_core.c:243:14: note: remove the condition if
it is always true
for (j = 0; j < entries; j++) {
^~~~~~~~~~~
drivers/pci/hotplug/rpaphp_core.c:233:14: note: initialize the variable
'fndit' to silence this warning
int j, fndit;
^
= 0
fndit is only used to gate a sprintf call, which can be moved into the
loop to simplify the code and eliminate the local variable, which will
fix this warning.
Fixes: 2fcf3ae508 ("hotplug/drc-info: Add code to search ibm,drc-info property")
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/504
Link: https://lore.kernel.org/r/20190603221157.58502-1-natechancellor@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f3eb9b8f67 ]
In radeon_connector_set_property(), there is an if statement on line 743
to check whether connector->encoder is NULL:
if (connector->encoder)
When connector->encoder is NULL, it is used on line 755:
if (connector->encoder->crtc)
Thus, a possible null-pointer dereference may occur.
To fix this bug, connector->encoder is checked before being used.
This bug is found by a static analysis tool STCheck written by us.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6f7fe9a93e ]
During kexec some adapters hit an EEH since they are not properly
shut down in the radeon_pci_shutdown() function. Adding
radeon_suspend_kms() fixes this issue.
Signed-off-by: KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8c6bfc612 ]
The horizontal blanking periods are too short, as the values are
specified for a single LVDS channel. Since this panel is dual LVDS
they need to be doubled. With this change the panel reaches its
nominal vrefresh rate of 60Fps, instead of the 64Fps with the
current wrong blanking.
Philipp Zabel added:
The datasheet specifies 960 active clocks + 40/128/160 clocks blanking
on each of the two LVDS channels (min/typical/max), so doubled this is
now correct.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1562764060.23869.12.camel@pengutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit bb264220d9 upstream.
Laptops with AMD APU doesn't restore display backlight brightness after
system resume.
This issue started when DC was introduced.
Let's use BL_CORE_SUSPENDRESUME so the backlight core calls
update_status callback after system resume to restore the backlight
level.
Tested on Dell Inspiron 3180 (Stoney Ridge) and Dell Latitude 5495
(Raven Ridge).
Cc: <stable@vger.kernel.org>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 76e43c8cca ]
When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.
This may have to wait for fuse_iqueue::waitq.lock to be released by one
of many places that take it with IRQs enabled. Since the IRQ handler
may take kioctx::ctx_lock, lockdep reports that a deadlock is possible.
Fix it by protecting the state of struct fuse_iqueue with a separate
spinlock, and only accessing fuse_iqueue::waitq using the versions of
the waitqueue functions which do IRQ-safe locking internally.
Reproducer:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/aio_abi.h>
int main()
{
char opts[128];
int fd = open("/dev/fuse", O_RDWR);
aio_context_t ctx = 0;
struct iocb cb = { .aio_lio_opcode = IOCB_CMD_POLL, .aio_fildes = fd };
struct iocb *cbp = &cb;
sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd);
mkdir("mnt", 0700);
mount("foo", "mnt", "fuse", 0, opts);
syscall(__NR_io_setup, 1, &ctx);
syscall(__NR_io_submit, ctx, 1, &cbp);
}
Beginning of lockdep output:
=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.3.0-rc5 #9 Not tainted
-----------------------------------------------------
syz_fuse/135 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
000000003590ceda (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
000000003590ceda (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline]
000000003590ceda (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825
and this task is already holding:
0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline]
0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825
which would create a new lock dependency:
(&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}
but this new dependency connects a SOFTIRQ-irq-safe lock:
(&(&ctx->ctx_lock)->rlock){..-.}
[...]
Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com
Reported-by: syzbot+d86c4426a01f60feddc7@syzkaller.appspotmail.com
Fixes: bfe4037e72 ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.19+
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c84a1372df ]
If the drives in a RAID0 are not all the same size, the array is
divided into zones.
The first zone covers all drives, to the size of the smallest.
The second zone covers all drives larger than the smallest, up to
the size of the second smallest - etc.
A change in Linux 3.14 unintentionally changed the layout for the
second and subsequent zones. All the correct data is still stored, but
each chunk may be assigned to a different device than in pre-3.14 kernels.
This can lead to data corruption.
It is not possible to determine what layout to use - it depends which
kernel the data was written by.
So we add a module parameter to allow the old (0) or new (1) layout to be
specified, and refused to assemble an affected array if that parameter is
not set.
Fixes: 20d0189b10 ("block: Introduce new bio_split()")
cc: stable@vger.kernel.org (3.14+)
Acked-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a016e2794f upstream.
There may be situations when a server negotiates SMB 2.1
protocol version or higher but responds to a CREATE request
with an oplock rather than a lease.
Currently the client doesn't handle such a case correctly:
when another CREATE comes in the server sends an oplock
break to the initial CREATE and the client doesn't send
an ack back due to a wrong caching level being set (READ
instead of RWH). Missing an oplock break ack makes the
server wait until the break times out which dramatically
increases the latency of the second CREATE.
Fix this by properly detecting oplocks when using SMB 2.1
protocol version and higher.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a71e2ac1f3 upstream.
The NACKF flag should be cleared in INTRIICNAKI interrupt processing as
description in HW manual.
This issue shows up quickly when PREEMPT_RT is applied and a device is
probed that is not plugged in (like a touchscreen controller). The result
is endless interrupts that halt system boot.
Fixes: 310c18a414 ("i2c: riic: add driver")
Cc: stable@vger.kernel.org
Reported-by: Chien Nguyen <chien.nguyen.eb@rvc.renesas.com>
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 78887832e7 upstream.
add_early_randomness() is called by hwrng_register() when the
hardware is added. If this hardware and its module are present
at boot, and if there is no data available the boot hangs until
data are available and can't be interrupted.
For instance, in the case of virtio-rng, in some cases the host can be
not able to provide enough entropy for all the guests.
We can have two easy ways to reproduce the problem but they rely on
misconfiguration of the hypervisor or the egd daemon:
- if virtio-rng device is configured to connect to the egd daemon of the
host but when the virtio-rng driver asks for data the daemon is not
connected,
- if virtio-rng device is configured to connect to the egd daemon of the
host but the egd daemon doesn't provide data.
The guest kernel will hang at boot until the virtio-rng driver provides
enough data.
To avoid that, call rng_get_data() in non-blocking mode (wait=0)
from add_early_randomness().
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Fixes: d9e7972619 ("hwrng: add randomness to system from rng...")
Cc: <stable@vger.kernel.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6565c18209 upstream.
Quoted from
commit 3da40c7b08 ("ext4: only call ext4_truncate when size <= isize")
" At LSF we decided that if we truncate up from isize we shouldn't trim
fallocated blocks that were fallocated with KEEP_SIZE and are past the
new i_size. This patch fixes ext4 to do this. "
And generic/092 of fstest have covered this case for long time, however
is_quota_modification() didn't adjust based on that rule, so that in
below condition, we will lose to quota block change:
- fallocate blocks beyond EOF
- remount
- truncate(file_path, file_size)
Fix it.
Link: https://lore.kernel.org/r/20190911093650.35329-1-yuchao0@huawei.com
Fixes: 3da40c7b08 ("ext4: only call ext4_truncate when size <= isize")
CC: stable@vger.kernel.org
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1e8220bd3 upstream.
If a program attempts to punch a hole on an inline data file, we need
to convert it to a normal file first.
This was detected using ext4/032 using the adv configuration. Simple
reproducer:
mke2fs -Fq -t ext4 -O inline_data /dev/vdc
mount /vdc
echo "" > /vdc/testfile
xfs_io -c 'truncate 33554432' /vdc/testfile
xfs_io -c 'fpunch 0 1048576' /vdc/testfile
umount /vdc
e2fsck -fy /dev/vdc
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3d550c2c4 upstream.
Really enable warning when CONFIG_EXT4_DEBUG is set and fix missing
first argument. This was introduced in commit ff95ec22cd ("ext4:
add warning to ext4_convert_unwritten_extents_endio") and splitting
extents inside endio would trigger it.
Fixes: ff95ec22cd ("ext4: add warning to ext4_convert_unwritten_extents_endio")
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8619e5bdee upstream.
syzbot found that a thread can stall for minutes inside read_mem() or
write_mem() after that thread was killed by SIGKILL [1]. Reading from
iomem areas of /dev/mem can be slow, depending on the hardware.
While reading 2GB at one read() is legal, delaying termination of killed
thread for minutes is bad. Thus, allow reading/writing /dev/mem and
/dev/kmem to be preemptible and killable.
[ 1335.912419][T20577] read_mem: sz=4096 count=2134565632
[ 1335.943194][T20577] read_mem: sz=4096 count=2134561536
[ 1335.978280][T20577] read_mem: sz=4096 count=2134557440
[ 1336.011147][T20577] read_mem: sz=4096 count=2134553344
[ 1336.041897][T20577] read_mem: sz=4096 count=2134549248
Theoretically, reading/writing /dev/mem and /dev/kmem can become
"interruptible". But this patch chose "killable". Future patch will make
them "interruptible" so that we can revert to "killable" if some program
regressed.
[1] https://syzkaller.appspot.com/bug?id=a0e3436829698d5824231251fad9d8e998f94f5e
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@vger.kernel.org>
Reported-by: syzbot <syzbot+8ab2d0f39fb79fe6ca40@syzkaller.appspotmail.com>
Link: https://lore.kernel.org/r/1566825205-10703-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1d3ad84ea upstream.
Currently frame registrations are not purged, even when changing the
interface type. This can lead to potentially weird situations where
frames possibly not allowed on a given interface type remain registered
due to the type switching happening after registration.
The kernel currently relies on userspace apps to actually purge the
registrations themselves, this is not something that the kernel should
rely on.
Add a call to cfg80211_mlme_purge_registrations() to forcefully remove
any registrations left over prior to switching the iftype.
Cc: stable@vger.kernel.org
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Link: https://lore.kernel.org/r/20190828211110.15005-1-denkenz@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 480523feae upstream.
Since commit 4ad23a9764 ("MD: use per-cpu counter for
writes_pending"), set_in_sync() is substantially more expensive: it
can wait for a full RCU grace period which can be 10s of milliseconds.
So we should only call it when the cost is justified.
md_check_recovery() currently calls set_in_sync() every time it finds
anything to do (on non-external active arrays). For an array
performing resync or recovery, this will be quite often.
Each call will introduce a delay to the md thread, which can noticeable
affect IO submission latency.
In md_check_recovery() we only need to call set_in_sync() if
'safemode' was non-zero at entry, meaning that there has been not
recent IO. So we save this "safemode was nonzero" state, and only
call set_in_sync() if it was non-zero.
This measurably reduces mean and maximum IO submission latency during
resync/recovery.
Reported-and-tested-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Fixes: 4ad23a9764 ("MD: use per-cpu counter for writes_pending")
Cc: stable@vger.kernel.org (v4.12+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d4b45d6af upstream.
Until revalidate_disk() has completed, the size of a new md array will
appear to be zero.
So we shouldn't report, through array_state, that the array is active
until that time.
udev rules check array_state to see if the array is ready. As soon as
it appear to be zero, fsck can be run. If it find the size to be
zero, it will fail.
So add a new flag to provide an interlock between do_md_run() and
array_state_show(). This flag is set while do_md_run() is active and
it prevents array_state_show() from reporting that the array is
active.
Before do_md_run() is called, ->pers will be NULL so array is
definitely not active.
After do_md_run() is called, revalidate_disk() will have run and the
array will be completely ready.
We also move various sysfs_notify*() calls out of md_run() into
do_md_run() after MD_NOT_READY is cleared. This ensure the
information is ready before the notification is sent.
Prior to v4.12, array_state_show() was called with the
mddev->reconfig_mutex held, which provided exclusion with do_md_run().
Note that MD_NOT_READY cleared twice. This is deliberate to cover
both success and error paths with minimal noise.
Fixes: b7b17c9b67 ("md: remove mddev_lock() from md_attr_show()")
Cc: stable@vger.kernel.org (v4.12++)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 143f6e733b upstream.
7471fb77ce ("md/raid6: Fix anomily when recovering a single device in
RAID6.") avoids rereading P when it can be computed from other members.
However, this misses the chance to re-write the right data to P. This
patch sets R5_ReadError if the re-read fails.
Also, when re-read is skipped, we also missed the chance to reset
rdev->read_errors to 0. It can fail the disk when there are many read
errors on P member disk (other disks don't have read error)
V2: upper layer read request don't read parity/Q data. So there is no
need to consider such situation.
This is Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 7471fb77ce ("md/raid6: Fix anomily when recovering a single device in RAID6.")
Cc: <stable@vger.kernel.org> #4.4+
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>