[ Upstream commit f5200c1424 ]
Hamming ECC doesn't cover the OOB data, so reading or writing OOB shall
always be done without ECC enabled.
This is a problem when adding JFFS2 cleanmarkers to erased blocks. If JFFS2
clenmarkers are added to the OOB with ECC enabled, OOB bytes will be changed
from ff ff ff to 00 00 00, reporting incorrect ECC errors.
Fixes: 27c5b17cd1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210224080210.23686-1-noltari@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e41a962f82 ]
There is a upstream commit cffa4b2122f5("regmap:debugfs:
Fix a memory leak when calling regmap_attach_dev") that
adds a if condition when create name for debugfs_name.
With below function invoking logical, debugfs_name is
freed in regmap_debugfs_exit(), but it is not created again
because of the if condition introduced by above commit.
regmap_reinit_cache()
regmap_debugfs_exit()
...
regmap_debugfs_init()
So, set debugfs_name to NULL after it is freed.
Fixes: cffa4b2122 ("regmap: debugfs: Fix a memory leak when calling regmap_attach_dev")
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Link: https://lore.kernel.org/r/20210226021737.7690-1-Meng.Li@windriver.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19c234a14e ]
While interpreting CC_STATUS, ROLE_CONTROL has to be read to make
sure that CC1/CC2 is not forced presenting Rp/Rd.
>From the TCPCI spec:
4.4.5.2 ROLE_CONTROL (Normative):
The TCPM shall write B6 (DRP) = 0b and B3..0 (CC1/CC2) if it wishes
to control the Rp/Rd directly instead of having the TCPC perform
DRP toggling autonomously. When controlling Rp/Rd directly, the
TCPM writes to B3..0 (CC1/CC2) each time it wishes to change the
CC1/CC2 values. This control is used for TCPM-TCPC implementing
Source or Sink only as well as when a connection has been detected
via DRP toggling but the TCPM wishes to attempt Try.Src or Try.Snk.
Table 4-22. CC_STATUS Register Definition:
If (ROLE_CONTROL.CC1 = Rd) or ConnectResult=1)
00b: SNK.Open (Below maximum vRa)
01b: SNK.Default (Above minimum vRd-Connect)
10b: SNK.Power1.5 (Above minimum vRd-Connect) Detects Rp-1.5A
11b: SNK.Power3.0 (Above minimum vRd-Connect) Detects Rp-3.0A
If (ROLE_CONTROL.CC2=Rd) or (ConnectResult=1)
00b: SNK.Open (Below maximum vRa)
01b: SNK.Default (Above minimum vRd-Connect)
10b: SNK.Power1.5 (Above minimum vRd-Connect) Detects Rp 1.5A
11b: SNK.Power3.0 (Above minimum vRd-Connect) Detects Rp 3.0A
Fixes: 74e656d6b0 ("staging: typec: Type-C Port Controller Interface driver (tcpci)")
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
Link: https://lore.kernel.org/r/20210304070931.1947316-1-badhri@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3db1d52466 ]
In "tx_empty", we should poll TC bit in both DMA and PIO modes (instead of
TXE) to check transmission data register has been transmitted independently
of the FIFO mode. TC indicates that both transmit register and shift
register are empty. When shift register is empty, tx_empty should return
TIOCSER_TEMT instead of TC value.
Cleans the USART_CR_TC TCCF register define (transmission complete clear
flag) as it is duplicate of USART_ICR_TCCF.
Fixes: 48a6092fb4 ("serial: stm32-usart: Add STM32 USART Driver")
Signed-off-by: Erwan Le Ray <erwan.leray@foss.st.com>
Link: https://lore.kernel.org/r/20210304162308.8984-13-erwan.leray@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f264c6f6ae ]
Incorrect characters are observed on console during boot. This issue occurs
when init/main.c is modifying termios settings to open /dev/console on the
rootfs.
This patch adds a waiting loop in set_termios to wait for TX shift register
empty (and TX FIFO if any) before stopping serial port.
Fixes: 48a6092fb4 ("serial: stm32-usart: Add STM32 USART Driver")
Signed-off-by: Erwan Le Ray <erwan.leray@foss.st.com>
Link: https://lore.kernel.org/r/20210304162308.8984-4-erwan.leray@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8987efbb17 ]
The Maxim PMIC datasheets describe the interrupt line as active low
with a requirement of acknowledge from the CPU. Without specifying the
interrupt type in Devicetree, kernel might apply some fixed
configuration, not necessarily working for this hardware.
Additionally, the interrupt line is shared so using level sensitive
interrupt is here especially important to avoid races.
Fixes: c61248afa8 ("ARM: dts: Add max77686 RTC interrupt to cros5250-common")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201210212534.216197-9-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f6368c6056 ]
The Maxim PMIC datasheets describe the interrupt line as active low
with a requirement of acknowledge from the CPU. Without specifying the
interrupt type in Devicetree, kernel might apply some fixed
configuration, not necessarily working for this hardware.
Additionally, the interrupt line is shared so using level sensitive
interrupt is here especially important to avoid races.
Fixes: 47580e8d94 ("ARM: dts: Specify MAX77686 pmic interrupt for exynos5250-smdk5250")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201210212534.216197-8-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6503c568e9 ]
The Maxim PMIC datasheets describe the interrupt line as active low
with a requirement of acknowledge from the CPU. Without specifying the
interrupt type in Devicetree, kernel might apply some fixed
configuration, not necessarily working for this hardware.
Additionally, the interrupt line is shared so using level sensitive
interrupt is here especially important to avoid races.
Fixes: eea6653aae ("ARM: dts: Enable PMIC interrupts for exynos4412-odroid-common")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201210212534.216197-6-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e52dcd6e70 ]
The Maxim PMIC datasheets describe the interrupt line as active low
with a requirement of acknowledge from the CPU. Without specifying the
interrupt type in Devicetree, kernel might apply some fixed
configuration, not necessarily working for this hardware.
Additionally, the interrupt line is shared so using level sensitive
interrupt is here especially important to avoid races.
Fixes: 15dfdfad2d ("ARM: dts: Add basic dts for Exynos4412-based Trats 2 board")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201210212534.216197-5-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 15107e443a ]
The Maxim MUIC datasheets describe the interrupt line as active low
with a requirement of acknowledge from the CPU. Without specifying the
interrupt type in Devicetree, kernel might apply some fixed
configuration, not necessarily working for this hardware.
Additionally, the interrupt line is shared so using level sensitive
interrupt is here especially important to avoid races.
Fixes: 7eec126675 ("ARM: dts: Add Maxim 77693 PMIC to exynos4412-trats2")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201210212534.216197-4-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a45f33bd3 ]
The Maxim fuel gauge datasheets describe the interrupt line as active
low with a requirement of acknowledge from the CPU. The falling edge
interrupt will mostly work but it's not correct.
Fixes: e8614292cd ("ARM: dts: Add Maxim 77693 fuel gauge node for exynos4412-trats2")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201210212534.216197-3-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e004c3e67b ]
Currently the array gpmc_cs is indexed by cs before it cs is range checked
and the pointer read from this out-of-index read is dereferenced. Fix this
by performing the range check on cs before the read and the following
pointer dereference.
Addresses-Coverity: ("Negative array index read")
Fixes: 9ed7a776eb ("ARM: OMAP2+: Fix support for multiple devices on a GPMC chip select")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210223193821.17232-1-colin.king@canonical.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 50a318cc9b upstream.
The commit d3cb25a121 ("usb: gadget: udc: fix spin_lock in pch_udc")
obviously was not thought through and had made the situation even worse
than it was before. Two changes after almost reverted it. but a few
leftovers have been left as it. With this revert d3cb25a121 completely.
While at it, narrow down the scope of unlocked section to prevent
potential race when prot_stall is assigned.
Fixes: d3cb25a121 ("usb: gadget: udc: fix spin_lock in pch_udc")
Fixes: 9903b6bedd ("usb: gadget: pch-udc: fix lock")
Fixes: 1d23d16a88 ("usb: gadget: pch_udc: reorder spin_[un]lock to avoid deadlock")
Cc: Iago Abal <mail@iagoabal.eu>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210323153626.54908-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b279bbfd2 upstream.
Smatch complains about missing that the ovl_override_creds() doesn't
have a matching revert_creds() if the dentry is disconnected. Fix this
by moving the ovl_override_creds() until after the disconnected check.
Fixes: aa3ff3c152 ("ovl: copy up of disconnected dentries")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d3c4c7938 upstream.
Abort the walk of coalesced MMIO zones if kvm_io_bus_unregister_dev()
fails to allocate memory for the new instance of the bus. If it can't
instantiate a new bus, unregister_dev() destroys all devices _except_ the
target device. But, it doesn't tell the caller that it obliterated the
bus and invoked the destructor for all devices that were on the bus. In
the coalesced MMIO case, this can result in a deleted list entry
dereference due to attempting to continue iterating on coalesced_zones
after future entries (in the walk) have been deleted.
Opportunistically add curly braces to the for-loop, which encompasses
many lines but sneaks by without braces due to the guts being a single
if statement.
Fixes: f65886606c ("KVM: fix memory leak in kvm_io_bus_unregister_dev()")
Cc: stable@vger.kernel.org
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210412222050.876100-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee050a5775 upstream.
Drop bits 63:32 of the VMCS field encoding when checking for a nested
VM-Exit on VMREAD/VMWRITE in !64-bit mode. VMREAD and VMWRITE always
use 32-bit operands outside of 64-bit mode.
The actual emulation of VMREAD/VMWRITE does the right thing, this bug is
purely limited to incorrectly causing a nested VM-Exit if a GPR happens
to have bits 63:32 set outside of 64-bit mode.
Fixes: a7cde481b6 ("KVM: nVMX: Do not forward VMREAD/VMWRITE VMExits to L1 if required so by vmcs12 vmread/vmwrite bitmaps")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210422022128.3464144-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b208108638 upstream.
The PoP documents:
134: The vector packed decimal facility is installed in the
z/Architecture architectural mode. When bit 134 is
one, bit 129 is also one.
135: The vector enhancements facility 1 is installed in
the z/Architecture architectural mode. When bit 135
is one, bit 129 is also one.
Looks like we confuse the vector enhancements facility 1 ("EXT") with the
Vector packed decimal facility ("BCD"). Let's fix the facility checks.
Detected while working on QEMU/tcg z14 support and only unlocking
the vector enhancements facility 1, but not the vector packed decimal
facility.
Fixes: 2583b848ca ("s390: report new vector facilities")
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20210503121244.25232-1-david@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44bada2821 upstream.
store_regs_fmt2() has an ordering problem: first the guarded storage
facility is enabled on the local cpu, then preemption disabled, and
then the STGSC (store guarded storage controls) instruction is
executed.
If the process gets scheduled away between enabling the guarded
storage facility and before preemption is disabled, this might lead to
a special operation exception and therefore kernel crash as soon as
the process is scheduled back and the STGSC instruction is executed.
Fixes: 4e0b1ab72b ("KVM: s390: gs support for kvm guests")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: <stable@vger.kernel.org> # 4.12
Link: https://lore.kernel.org/r/20210415080127.1061275-1-hca@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 266fd994b2 upstream.
In 9bbb94e57d ("ALSA: hda/realtek: fix static noise on ALC285 Lenovo
laptops") an existing Lenovo quirk was made more generic by removing a
0x12 pin requirement from the entry. This made the second chance table
Thinkpad jack entry unreachable as the pin configurations became
identical.
Revert the 0x12 pin requirement removal and move Thinkpad jack pin quirk
back to the primary pin table as they can co-exist when more specific
configurations come first.
Add a more targeted pin quirk for Lenovo devices that have 0x12 as
0x40000000.
Tested on Yoga 6 (AMD) laptop.
[ Corrected the commit ID -- tiwai ]
Fixes: 9bbb94e57d ("ALSA: hda/realtek: fix static noise on ALC285 Lenovo laptops")
Signed-off-by: Sami Loone <sami@loone.fi>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/YI0oefvTYn8URYDb@yoga
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit defce244b0 upstream.
The quirk entry for Uniwill ECS M31EI is with the PCI SSID device 0,
which means matching with all. That is, it's essentially equivalent
with SND_PCI_QUIRK_VENDOR(0x1584), which also matches with the
previous entry for Haier W18 applying the very same quirk.
Let's unify them with the single vendor-quirk entry.
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210428112704.23967-13-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45461e3b55 upstream.
Just re-order the alc269_fixup_tbl[] entries for HP devices for
avoiding the oversight of the duplicated or unapplied item in future.
No functional changes.
Formerly, some entries were grouped for the actual codec, but this
doesn't seem reasonable to keep in that way. So now we simply keep
the PCI SSID order for the whole.
Also Cc-to-stable for the further patch applications.
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210428112704.23967-5-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8dbc2ccac5 upstream.
Currently the ioctl command RADEON_INFO_SI_BACKEND_ENABLED_MASK can
copy back uninitialised data in value_tmp that pointer *value points
to. This can occur when rdev->family is less than CHIP_BONAIRE and
less than CHIP_TAHITI. Fix this by adding in a missing -EINVAL
so that no invalid value is copied back to userspace.
Addresses-Coverity: ("Uninitialized scalar variable)
Cc: stable@vger.kernel.org # 3.13+
Fixes: 439a1cfffe ("drm/radeon: expose render backend mask to the userspace")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ff25985ea upstream.
Using a kernel with the Undefined Behaviour Sanity Checker (UBSAN) enabled, the
following array overrun is logged:
================================================================================
UBSAN: array-index-out-of-bounds in /home/finger/wireless-drivers-next/drivers/net/wireless/realtek/rtw88/phy.c:1789:34
index 5 is out of range for type 'u8 [5]'
CPU: 2 PID: 84 Comm: kworker/u16:3 Tainted: G O 5.12.0-rc5-00086-gd88bba47038e-dirty #651
Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.50 09/29/2014
Workqueue: phy0 ieee80211_scan_work [mac80211]
Call Trace:
dump_stack+0x64/0x7c
ubsan_epilogue+0x5/0x40
__ubsan_handle_out_of_bounds.cold+0x43/0x48
rtw_get_tx_power_params+0x83a/drivers/net/wireless/realtek/rtw88/0xad0 [rtw_core]
? rtw_pci_read16+0x20/0x20 [rtw_pci]
? check_hw_ready+0x50/0x90 [rtw_core]
rtw_phy_get_tx_power_index+0x4d/0xd0 [rtw_core]
rtw_phy_set_tx_power_level+0xee/0x1b0 [rtw_core]
rtw_set_channel+0xab/0x110 [rtw_core]
rtw_ops_config+0x87/0xc0 [rtw_core]
ieee80211_hw_config+0x9d/0x130 [mac80211]
ieee80211_scan_state_set_channel+0x81/0x170 [mac80211]
ieee80211_scan_work+0x19f/0x2a0 [mac80211]
process_one_work+0x1dd/0x3a0
worker_thread+0x49/0x330
? rescuer_thread+0x3a0/0x3a0
kthread+0x134/0x150
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x22/0x30
================================================================================
The statement where an array is being overrun is shown in the following snippet:
if (rate <= DESC_RATE11M)
tx_power = pwr_idx_2g->cck_base[group];
else
====> tx_power = pwr_idx_2g->bw40_base[group];
The associated arrays are defined in main.h as follows:
struct rtw_2g_txpwr_idx {
u8 cck_base[6];
u8 bw40_base[5];
struct rtw_2g_1s_pwr_idx_diff ht_1s_diff;
struct rtw_2g_ns_pwr_idx_diff ht_2s_diff;
struct rtw_2g_ns_pwr_idx_diff ht_3s_diff;
struct rtw_2g_ns_pwr_idx_diff ht_4s_diff;
};
The problem arises because the value of group is 5 for channel 14. The trivial
increase in the dimension of bw40_base fails as this struct must match the layout of
efuse. The fix is to add the rate as an argument to rtw_get_channel_group() and set
the group for channel 14 to 4 if rate <= DESC_RATE11M.
This patch fixes commit fa6dfe6bff ("rtw88: resolve order of tx power setting routines")
Fixes: fa6dfe6bff ("rtw88: resolve order of tx power setting routines")
Reported-by: Богдан Пилипенко <bogdan.pylypenko107@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210401192717.28927-1-Larry.Finger@lwfinger.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7abfabaf5f upstream.
Reading /proc/mdstat with a read buffer size that would not
fit the unused status line in the first read will skip this
line from the output.
So 'dd if=/proc/mdstat bs=64 2>/dev/null' will not print something
like: unused devices: <none>
Don't return NULL immediately in start() for v=2 but call
show() once to print the status line also for multiple reads.
Cc: stable@vger.kernel.org
Fixes: 1f4aace60b ("fs/seq_file.c: simplify seq_file iteration code and interface")
Signed-off-by: Jan Glauber <jglauber@digitalocean.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a4db2a603 upstream.
commit d3374825ce ("md: make devices disappear when they are no longer
needed.") introduced protection between mddev creating & removing. The
md_open shouldn't create mddev when all_mddevs list doesn't contain
mddev. With currently code logic, there will be very easy to trigger
soft lockup in non-preempt env.
This patch changes md_open returning from -ERESTARTSYS to -EBUSY, which
will break the infinitely retry when md_open enter racing area.
This patch is partly fix soft lockup issue, full fix needs mddev_find
is split into two functions: mddev_find & mddev_find_or_alloc. And
md_open should call new mddev_find (it only does searching job).
For more detail, please refer with Christoph's "split mddev_find" patch
in later commits.
commit 65aa97c4d2 upstream.
Split mddev_find into a simple mddev_find that just finds an existing
mddev by the unit number, and a more complicated mddev_find that deals
with find or allocating a mddev.
This turns out to fix this bug reported by Zhao Heming.
----------------------------- snip ------------------------------
commit d3374825ce ("md: make devices disappear when they are no longer
needed.") introduced protection between mddev creating & removing. The
md_open shouldn't create mddev when all_mddevs list doesn't contain
mddev. With currently code logic, there will be very easy to trigger
soft lockup in non-preempt env.
commit 404a8ef512 upstream.
NULL pointer dereference was observed in super_written() when it tries
to access the mddev structure.
[The below stack trace is from an older kernel, but the problem described
in this patch applies to the mainline kernel.]
[ 1194.474861] task: ffff8fdd20858000 task.stack: ffffb99d40790000
[ 1194.488000] RIP: 0010:super_written+0x29/0xe1
[ 1194.499688] RSP: 0018:ffff8ffb7fcc3c78 EFLAGS: 00010046
[ 1194.512477] RAX: 0000000000000000 RBX: ffff8ffb7bf4a000 RCX: ffff8ffb78991048
[ 1194.527325] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ffb56b8a200
[ 1194.542576] RBP: ffff8ffb7fcc3c90 R08: 000000000000000b R09: 0000000000000000
[ 1194.558001] R10: ffff8ffb56b8a298 R11: 0000000000000000 R12: ffff8ffb56b8a200
[ 1194.573070] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1194.588117] FS: 0000000000000000(0000) GS:ffff8ffb7fcc0000(0000) knlGS:0000000000000000
[ 1194.604264] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1194.617375] CR2: 00000000000002b8 CR3: 00000021e040a002 CR4: 00000000007606e0
[ 1194.632327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1194.647865] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1194.663316] PKRU: 55555554
[ 1194.674090] Call Trace:
[ 1194.683735] <IRQ>
[ 1194.692948] bio_endio+0xae/0x135
[ 1194.703580] blk_update_request+0xad/0x2fa
[ 1194.714990] blk_update_bidi_request+0x20/0x72
[ 1194.726578] __blk_end_bidi_request+0x2c/0x4d
[ 1194.738373] __blk_end_request_all+0x31/0x49
[ 1194.749344] blk_flush_complete_seq+0x377/0x383
[ 1194.761550] flush_end_io+0x1dd/0x2a7
[ 1194.772910] blk_finish_request+0x9f/0x13c
[ 1194.784544] scsi_end_request+0x180/0x25c
[ 1194.796149] scsi_io_completion+0xc8/0x610
[ 1194.807503] scsi_finish_command+0xdc/0x125
[ 1194.818897] scsi_softirq_done+0x81/0xde
[ 1194.830062] blk_done_softirq+0xa4/0xcc
[ 1194.841008] __do_softirq+0xd9/0x29f
[ 1194.851257] irq_exit+0xe6/0xeb
[ 1194.861290] do_IRQ+0x59/0xe3
[ 1194.871060] common_interrupt+0x1c6/0x382
[ 1194.881988] </IRQ>
[ 1194.890646] RIP: 0010:cpuidle_enter_state+0xdd/0x2a5
[ 1194.902532] RSP: 0018:ffffb99d40793e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff43
[ 1194.917317] RAX: ffff8ffb7fce27c0 RBX: ffff8ffb7fced800 RCX: 000000000000001f
[ 1194.932056] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000
[ 1194.946428] RBP: ffffb99d40793ea0 R08: 0000000000000004 R09: 0000000000002ed2
[ 1194.960508] R10: 0000000000002664 R11: 0000000000000018 R12: 0000000000000003
[ 1194.974454] R13: 000000000000000b R14: ffffffff925715a0 R15: 0000011610120d5a
[ 1194.988607] ? cpuidle_enter_state+0xcc/0x2a5
[ 1194.999077] cpuidle_enter+0x17/0x19
[ 1195.008395] call_cpuidle+0x23/0x3a
[ 1195.017718] do_idle+0x172/0x1d5
[ 1195.026358] cpu_startup_entry+0x73/0x75
[ 1195.035769] start_secondary+0x1b9/0x20b
[ 1195.044894] secondary_startup_64+0xa5/0xa5
[ 1195.084921] RIP: super_written+0x29/0xe1 RSP: ffff8ffb7fcc3c78
[ 1195.096354] CR2: 00000000000002b8
bio in the above stack is a bitmap write whose completion is invoked after
the tear down sequence sets the mddev structure to NULL in rdev.
During tear down, there is an attempt to flush the bitmap writes, but for
external bitmaps, there is no explicit wait for all the bitmap writes to
complete. For instance, md_bitmap_flush() is called to flush the bitmap
writes, but the last call to md_bitmap_daemon_work() in md_bitmap_flush()
could generate new bitmap writes for which there is no explicit wait to
complete those writes. The call to md_bitmap_update_sb() will return
simply for external bitmaps and the follow-up call to md_update_sb() is
conditional and may not get called for external bitmaps. This results in a
kernel panic when the completion routine, super_written() is called which
tries to reference mddev in the rdev that has been set to
NULL(in unbind_rdev_from_array() by tear down sequence).
The solution is to call md_super_wait() for external bitmaps after the
last call to md_bitmap_daemon_work() in md_bitmap_flush() to ensure there
are no pending bitmap writes before proceeding with the tear down.
Cc: stable@vger.kernel.org
Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Reviewed-by: Zhao Heming <heming.zhao@suse.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>