* changes:
Merge 5.15.108 into android14-5.15
ANDROID: db845c: Update list of module outs
ANDROID: preserve CRC for xhci symbols
Merge 5.15.107 into android14-5.15
ufshcd_queuecommand() may be called two times in a row for a SCSI
command before it is completed. Hence make the following changes:
- In the functions that submit a command, do not check the old value of
lrbp->cmd nor clear lrbp->cmd in error paths.
- In ufshcd_release_scsi_cmd(), do not clear lrbp->cmd.
See also scsi_send_eh_cmnd().
This patch prevents that the following appears if a command times out:
WARNING: at drivers/ufs/core/ufshcd.c:2965 ufshcd_queuecommand+0x6f8/0x9a8
Call trace:
ufshcd_queuecommand+0x6f8/0x9a8
scsi_send_eh_cmnd+0x2c0/0x960
scsi_eh_test_devices+0x100/0x314
scsi_eh_ready_devs+0xd90/0x114c
scsi_error_handler+0x2b4/0xb70
kthread+0x16c/0x1e0
Fixes: 5a0b0cb9be ("[SCSI] ufs: Add support for sending NOP OUT UPIU")
Bug: 276080919
Link: https://lore.kernel.org/linux-scsi/20230417230656.523826-5-bvanassche@acm.org/T/#m0d16bcb0507d082036732005b2d85d0cfc068325
Change-Id: I6dfe0a51cfc354fae56ecebeedd8f7e181580cec
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Don't use the uclamp of current task as the default uclamp for
binders, because the uclamp of current task influence
binders' placement when not in a transaction.
Just use default value 0 and SCHED_CAPACITY_SCALE for binders'
default uclamp min and max. Also replace set_inherited_uclamp with
set_binder_prio_uclamp
Bug: 277389699
Change-Id: I07c4f40c2689dbc7eb23e7d3e2a2f435353dc25f
Signed-off-by: Chungkai Mei <chungkai@google.com>
Per-vcpu flags are updated using a non-atomic RMW operation.
Which means it is possible to get preempted between the read and
write operations.
Another interesting thing to note is that preemption also updates
flags, as we have some flag manipulation in both the load and put
operations.
It is thus possible to lose information communicated by either
load or put, as the preempted flag update will overwrite the flags
when the thread is resumed. This is specially critical if either
load or put has stored information which depends on the physical
CPU the vcpu runs on.
This results in really elusive bugs, and kudos must be given to
Mostafa for the long hours of debugging, and finally spotting
the problem.
Fix it by disabling preemption during the RMW operation, which
ensures that the state stays consistent. Also upgrade vcpu_get_flag
path to use READ_ONCE() to make sure the field is always atomically
accessed.
Fixes: e87abb73e5 ("KVM: arm64: Add helpers to manipulate vcpu flags among a set")
Reported-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230418125737.2327972-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
(cherry picked from commit 35dcb3ac66)
[willdeacon@: also update __vcpu_copy_flag()]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I63058ff1494e4092dab9d29cb66c295dd8fe9d86
The existing pKVM code attempts to advertise CSV2/3 using values
initialized to 0, but never set. To advertise CSV2/3 to protected
guests, pass the CSV2/3 values to hyp when initializing hyp's
view of guests' ID_AA64PFR0_EL1.
Similar to non-protected KVM, these are system-wide, rather than
per cpu, for simplicity.
Fixes: 6c30bfb18d ("KVM: arm64: Add handlers for protected VM System Registers")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20230404152321.413064-1-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
(cherry picked from commit e81625218b)
[willdeacon@: fixed_config.h has been moved into kvm_pkvm.h]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I27821a28bcde0dbce3d45bac6cf4de20dcf299f9
Changes in 5.15.108
Revert "pinctrl: amd: Disable and mask interrupts on resume"
ALSA: emu10k1: fix capture interrupt handler unlinking
ALSA: hda/sigmatel: add pin overrides for Intel DP45SG motherboard
ALSA: i2c/cs8427: fix iec958 mixer control deactivation
ALSA: firewire-tascam: add missing unwind goto in snd_tscm_stream_start_duplex()
ALSA: emu10k1: don't create old pass-through playback device on Audigy
ALSA: hda/sigmatel: fix S/PDIF out on Intel D*45* motherboards
Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp}
Bluetooth: Fix race condition in hidp_session_thread
btrfs: print checksum type and implementation at mount time
btrfs: fix fast csum implementation detection
fbmem: Reject FB_ACTIVATE_KD_TEXT from userspace
mtdblock: tolerate corrected bit-flips
mtd: rawnand: meson: fix bitmask for length in command word
mtd: rawnand: stm32_fmc2: remove unsupported EDO mode
mtd: rawnand: stm32_fmc2: use timings.mode instead of checking tRC_min
KVM: arm64: PMU: Restore the guest's EL0 event counting after migration
drm/i915/dsi: fix DSS CTL register offsets for TGL+
clk: sprd: set max_register according to mapping range
RDMA/irdma: Fix memory leak of PBLE objects
RDMA/irdma: Increase iWARP CM default rexmit count
RDMA/irdma: Add ipv4 check to irdma_find_listener()
IB/mlx5: Add support for 400G_8X lane speed
RDMA/cma: Allow UD qp_type to join multicast only
bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp
9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition
niu: Fix missing unwind goto in niu_alloc_channels()
tcp: restrict net.ipv4.tcp_app_win
drm/armada: Fix a potential double free in an error handling path
qlcnic: check pci_reset_function result
net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume()
sctp: fix a potential overflow in sctp_ifwdtsn_skip
RDMA/core: Fix GID entry ref leak when create_ah fails
udp6: fix potential access to stale information
net: macb: fix a memory corruption in extended buffer descriptor mode
skbuff: Fix a race between coalescing and releasing SKBs
libbpf: Fix single-line struct definition output in btf_dump
ARM: 9290/1: uaccess: Fix KASAN false-positives
power: supply: cros_usbpd: reclassify "default case!" as debug
wifi: mwifiex: mark OF related data as maybe unused
i2c: imx-lpi2c: clean rx/tx buffers upon new message
i2c: hisi: Avoid redundant interrupts
efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L
drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F
verify_pefile: relax wrapper length check
asymmetric_keys: log on fatal failures in PE/pkcs7
wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
ACPI: resource: Add Medion S17413 to IRQ override quirk
counter: stm32-lptimer-cnt: Provide defines for clock polarities
counter: stm32-timer-cnt: Provide defines for slave mode selection
counter: Internalize sysfs interface code
counter: 104-quad-8: Fix Synapse action reported for Index signals
tracing: Add trace_array_puts() to write into instance
tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance
i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call
drm/i915: fix race condition UAF in i915_perf_add_config_ioctl
riscv: add icache flush for nommu sigreturn trampoline
net: sfp: initialize sfp->i2c_block_size at sfp allocation
net: phy: nxp-c45-tja11xx: add remove callback
net: phy: nxp-c45-tja11xx: fix unsigned long multiplication overflow
scsi: ses: Handle enclosure with just a primary component gracefully
x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot
cgroup/cpuset: Wake up cpuset_attach_wq tasks in cpuset_cancel_attach()
mptcp: use mptcp_schedule_work instead of open-coding it
mptcp: stricter state check in mptcp_worker
ubi: Fix failure attaching when vid_hdr offset equals to (sub)page size
ubi: Fix deadlock caused by recursively holding work_sem
powerpc/papr_scm: Update the NUMA distance table for the target node
sched/fair: Move calculate of avg_load to a better location
sched/fair: Fix imbalance overflow
x86/rtc: Remove __init for runtime functions
i2c: ocores: generate stop condition after timeout in polling mode
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
nvme-pci: Crucial P2 has bogus namespace ids
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM610
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760
nvme-pci: mark Lexar NM760 as IGNORE_DEV_SUBNQN
nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD
kexec: turn all kexec_mutex acquisitions into trylocks
panic, kexec: make __crash_kexec() NMI safe
counter: fix docum. build problems after filename change
counter: Add the necessary colons and indents to the comments of counter_compi
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs
Linux 5.15.108
Change-Id: Icdb539c68f2ea04d37818cb4fe66b08384b77609
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
In commit 3ef52e4bcf ("net: qrtr: combine nameservice into main
module"), the ns.ko kernel module got merged into the qrtr module, so it
needs to be removed from the db845c list of modules as it is no longer
part of the build at all.
Bug: 279448025
Fixes: 3ef52e4bcf ("net: qrtr: combine nameservice into main module")
Change-Id: I5d35ddaa0a8bf68ec16800264cb1a0bd7581b4e5
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Add hook to boost thread when process killed.
Bug: 237749933
Signed-off-by: xieliujie <xieliujie@oppo.com>
Change-Id: I7cc6f248397021f3a8271433144a0e582ed27cfa
(cherry picked from commit 709679142d583b0b7338d931fdd43b27b1bbf9e0)
and sched_waking to let module probe them
Get task info about sleep and waking
Bug: 190422437
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I828c93f531f84e6133c2c3a7f8faada51683afcf
(cherry picked from commit 13af062abf)
(cherry picked from commit 869954e72dac700580d0ea5734d07b574e41afe9)
Get task info about scheduling delay, iowait, and block time.
It is used to get thread scheduling info when thread happened abnormal situation.
Bug: 189415303
Change-Id: Ib6b548f8a78de5b26d555e9a89e3cc79ea2d1024
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
(cherry picked from commit a6bb1af39d)
(cherry picked from commit 6d8d2ab52facfd6d5de2715e2470872e6a70cf22)
Export get_wchan to get the block reason.
It is used to get the block reason(why the thread blocked in Uninterrupted Sleep) when happened long D state. We use this information check if it's reasonable.
Bug: 205684022
Signed-off-by: xieliujie <xieliujie@oppo.com>
Change-Id: I7b65bb502b805e7dac13e5f9d725da1ff70fe306
(cherry picked from commit 0db6925868)
(cherry picked from commit de72c813d12537ea6ced87b39ffcad446815609a)
Export symbol of the function wq_worker_comm() in kernel/workqueue.c for dlkm to get the description of the kworker process. It is used to get the description when kworker thread happened abnormal situation.
Bug: 208394207
Signed-off-by: zhengding chen <chenzhengding@oppo.com>
Change-Id: I2e7ddd52a15e22e99e6596f16be08243af1bb473
(cherry picked from commit 28de741861)
(cherry picked from commit 87e0e98c25ba8e121975708943335e3abad651d9)
Add a vendor hook for meminfo fixup.
Bug: 277746082
Change-Id: Ifa7850f75ccdf862d900c9a6c00f165b07e84595
Signed-off-by: Martin Liu <liumartin@google.com>
The upstream driver has added start/stop link inline functions that
result in quite a few changes to the designware code. The upstream patch
for this issue, builds on top of these changes. Since we are based on
5.15, it is cleaner to apply a simpler fix on top of the existing code.
In dw_pcie_host_init() regardless of whether the link has been started
or not, the code waits for the link to come up. Even in cases where
start_link() is not defined the code ends up spinning in a loop for 1
second. Since in some systems dw_pcie_host_init() gets called during
probe, this one second loop for each pcie interface instance ends up
extending the boot time.
Call trace when start_link() is not defined:
dw_pcie_wait_for_link << spins in a loop for 1 second
dw_pcie_host_init
Bug: 270085637
Link: https://lore.kernel.org/all/20230412093425.3659088-1-ajayagarwal@google.com/
Change-Id: Ibc42801fa06674e43e921b4976ec83c9fb5483cf
Signed-off-by: Sajid Dalvi <sdalvi@google.com>
With coalescing we don't refcount default PTE entries. Fix an issue
which clears out non-refcounted PTE entries on the unmap path.
Bug: 279165129
Change-Id: Ie4fdabcc420d54c1338272d38abbe393fc5ce75c
Signed-off-by: Sebastian Ene <sebastianene@google.com>
In the process of switching USB config from rndis to other config,
if the hardware does not support the ->pullup callback, or the
hardware encounters a low probability fault, both of them may cause
the ->pullup callback to fail, which will then cause a system panic
(use after free).
The gadget drivers sometimes need to be unloaded regardless of the
hardware's behavior.
Analysis as follows:
=======================================================================
(1) write /config/usb_gadget/g1/UDC "none"
gether_disconnect+0x2c/0x1f8
rndis_disable+0x4c/0x74
composite_disconnect+0x74/0xb0
configfs_composite_disconnect+0x60/0x7c
usb_gadget_disconnect+0x70/0x124
usb_gadget_unregister_driver+0xc8/0x1d8
gadget_dev_desc_UDC_store+0xec/0x1e4
(2) rm /config/usb_gadget/g1/configs/b.1/f1
rndis_deregister+0x28/0x54
rndis_free+0x44/0x7c
usb_put_function+0x14/0x1c
config_usb_cfg_unlink+0xc4/0xe0
configfs_unlink+0x124/0x1c8
vfs_unlink+0x114/0x1dc
(3) rmdir /config/usb_gadget/g1/functions/rndis.gs4
panic+0x1fc/0x3d0
do_page_fault+0xa8/0x46c
do_mem_abort+0x3c/0xac
el1_sync_handler+0x40/0x78
0xffffff801138f880
rndis_close+0x28/0x34
eth_stop+0x74/0x110
dev_close_many+0x48/0x194
rollback_registered_many+0x118/0x814
unregister_netdev+0x20/0x30
gether_cleanup+0x1c/0x38
rndis_attr_release+0xc/0x14
kref_put+0x74/0xb8
configfs_rmdir+0x314/0x374
If gadget->ops->pullup() return an error, function rndis_close() will be
called, then it will causes a use-after-free problem.
=======================================================================
Fixes: 0a55187a1e ("USB: gadget core: Issue ->disconnect() callback from usb_gadget_disconnect()")
Signed-off-by: Jiantao Zhang <water.zhangjiantao@huawei.com>
Signed-off-by: TaoXue <xuetao09@huawei.com>
Link: https://lore.kernel.org/r/20221121130805.10735-1-water.zhangjiantao@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 273510696
Bug: 275027942
Change-Id: I702f324c5852d3b2448081b092fef464f8691989
(cherry picked from commit afdc12887f)
[ray: Resolved minor conflict in drivers/usb/gadget/udc/core.c]
Signed-off-by: Ray Chi <raychi@google.com>
(cherry picked from commit 2ce4ee5f2e02702ce61b07a170eeb9ffede0601a)
commit 05c6257433 upstream.
Attempting to get a crash dump out of a debug PREEMPT_RT kernel via an NMI
panic() doesn't work. The cause of that lies in the PREEMPT_RT definition
of mutex_trylock():
if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
return 0;
This prevents an nmi_panic() from executing the main body of
__crash_kexec() which does the actual kexec into the kdump kernel. The
warning and return are explained by:
6ce47fd961 ("rtmutex: Warn if trylock is called from hard/softirq context")
[...]
The reasons for this are:
1) There is a potential deadlock in the slowpath
2) Another cpu which blocks on the rtmutex will boost the task
which allegedly locked the rtmutex, but that cannot work
because the hard/softirq context borrows the task context.
Furthermore, grabbing the lock isn't NMI safe, so do away with kexec_mutex
and replace it with an atomic variable. This is somewhat overzealous as
*some* callsites could keep using a mutex (e.g. the sysfs-facing ones
like crash_shrink_memory()), but this has the benefit of involving a
single unified lock and preventing any future NMI-related surprises.
Tested by triggering NMI panics via:
$ echo 1 > /proc/sys/kernel/panic_on_unrecovered_nmi
$ echo 1 > /proc/sys/kernel/unknown_nmi_panic
$ echo 1 > /proc/sys/kernel/panic
$ ipmitool power diag
Link: https://lkml.kernel.org/r/20220630223258.4144112-3-vschneid@redhat.com
Fixes: 6ce47fd961 ("rtmutex: Warn if trylock is called from hard/softirq context")
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Juri Lelli <jlelli@redhat.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 200dccd07d ]
Lexar NM610 reports bogus eui64 values that appear to be the same across
all drives. Quirk them out so they are not marked as "non globally unique"
duplicates.
Signed-off-by: Shyamin Ayesh <me@shyamin.com>
[patch formatting]
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Stable-dep-of: 74391b3e69 ("nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b961bce50 ]
When ZHITAI TiPro7000 SSDs entered deepest power state(ps4)
it has the same APST sleep problem as Kingston A2000.
by chance the system crashes and displays the same dmesg info:
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65
As the Archlinux wiki suggest (enlat + exlat) < 25000 is fine
and my testing shows no system crashes ever since.
Therefore disabling the deepest power state will fix the APST sleep issue.
https://wiki.archlinux.org/title/Solid_state_drive/NVMe
This is the APST data from 'nvme id-ctrl /dev/nvme1'
NVME Identify Controller:
vid : 0x1e49
ssvid : 0x1e49
sn : [...]
mn : ZHITAI TiPro7000 1TB
fr : ZTA32F3Y
[...]
ps 0 : mp:3.50W operational enlat:5 exlat:5 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:3.30W operational enlat:50 exlat:100 rrt:1 rrl:1
rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:2.80W operational enlat:50 exlat:200 rrt:2 rrl:2
rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.1500W non-operational enlat:500 exlat:5000 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0200W non-operational enlat:2000 exlat:60000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
Signed-off-by: Ning Wang <ningwang35@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Stable-dep-of: 74391b3e69 ("nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3765fad508 ]
ADATA XPG GAMMIX S50 drives report bogus eui64 values that appear to
be the same across drives in one system. Quirk them out so they are
not marked as "non globally unique" duplicates.
Signed-off-by: Stefan Reiter <stefan@pimaker.at>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Stable-dep-of: 74391b3e69 ("nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8160d3b35 ]
In polling mode, no stop condition is generated after a timeout. This
causes SCL to remain low and thereby block the bus. If this happens
during a transfer it can cause slaves to misinterpret the subsequent
transfer and return wrong values.
To solve this, pass the ETIMEDOUT error up from ocores_process_polling()
instead of setting STATE_ERROR directly. The caller is adjusted to call
ocores_process_timeout() on error both in polling and in IRQ mode, which
will set STATE_ERROR and generate a stop condition.
Fixes: 69c8c0c0ef ("i2c: ocores: add polling interface")
Signed-off-by: Gregor Herburger <gregor.herburger@tq-group.com>
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Acked-by: Peter Korsgaard <peter@korsgaard.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Federico Vaga <federico.vaga@cern.ch>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 775d3c514c ]
set_rtc_noop(), get_rtc_noop() are after booting, therefore their __init
annotation is wrong.
A crash was observed on an x86 platform where CMOS RTC is unused and
disabled via device tree. set_rtc_noop() was invoked from ntp:
sync_hw_clock(), although CONFIG_RTC_SYSTOHC=n, however sync_cmos_clock()
doesn't honour that.
Workqueue: events_power_efficient sync_hw_clock
RIP: 0010:set_rtc_noop
Call Trace:
update_persistent_clock64
sync_hw_clock
Fix this by dropping the __init annotation from set/get_rtc_noop().
Fixes: c311ed6183 ("x86/init: Allow DT configured systems to disable RTC at boot time")
Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/59f7ceb1-446b-1d3d-0bc8-1f0ee94b1e18@nokia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b277fc793d ]
Platform device helper routines won't update the NUMA distance table
while creating a platform device, even if the device is present on a
NUMA node that doesn't have memory or CPU. This is especially true for
pmem devices. If the target node of the pmem device is not online, we
find the nearest online node to the device and associate the pmem device
with that online node. To find the nearest online node, we should have
the numa distance table updated correctly. Update the distance
information during the device probe.
For a papr scm device on NUMA node 3 distance_lookup_table value for
distance_ref_points_depth = 2 before and after fix is below:
Before fix:
node 3 distance depth 0 - 0
node 3 distance depth 1 - 0
node 4 distance depth 0 - 4
node 4 distance depth 1 - 2
node 5 distance depth 0 - 5
node 5 distance depth 1 - 1
After fix
node 3 distance depth 0 - 3
node 3 distance depth 1 - 1
node 4 distance depth 0 - 4
node 4 distance depth 1 - 2
node 5 distance depth 0 - 5
node 5 distance depth 1 - 1
Without the fix, the nearest numa node to the pmem device (NUMA node 3)
will be picked as 4. After the fix, we get the correct numa node which
is 5.
Fixes: da1115fdbd ("powerpc/nvdimm: Pick nearby online node if the device node is not online")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230404041433.1781804-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f773f0a331 ]
During the processing of the bgt, if the sync_erase() return -EBUSY
or some other error code in __erase_worker(),schedule_erase() called
again lead to the down_read(ubi->work_sem) hold twice and may get
block by down_write(ubi->work_sem) in ubi_update_fastmap(),
which cause deadlock.
ubi bgt other task
do_work
down_read(&ubi->work_sem) ubi_update_fastmap
erase_worker # Blocked by down_read
__erase_worker down_write(&ubi->work_sem)
schedule_erase
schedule_ubi_work
down_read(&ubi->work_sem)
Fix this by changing input parameter @nested of the schedule_erase() to
'true' to avoid recursively acquiring the down_read(&ubi->work_sem).
Also, fix the incorrect comment about @nested parameter of the
schedule_erase() because when down_write(ubi->work_sem) is held, the
@nested is also need be true.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217093
Fixes: 2e8f08deab ("ubi: Fix races around ubi_refill_pools()")
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 1e020e1b96 upstream.
Following process will make ubi attaching failed since commit
1b42b1a36f ("ubi: ensure that VID header offset ... size"):
ID="0xec,0xa1,0x00,0x15" # 128M 128KB 2KB
modprobe nandsim id_bytes=$ID
flash_eraseall /dev/mtd0
modprobe ubi mtd="0,2048" # set vid_hdr offset as 2048 (one page)
(dmesg):
ubi0 error: ubi_attach_mtd_dev [ubi]: VID header offset 2048 too large.
UBI error: cannot attach mtd0
UBI error: cannot initialize UBI, error -22
Rework original solution, the key point is making sure
'vid_hdr_shift + UBI_VID_HDR_SIZE < ubi->vid_hdr_alsize',
so we should check vid_hdr_shift rather not vid_hdr_offset.
Then, ubi still support (sub)page aligined VID header offset.
Fixes: 1b42b1a36f ("ubi: ensure that VID header offset ... size")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Tested-by: Nicolas Schichan <nschichan@freebox.fr>
Tested-by: Miquel Raynal <miquel.raynal@bootlin.com> # v5.10, v4.19
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5cb752b12 upstream.
Beyond reducing code duplication this also avoids scheduling
the mptcp_worker on a closed socket on some edge scenarios.
The addressed issue is actually older than the blamed commit
below, but this fix needs it as a pre-requisite.
Fixes: ba8f48f7a4 ("mptcp: introduce mptcp_schedule_work")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba9182a896 upstream.
After a successful cpuset_can_attach() call which increments the
attach_in_progress flag, either cpuset_cancel_attach() or cpuset_attach()
will be called later. In cpuset_attach(), tasks in cpuset_attach_wq,
if present, will be woken up at the end. That is not the case in
cpuset_cancel_attach(). So missed wakeup is possible if the attach
operation is somehow cancelled. Fix that by doing the wakeup in
cpuset_cancel_attach() as well.
Fixes: e44193d39e ("cpuset: let hotplug propagation work wait for task attaching")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8e22b7a16 upstream.
This reverts commit 3fe97ff3d9 ("scsi: ses: Don't attach if enclosure
has no components") and introduces proper handling of case where there are
no detected secondary components, but primary component (enumerated in
num_enclosures) does exist. That fix was originally proposed by Ding Hui
<dinghui@sangfor.com.cn>.
Completely ignoring devices that have one primary enclosure and no
secondary one results in ses_intf_add() bailing completely
scsi 2:0:0:254: enclosure has no enumerated components
scsi 2:0:0:254: Failed to bind enclosure -12ven in valid configurations such
even on valid configurations with 1 primary and 0 secondary enclosures as
below:
# sg_ses /dev/sg0
3PARdata SES 3321
Supported diagnostic pages:
Supported Diagnostic Pages [sdp] [0x0]
Configuration (SES) [cf] [0x1]
Short Enclosure Status (SES) [ses] [0x8]
# sg_ses -p cf /dev/sg0
3PARdata SES 3321
Configuration diagnostic page:
number of secondary subenclosures: 0
generation code: 0x0
enclosure descriptor list
Subenclosure identifier: 0 [primary]
relative ES process id: 0, number of ES processes: 1
number of type descriptor headers: 1
enclosure logical identifier (hex): 20000002ac02068d
enclosure vendor: 3PARdata product: VV rev: 3321
type descriptor header and text list
Element type: Unspecified, subenclosure id: 0
number of possible elements: 1
The changelog for the original fix follows
=====
We can get a crash when disconnecting the iSCSI session,
the call trace like this:
[ffff00002a00fb70] kfree at ffff00000830e224
[ffff00002a00fba0] ses_intf_remove at ffff000001f200e4
[ffff00002a00fbd0] device_del at ffff0000086b6a98
[ffff00002a00fc50] device_unregister at ffff0000086b6d58
[ffff00002a00fc70] __scsi_remove_device at ffff00000870608c
[ffff00002a00fca0] scsi_remove_device at ffff000008706134
[ffff00002a00fcc0] __scsi_remove_target at ffff0000087062e4
[ffff00002a00fd10] scsi_remove_target at ffff0000087064c0
[ffff00002a00fd70] __iscsi_unbind_session at ffff000001c872c4
[ffff00002a00fdb0] process_one_work at ffff00000810f35c
[ffff00002a00fe00] worker_thread at ffff00000810f648
[ffff00002a00fe70] kthread at ffff000008116e98
In ses_intf_add, components count could be 0, and kcalloc 0 size scomp,
but not saved in edev->component[i].scratch
In this situation, edev->component[0].scratch is an invalid pointer,
when kfree it in ses_intf_remove_enclosure, a crash like above would happen
The call trace also could be other random cases when kfree cannot catch
the invalid pointer
We should not use edev->component[] array when the components count is 0
We also need check index when use edev->component[] array in
ses_enclosure_data_process
=====
Reported-by: Michal Kolar <mich.k@seznam.cz>
Originally-by: Ding Hui <dinghui@sangfor.com.cn>
Cc: stable@vger.kernel.org
Fixes: 3fe97ff3d9 ("scsi: ses: Don't attach if enclosure has no components")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2304042122270.29760@cbobk.fhfr.pm
Tested-by: Michal Kolar <mich.k@seznam.cz>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>