Changes in 6.1.12
hv_netvsc: Allocate memory in netvsc_dma_map() with GFP_ATOMIC
btrfs: limit device extents to the device size
btrfs: zlib: zero-initialize zlib workspace
ALSA: hda/realtek: Add Positivo N14KP6-TG
ALSA: emux: Avoid potential array out-of-bound in snd_emux_xg_control()
ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro 360
ALSA: hda/realtek: Enable mute/micmute LEDs on HP Elitebook, 645 G9
ALSA: hda/realtek: Add quirk for ASUS UM3402 using CS35L41
ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.
Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"
Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming"
tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw
of/address: Return an error when no valid dma-ranges are found
can: j1939: do not wait 250 ms if the same addr was already claimed
HID: logitech: Disable hi-res scrolling on USB
xfrm: compat: change expression for switch in xfrm_xlate64
IB/hfi1: Restore allocated resources on failed copyout
xfrm/compat: prevent potential spectre v1 gadget in xfrm_xlate32_attr()
IB/IPoIB: Fix legacy IPoIB due to wrong number of queues
xfrm: annotate data-race around use_time
RDMA/irdma: Fix potential NULL-ptr-dereference
RDMA/usnic: use iommu_map_atomic() under spin_lock()
xfrm: fix bug with DSCP copy to v6 from v4 tunnel
of: Make OF framebuffer device names unique
net: phylink: move phy_device_free() to correctly release phy device
bonding: fix error checking in bond_debug_reregister()
net: macb: Perform zynqmp dynamic configuration only for SGMII interface
net: phy: meson-gxl: use MMD access dummy stubs for GXL, internal PHY
ionic: clean interrupt before enabling queue to avoid credit race
ionic: refactor use of ionic_rx_fill()
ionic: missed doorbell workaround
cpufreq: qcom-hw: Fix cpufreq_driver->get() for non-LMH systems
uapi: add missing ip/ipv6 header dependencies for linux/stddef.h
net: microchip: sparx5: fix PTP init/deinit not checking all ports
HID: amd_sfh: if no sensors are enabled, clean up
drm/i915: Don't do the WM0->WM1 copy w/a if WM1 is already enabled
drm/virtio: exbuf->fence_fd unmodified on interrupted wait
cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task
nvidiafb: detect the hardware support before removing console.
ice: Do not use WQ_MEM_RECLAIM flag for workqueue
ice: Fix disabling Rx VLAN filtering with port VLAN enabled
ice: switch: fix potential memleak in ice_add_adv_recipe()
net: dsa: mt7530: don't change PVC_EG_TAG when CPU port becomes VLAN-aware
net: mscc: ocelot: fix VCAP filters not matching on MAC with "protocol 802.1Q"
net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change
net/mlx5: Bridge, fix ageing of peer FDB entries
net/mlx5e: Fix crash unsetting rx-vlan-filter in switchdev mode
net/mlx5e: IPoIB, Show unknown speed instead of error
net/mlx5: Store page counters in a single array
net/mlx5: Expose SF firmware pages counter
net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers
net/mlx5: fw_tracer, Zero consumer index when reloading the tracer
net/mlx5: Serialize module cleanup with reload and remove
igc: Add ndo_tx_timeout support
net: ethernet: mtk_eth_soc: fix wrong parameters order in __xdp_rxq_info_reg()
txhash: fix sk->sk_txrehash default
selftests: Fix failing VXLAN VNI filtering test
rds: rds_rm_zerocopy_callback() use list_first_entry()
net: mscc: ocelot: fix all IPv6 getting trapped to CPU when PTP timestamping is used
selftests: forwarding: lib: quote the sysctl values
arm64: dts: rockchip: fix input enable pinconf on rk3399
arm64: dts: rockchip: set sdmmc0 speed to sd-uhs-sdr50 on rock-3a
ALSA: pci: lx6464es: fix a debug loop
riscv: stacktrace: Fix missing the first frame
arm64: dts: mediatek: mt8195: Fix vdosys* compatible strings
ASoC: tas5805m: rework to avoid scheduling while atomic.
ASoC: tas5805m: add missing page switch.
ASoC: fsl_sai: fix getting version from VERID
ASoC: topology: Return -ENOMEM on memory allocation failure
clk: microchip: mpfs-ccc: Use devm_kasprintf() for allocating formatted strings
pinctrl: mediatek: Fix the drive register definition of some Pins
pinctrl: aspeed: Fix confusing types in return value
pinctrl: single: fix potential NULL dereference
spi: dw: Fix wrong FIFO level setting for long xfers
pinctrl: aspeed: Revert "Force to disable the function's signal"
pinctrl: intel: Restore the pins that used to be in Direct IRQ mode
cifs: Fix use-after-free in rdata->read_into_pages()
net: USB: Fix wrong-direction WARNING in plusb.c
mptcp: do not wait for bare sockets' timeout
mptcp: be careful on subflow status propagation on errors
selftests: mptcp: allow more slack for slow test-case
selftests: mptcp: stop tests earlier
btrfs: simplify update of last_dir_index_offset when logging a directory
btrfs: free device in btrfs_close_devices for a single device filesystem
usb: core: add quirk for Alcor Link AK9563 smartcard reader
usb: typec: altmodes/displayport: Fix probe pin assign check
cxl/region: Fix null pointer dereference for resetting decoder
cxl/region: Fix passthrough-decoder detection
clk: ingenic: jz4760: Update M/N/OD calculation algorithm
pinctrl: qcom: sm8450-lpass-lpi: correct swr_rx_data group
drm/amd/pm: add SMU 13.0.7 missing GetPptLimit message mapping
ceph: flush cap releases when the session is flushed
nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE
riscv: Fixup race condition on PG_dcache_clean in flush_icache_pte
riscv: kprobe: Fixup misaligned load text
powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switch
drm/amdgpu: Use the TGID for trace_amdgpu_vm_update_ptes
tracing: Fix TASK_COMM_LEN in trace event format file
rtmutex: Ensure that the top waiter is always woken up
arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitive
arm64: dts: meson-g12-common: Make mmc host controller interrupts level-sensitive
arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitive
Fix page corruption caused by racy check in __free_pages
arm64: efi: Force the use of SetVirtualAddressMap() on eMAG and Altra Max machines
drm/amd/pm: bump SMU 13.0.0 driver_if header version
drm/amdgpu: Add unique_id support for GC 11.0.1/2
drm/amd/pm: bump SMU 13.0.7 driver_if header version
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
drm/amdgpu/smu: skip pptable init under sriov
drm/amd/display: properly handling AGP aperture in vm setup
drm/amd/display: fix cursor offset on rotation 180
drm/i915: Move fd_install after last use of fence
drm/i915: Initialize the obj flags for shmem objects
drm/i915: Fix VBT DSI DVO port handling
x86/speculation: Identify processors vulnerable to SMT RSB predictions
KVM: x86: Mitigate the cross-thread return address predictions bug
Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions
Linux 6.1.12
Change-Id: I4deaf57516f3e7b40e728d473986fa355a11fc37
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Add android/abi_gki_aarch64.stg as initial ABI representation of the
KMI and start enforcing KMI. Kernel is not trimmed yet, the trimming
will be enabled after adding symbols lists.
This is not KMI freeze. While this is hard enforcement in the code
base, we still allow controlled changes to the ABI.
Test: TH
Bug: 269323432
Change-Id: Ic3a126a6d60242ae713ae396bb239af5b3e58c53
Signed-off-by: Aleksei Vetrov <vvvvvv@google.com>
This reverts commit 8ecd88d9d3 as it is
broken with regards to upstream changes made in 6.1.12.
If this is still needed, it can be brought back in a way that works
properly based on the changes made upstream.
Bug: 174125747
Cc: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Cc: Sai Harshini Nimmala <quic_snimmala@quicinc.com>
Change-Id: Ic3163351faabbecbce688a87215f79ca3b5d6188
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Added following symbol to QCOM symbol list
snd_timer_interrupt
Bug: 269049178
Change-Id: I7f41f21208c96e7e3142336439b204c2016f0f4f
Signed-off-by: Raghu Bankapur <quic_rbankapu@quicinc.com>
Header generation script is using the protected exports list
as a source to generate the header file during the kernel build.
Script preporcess the symbols in-place before using it to generate
an array of symbols for protected exports causing the source file
to change. This may force the kleaf to build kernel again even
though there are no real changes in terms of symbols.
Use a copy as a temp file for processing leaving the source file
un affected.
Unprotected symbol list is already a temp file; so it doesn't
affect that target.
Bug: 268678245
Test: TH
Change-Id: Ifb551639451d1c7bd935ff732bd1959647c014d7
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
(cherry picked from commit 19ce16cc96)
... so that they can be loaded by Kleaf extensions
and read during the loading phase.
Moving forward, we should remove build configs in
the future and express constants in .bzl files. However,
for now, until kernel_build has been migrated to
use the defined cc_toolchain, we must keep this file.
Test: Treehugger
Bug: 228238975
Change-Id: Id9628663785970c460470382e1ae162e1112203d
Signed-off-by: Yifan Hong <elsk@google.com>
commit 6f0f2d5ef8 upstream.
By default, KVM/SVM will intercept attempts by the guest to transition
out of C0. However, the KVM_CAP_X86_DISABLE_EXITS capability can be used
by a VMM to change this behavior. To mitigate the cross-thread return
address predictions bug (X86_BUG_SMT_RSB), a VMM must not be allowed to
override the default behavior to intercept C0 transitions.
Use a module parameter to control the mitigation on processors that are
vulnerable to X86_BUG_SMT_RSB. If the processor is vulnerable to the
X86_BUG_SMT_RSB bug and the module parameter is set to mitigate the bug,
KVM will not allow the disabling of the HLT, MWAIT and CSTATE exits.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <4019348b5e07148eb4d593380a5f6713b93c9a16.1675956146.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be8de49bea upstream.
Certain AMD processors are vulnerable to a cross-thread return address
predictions bug. When running in SMT mode and one of the sibling threads
transitions out of C0 state, the other sibling thread could use return
target predictions from the sibling thread that transitioned out of C0.
The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB
when context switching to the idle thread. However, KVM allows a VMM to
prevent exiting guest mode when transitioning out of C0. A guest could
act maliciously in this situation, so create a new x86 BUG that can be
used to detect if the processor is vulnerable.
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <91cec885656ca1fcd4f0185ce403a53dd9edecb7.1675956146.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ad7bbf3db upstream.
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.
Happens that we faced a driver probe failure in the Steam Deck
recently, and the function drm_sched_fini() was called even without
its counter-part had been previously called, causing the following oops:
amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
<TASK>
amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
devm_drm_dev_init_release+0x49/0x70
[...]
To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counter-part.
Notice ideally we'd use sched.ready for that; such field is set as the latest
thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such
field - in the above oops for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended-up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].
[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/
Fixes: 067f44c8b4 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 190233164c upstream.
Commit 550b33cfd4 ("arm64: efi: Force the use of SetVirtualAddressMap()
on Altra machines") identifies the Altra family via the family field in
the type#1 SMBIOS record. eMAG and Altra Max machines are similarly
affected but not detected with the strict strcmp test.
The type1_family smbios string is not an entirely reliable means of
identifying systems with this issue as OEMs can, and do, use their own
strings for these fields. However, until we have a better solution,
capture the bulk of these systems by adding strcmp matching for "eMAG"
and "Altra Max".
Fixes: 550b33cfd4 ("arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines")
Cc: <stable@vger.kernel.org> # 6.1.x
Cc: Alexandru Elisei <alexandru.elisei@gmail.com>
Signed-off-by: Darren Hart <darren@os.amperecomputing.com>
Tested-by: Justin He <justin.he@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 462a8e08e0 upstream.
When we upgraded our kernel, we started seeing some page corruption like
the following consistently:
BUG: Bad page state in process ganesha.nfsd pfn:1304ca
page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x1304ca
flags: 0x17ffffc0000000()
raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000
raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P B O 5.10.158-1.nutanix.20221209.el7.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
Call Trace:
dump_stack+0x74/0x96
bad_page.cold+0x63/0x94
check_new_page_bad+0x6d/0x80
rmqueue+0x46e/0x970
get_page_from_freelist+0xcb/0x3f0
? _cond_resched+0x19/0x40
__alloc_pages_nodemask+0x164/0x300
alloc_pages_current+0x87/0xf0
skb_page_frag_refill+0x84/0x110
...
Sometimes, it would also show up as corruption in the free list pointer
and cause crashes.
After bisecting the issue, we found the issue started from commit
e320d3012d ("mm/page_alloc.c: fix freeing non-compound pages"):
if (put_page_testzero(page))
free_the_page(page, order);
else if (!PageHead(page))
while (order-- > 0)
free_the_page(page + (1 << order), order);
So the problem is the check PageHead is racy because at this point we
already dropped our reference to the page. So even if we came in with
compound page, the page can already be freed and PageHead can return
false and we will end up freeing all the tail pages causing double free.
Fixes: e320d3012d ("mm/page_alloc.c: fix freeing non-compound pages")
Link: https://lore.kernel.org/lkml/BYAPR02MB448855960A9656EEA81141FC94D99@BYAPR02MB4488.namprd02.prod.outlook.com/
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db370a8b9f upstream.
Let L1 and L2 be two spinlocks.
Let T1 be a task holding L1 and blocked on L2. T1, currently, is the top
waiter of L2.
Let T2 be the task holding L2.
Let T3 be a task trying to acquire L1.
The following events will lead to a state in which the wait queue of L2
isn't empty, but no task actually holds the lock.
T1 T2 T3
== == ==
spin_lock(L1)
| raw_spin_lock(L1->wait_lock)
| rtlock_slowlock_locked(L1)
| | task_blocks_on_rt_mutex(L1, T3)
| | | orig_waiter->lock = L1
| | | orig_waiter->task = T3
| | | raw_spin_unlock(L1->wait_lock)
| | | rt_mutex_adjust_prio_chain(T1, L1, L2, orig_waiter, T3)
spin_unlock(L2) | | | |
| rt_mutex_slowunlock(L2) | | | |
| | raw_spin_lock(L2->wait_lock) | | | |
| | wakeup(T1) | | | |
| | raw_spin_unlock(L2->wait_lock) | | | |
| | | | waiter = T1->pi_blocked_on
| | | | waiter == rt_mutex_top_waiter(L2)
| | | | waiter->task == T1
| | | | raw_spin_lock(L2->wait_lock)
| | | | dequeue(L2, waiter)
| | | | update_prio(waiter, T1)
| | | | enqueue(L2, waiter)
| | | | waiter != rt_mutex_top_waiter(L2)
| | | | L2->owner == NULL
| | | | wakeup(T1)
| | | | raw_spin_unlock(L2->wait_lock)
T1 wakes up
T1 != top_waiter(L2)
schedule_rtlock()
If the deadline of T1 is updated before the call to update_prio(), and the
new deadline is greater than the deadline of the second top waiter, then
after the requeue, T1 is no longer the top waiter, and the wrong task is
woken up which will then go back to sleep because it is not the top waiter.
This can be reproduced in PREEMPT_RT with stress-ng:
while true; do
stress-ng --sched deadline --sched-period 1000000000 \
--sched-runtime 800000000 --sched-deadline \
1000000000 --mmapfork 23 -t 20
done
A similar issue was pointed out by Thomas versus the cases where the top
waiter drops out early due to a signal or timeout, which is a general issue
for all regular rtmutex use cases, e.g. futex.
The problematic code is in rt_mutex_adjust_prio_chain():
// Save the top waiter before dequeue/enqueue
prerequeue_top_waiter = rt_mutex_top_waiter(lock);
rt_mutex_dequeue(lock, waiter);
waiter_update_prio(waiter, task);
rt_mutex_enqueue(lock, waiter);
// Lock has no owner?
if (!rt_mutex_owner(lock)) {
// Top waiter changed
----> if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
----> wake_up_state(waiter->task, waiter->wake_state);
This only takes the case into account where @waiter is the new top waiter
due to the requeue operation.
But it fails to handle the case where @waiter is not longer the top
waiter due to the requeue operation.
Ensure that the new top waiter is woken up so in all cases so it can take
over the ownerless lock.
[ tglx: Amend changelog, add Fixes tag ]
Fixes: c014ef69b3 ("locking/rtmutex: Add wake_state to rt_mutex_waiter")
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230117172649.52465-1-wander@redhat.com
Link: https://lore.kernel.org/r/20230202123020.14844-1-wander@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e53448e0a1 upstream.
The pid field corresponds to the result of gettid() in userspace.
However, userspace cannot reliably attribute PTE events to processes
with just the thread id. This patch allows userspace to easily
attribute PTE update events to specific processes by comparing this
field with the result of getpid().
For attributing events to specific threads, the thread id is also
contained in the common fields of each trace event.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ea31e2e62 upstream.
The RFI and STF security mitigation options can flip the
interrupt_exit_not_reentrant static branch condition concurrently with
the interrupt exit code which tests that branch.
Interrupt exit tests this condition to set MSR[EE|RI] for exit, then
again in the case a soft-masked interrupt is found pending, to recover
the MSR so the interrupt can be replayed before attempting to exit
again. If the condition changes between these two tests, the MSR and irq
soft-mask state will become corrupted, leading to warnings and possible
crashes. For example, if the branch is initially true then false,
MSR[EE] will be 0 but PACA_IRQ_HARD_DIS clear and EE may not get
enabled, leading to warnings in irq_64.c.
Fixes: 13799748b9 ("powerpc/64: use interrupt restart table to speed up return from interrupt")
Cc: stable@vger.kernel.org # v5.14+
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230206042240.92103-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c91d713630 upstream.
Commit 6e9f05dc66 ("libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE")
...updated MAX_STRUCT_PAGE_SIZE to account for sizeof(struct page)
potentially doubling in the case of CONFIG_KMSAN=y. Unfortunately this
doubles the amount of capacity stolen from user addressable capacity for
everyone, regardless of whether they are using the debug option. Revert
that change, mandate that MAX_STRUCT_PAGE_SIZE never exceed 64, but
allow for debug scenarios to proceed with creating debug sized page maps
with a compile option to support debug scenarios.
Note that this only applies to cases where the page map is permanent,
i.e. stored in a reservation of the pmem itself ("--map=dev" in "ndctl
create-namespace" terms). For the "--map=mem" case, since the allocation
is ephemeral for the lifespan of the namespace, there are no explicit
restriction. However, the implicit restriction, of having enough
available "System RAM" to store the page map for the typically large
pmem, still applies.
Fixes: 6e9f05dc66 ("libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE")
Cc: <stable@vger.kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Yu Zhao <yuzhao@google.com>
Link: https://lore.kernel.org/r/167467815773.463042.7022545814443036382.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ecfb9f4047 upstream.
The previous algorithm was pretty broken.
- The inner loop had a '(m > m_max)' condition, and the value of 'm'
would increase in each iteration;
- Each iteration would actually multiply 'm' by two, so it is not needed
to re-compute the whole equation at each iteration;
- It would loop until (m & 1) == 0, which means it would loop at most
once.
- The outer loop would divide the 'n' value by two at the end of each
iteration. This meant that for a 12 MHz parent clock and a 1.2 GHz
requested clock, it would first try n=12, then n=6, then n=3, then
n=1, none of which would work; the only valid value is n=2 in this
case.
Simplify this algorithm with a single for loop, which decrements 'n'
after each iteration, addressing all of the above problems.
Fixes: bdbfc02937 ("clk: ingenic: Add support for the JZ4760")
Cc: <stable@vger.kernel.org>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20221214123704.7305-1-paul@crapouillou.net
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fa4302d6d upstream.
Not all decoders have a reset callback.
The CXL specification allows a host bridge with a single root port to
have no explicit HDM decoders. Currently the region driver assumes there
are none. As such the CXL core creates a special pass through decoder
instance without a commit/reset callback.
Prior to this patch, the ->reset() callback was called unconditionally when
calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
1 Root Port, and one directly attached CXL type 3 device or multiple CXL
type 3 devices attached to downstream ports of a switch can cause a null
pointer dereference.
Before the fix, a kernel crash was observed when we destroy the region, and
a pass through decoder is reset.
The issue can be reproduced as below,
1) create a region with a CXL setup which includes a HB with a
single root port under which a memdev is attached directly.
2) destroy the region with cxl destroy-region regionX -f.
Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Cc: <stable@vger.kernel.org>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Link: https://lore.kernel.org/r/20221215170909.2650271-1-fan.ni@samsung.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54e5c00a4e upstream.
While checking Pin Assignments of the port and partner during probe, we
don't take into account whether the peripheral is a plug or receptacle.
This manifests itself in a mode entry failure on certain docks and
dongles with captive cables. For instance, the Startech.com Type-C to DP
dongle (Model #CDP2DP) advertises its DP VDO as 0x405. This would fail
the Pin Assignment compatibility check, despite it supporting
Pin Assignment C as a UFP.
Update the check to use the correct DP Pin Assign macros that
take the peripheral's receptacle bit into account.
Fixes: c1e5c2f0cb ("usb: typec: altmodes/displayport: correct pin assignment for UFP receptacles")
Cc: stable@vger.kernel.org
Reported-by: Diana Zigterman <dzigterman@chromium.org>
Signed-off-by: Prashant Malani <pmalani@chromium.org>
Link: https://lore.kernel.org/r/20230208205318.131385-1-pmalani@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f58d783fd upstream.
We have this check to make sure we don't accidentally add older devices
that may have disappeared and re-appeared with an older generation from
being added to an fs_devices (such as a replace source device). This
makes sense, we don't want stale disks in our file system. However for
single disks this doesn't really make sense.
I've seen this in testing, but I was provided a reproducer from a
project that builds btrfs images on loopback devices. The loopback
device gets cached with the new generation, and then if it is re-used to
generate a new file system we'll fail to mount it because the new fs is
"older" than what we have in cache.
Fix this by freeing the cache when closing the device for a single device
filesystem. This will ensure that the mount command passed device path is
scanned successfully during the next mount.
CC: stable@vger.kernel.org # 5.10+
Reported-by: Daan De Meyer <daandemeyer@fb.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 070d6dafac upstream.
These 'endpoint' tests from 'mptcp_join.sh' selftest start a transfer in
the background and check the status during this transfer.
Once the expected events have been recorded, there is no reason to wait
for the data transfer to finish. It can be stopped earlier to reduce the
execution time by more than half.
For these tests, the exchanged data were not verified. Errors, if any,
were ignored but that's fine, plenty of other tests are looking at that.
It is then OK to mute stderr now that we are sure errors will be printed
(and still ignored) because the transfer is stopped before the end.
Fixes: e274f71540 ("selftests: mptcp: add subflow limits test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1249db44a1 upstream.
Currently the subflow error report callback unconditionally
propagates the fallback subflow status to the owning msk.
If the msk is already orphaned, the above prevents the code
from correctly tracking the msk moving to the TCP_CLOSE state
and doing the appropriate cleanup.
All the above causes increasing memory usage over time and
sporadic self-tests failures.
There is a great deal of infrastructure trying to propagate
correctly the fallback subflow status to the owning mptcp socket,
e.g. via mptcp_subflow_eof() and subflow_sched_work_if_closed():
in the error propagation path we need only to cope with unorphaned
sockets.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/339
Fixes: 15cc104533 ("mptcp: deliver ssk errors to msk")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d4e85922e3 upstream.
If the peer closes all the existing subflows for a given
mptcp socket and later the application closes it, the current
implementation let it survive until the timewait timeout expires.
While the above is allowed by the protocol specification it
consumes resources for almost no reason and additionally
causes sporadic self-tests failures.
Let's move the mptcp socket to the TCP_CLOSE state when there are
no alive subflows at close time, so that the allocated resources
will be freed immediately.
Fixes: e16163b6e2 ("mptcp: refactor shutdown and close")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa5465aeca upstream.
When the network status is unstable, use-after-free may occur when
read data from the server.
BUG: KASAN: use-after-free in readpages_fill_pages+0x14c/0x7e0
Call Trace:
<TASK>
dump_stack_lvl+0x38/0x4c
print_report+0x16f/0x4a6
kasan_report+0xb7/0x130
readpages_fill_pages+0x14c/0x7e0
cifs_readv_receive+0x46d/0xa40
cifs_demultiplex_thread+0x121c/0x1490
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
</TASK>
Allocated by task 2535:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
__kasan_kmalloc+0x82/0x90
cifs_readdata_direct_alloc+0x2c/0x110
cifs_readdata_alloc+0x2d/0x60
cifs_readahead+0x393/0xfe0
read_pages+0x12f/0x470
page_cache_ra_unbounded+0x1b1/0x240
filemap_get_pages+0x1c8/0x9a0
filemap_read+0x1c0/0x540
cifs_strict_readv+0x21b/0x240
vfs_read+0x395/0x4b0
ksys_read+0xb8/0x150
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Freed by task 79:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
kasan_save_free_info+0x2e/0x50
__kasan_slab_free+0x10e/0x1a0
__kmem_cache_free+0x7a/0x1a0
cifs_readdata_release+0x49/0x60
process_one_work+0x46c/0x760
worker_thread+0x2a4/0x6f0
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
Last potentially related work creation:
kasan_save_stack+0x22/0x50
__kasan_record_aux_stack+0x95/0xb0
insert_work+0x2b/0x130
__queue_work+0x1fe/0x660
queue_work_on+0x4b/0x60
smb2_readv_callback+0x396/0x800
cifs_abort_connection+0x474/0x6a0
cifs_reconnect+0x5cb/0xa50
cifs_readv_from_socket.cold+0x22/0x6c
cifs_read_page_from_socket+0xc1/0x100
readpages_fill_pages.cold+0x2f/0x46
cifs_readv_receive+0x46d/0xa40
cifs_demultiplex_thread+0x121c/0x1490
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
The following function calls will cause UAF of the rdata pointer.
readpages_fill_pages
cifs_read_page_from_socket
cifs_readv_from_socket
cifs_reconnect
__cifs_reconnect
cifs_abort_connection
mid->callback() --> smb2_readv_callback
queue_work(&rdata->work) # if the worker completes first,
# the rdata is freed
cifs_readv_complete
kref_put
cifs_readdata_release
kfree(rdata)
return rdata->... # UAF in readpages_fill_pages()
Similarly, this problem also occurs in the uncache_fill_pages().
Fix this by adjusts the order of condition judgment in the return
statement.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>