1/ using RGA instead of MULTI_RGA in rk312x
2/ enable isp & ddrclk_sip_v2 in rk312x
3/ enable tve & motorcomm_phy in rk3128
Change-Id: I17793fca5025dc9e42aa6550c7c8cb16bad44529
Signed-off-by: Jianlong Wang <jianlong.wang@rock-chips.com>
This patch adds restart-poweroff node to power off board
by restart system which has no pmic, no power-hold.
Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: Ia26ae06bca4b3ab4dcf071cf8ff5c0dfeda26c80
This patch enable this config to allow power-off by restarting
system.
Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: I3662a3d88a92bc6d628446f0d99b3cd2b74eda63
The GPIO0_B4 to GPIO0_D7 on RK3588 SoCs have two registers to configure
the pinmux, one is in pmugrf and the other is in grf.
When the value in pmugrf is '8' means the final pinmux is in the grf
register, this patch fixes the condition.
Fixes: 9b97628093 ("pinctrl: rockchip: fix rk3588 pinmux get")
Signed-off-by: Jianqun Xu <jay.xu@rock-chips.com>
Change-Id: Ia9f0bec25fbf9696f05b45ff806f2853df56d1c3
This patch wrap register access by lock which guarantee the clk
enabled first. and remove the unused CLK status from CMN.
Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: Iba058ad08c71de0b118913216baec1886d8f3819
Changes in 5.10.158
btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino
btrfs: free btrfs_path before copying inodes to userspace
spi: spi-imx: Fix spi_bus_clk if requested clock is higher than input clock
btrfs: move QUOTA_ENABLED check to rescan_should_stop from btrfs_qgroup_rescan_worker
drm/display/dp_mst: Fix drm_dp_mst_add_affected_dsc_crtcs() return code
drm/amdgpu: update drm_display_info correctly when the edid is read
drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read"
btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit()
iio: health: afe4403: Fix oob read in afe4403_read_raw
iio: health: afe4404: Fix oob read in afe4404_[read|write]_raw
iio: light: rpr0521: add missing Kconfig dependencies
bpf, perf: Use subprog name when reporting subprog ksymbol
scripts/faddr2line: Fix regression in name resolution on ppc64le
ARM: at91: rm9200: fix usb device clock id
libbpf: Handle size overflow for ringbuf mmap
hwmon: (ltc2947) fix temperature scaling
hwmon: (ina3221) Fix shunt sum critical calculation
hwmon: (i5500_temp) fix missing pci_disable_device()
hwmon: (ibmpex) Fix possible UAF when ibmpex_register_bmc() fails
bpf: Do not copy spin lock field from user in bpf_selem_alloc
of: property: decrement node refcount in of_fwnode_get_reference_args()
ixgbevf: Fix resource leak in ixgbevf_init_module()
i40e: Fix error handling in i40e_init_module()
fm10k: Fix error handling in fm10k_init_module()
iavf: remove redundant ret variable
iavf: Fix error handling in iavf_init_module()
e100: switch from 'pci_' to 'dma_' API
e100: Fix possible use after free in e100_xmit_prepare
net/mlx5: Fix uninitialized variable bug in outlen_write()
net/mlx5e: Fix use-after-free when reverting termination table
can: sja1000_isa: sja1000_isa_probe(): add missing free_sja1000dev()
can: cc770: cc770_isa_probe(): add missing free_cc770dev()
qlcnic: fix sleep-in-atomic-context bugs caused by msleep
aquantia: Do not purge addresses when setting the number of rings
wifi: cfg80211: fix buffer overflow in elem comparison
wifi: cfg80211: don't allow multi-BSSID in S1G
wifi: mac8021: fix possible oob access in ieee80211_get_rate_duration
net: phy: fix null-ptr-deref while probe() failed
net: net_netdev: Fix error handling in ntb_netdev_init_module()
net/9p: Fix a potential socket leak in p9_socket_open
net: ethernet: nixge: fix NULL dereference
dsa: lan9303: Correct stat name
tipc: re-fetch skb cb after tipc_msg_validate
net: hsr: Fix potential use-after-free
afs: Fix fileserver probe RTT handling
net: tun: Fix use-after-free in tun_detach()
packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE
sctp: fix memory leak in sctp_stream_outq_migrate()
net: ethernet: renesas: ravb: Fix promiscuous mode after system resumed
hwmon: (coretemp) Check for null before removing sysfs attrs
hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new()
net/mlx5: DR, Fix uninitialized var warning
riscv: vdso: fix section overlapping under some conditions
error-injection: Add prompt for function error injection
tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"
nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
x86/bugs: Make sure MSR_SPEC_CTRL is updated properly upon resume from S3
pinctrl: intel: Save and restore pins in "direct IRQ" mode
net: stmmac: Set MAC's flow control register to reflect current settings
mmc: mmc_test: Fix removal of debugfs file
mmc: core: Fix ambiguous TRIM and DISCARD arg
mmc: sdhci-esdhc-imx: correct CQHCI exit halt state check
mmc: sdhci-sprd: Fix no reset data and command after voltage switch
mmc: sdhci: Fix voltage switch delay
drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame
drm/i915: Never return 0 if not all requests retired
tracing: Free buffers when a used dynamic event is removed
io_uring: don't hold uring_lock when calling io_run_task_work*
ASoC: ops: Fix bounds check for _sx controls
pinctrl: single: Fix potential division by zero
iommu/vt-d: Fix PCI device refcount leak in has_external_pci()
iommu/vt-d: Fix PCI device refcount leak in dmar_dev_scope_init()
parisc: Increase size of gcc stack frame check
xtensa: increase size of gcc stack frame check
parisc: Increase FRAME_WARN to 2048 bytes on parisc
Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled
selftests: net: add delete nexthop route warning test
selftests: net: fix nexthop warning cleanup double ip typo
ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference
ipv4: Fix route deletion when nexthop info is not specified
Revert "tty: n_gsm: avoid call of sleeping functions from atomic context"
x86/tsx: Add a feature bit for TSX control MSR support
x86/pm: Add enumeration check before spec MSRs save/restore setup
i2c: npcm7xx: Fix error handling in npcm_i2c_init()
i2c: imx: Only DMA messages with I2C_M_DMA_SAFE flag set
ACPI: HMAT: remove unnecessary variable initialization
ACPI: HMAT: Fix initiator registration for single-initiator systems
Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"
char: tpm: Protect tpm_pm_suspend with locks
Input: raydium_ts_i2c - fix memory leak in raydium_i2c_send()
block: unhash blkdev part inode when the part is deleted
proc: avoid integer type confusion in get_proc_long
proc: proc_skip_spaces() shouldn't think it is working on C strings
v4l2: don't fall back to follow_pfn() if pin_user_pages_fast() fails
ipc/sem: Fix dangling sem_array access in semtimedop race
Linux 5.10.158
Change-Id: I8db196fa535e260ed31965b52ed53ef0b6bd526b
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 83c10cc362 ]
The documentation for find_vpid() clearly states:
"Must be called with the tasklist_lock or rcu_read_lock() held."
Presently we do neither for find_vpid() instance in bpf_task_fd_query().
Add proper rcu_read_lock/unlock() to fix the issue.
Bug: 232939090
Fixes: 41bdc4b40e ("bpf: introduce bpf subcommand BPF_TASK_FD_QUERY")
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220912133855.1218900-1-lee@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I517b52d97a22b92be1a479279067ddb7b84efbba
[ Upstream commit b52be557e2 ]
When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must
happen:
a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
Fixes: 370b262c89 ("ipc/sem: avoid idr tree lookup for interrupted semop")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6647e76ab6 upstream.
The V4L2_MEMORY_USERPTR interface is long deprecated and shouldn't be
used (and is discouraged for any modern v4l drivers). And Seth Jenkins
points out that the fallback to VM_PFNMAP/VM_IO is fundamentally racy
and dangerous.
Note that it's not even a case that should trigger, since any normal
user pointer logic ends up just using the pin_user_pages_fast() call
that does the proper page reference counting. That's not the problem
case, only if you try to use special device mappings do you have any
issues.
Normally I'd just remove this during the merge window, but since Seth
pointed out the problem cases, we really want to know as soon as
possible if there are actually any users of this odd special case of a
legacy interface. Neither Hans nor Mauro seem to think that such
mis-uses of the old legacy interface should exist. As Mauro says:
"See, V4L2 has actually 4 streaming APIs:
- Kernel-allocated mmap (usually referred simply as just mmap);
- USERPTR mmap;
- read();
- dmabuf;
The USERPTR is one of the oldest way to use it, coming from V4L
version 1 times, and by far the least used one"
And Hans chimed in on the USERPTR interface:
"To be honest, I wouldn't mind if it goes away completely, but that's a
bit of a pipe dream right now"
but while removing this legacy interface entirely may be a pipe dream we
can at least try to remove the unlikely (and actively broken) case of
using special device mappings for USERPTR accesses.
This replaces it with a WARN_ONCE() that we can remove once we've
hopefully confirmed that no actual users exist.
NOTE! Longer term, this means that a 'struct frame_vector' only ever
contains proper page pointers, and all the games we have with converting
them to pages can go away (grep for 'frame_vector_to_pages()' and the
uses of 'vec->is_pfns'). But this is just the first step, to verify
that this code really is all dead, and do so as quickly as possible.
Reported-by: Seth Jenkins <sethjenkins@google.com>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bce9332220 upstream.
proc_skip_spaces() seems to think it is working on C strings, and ends
up being just a wrapper around skip_spaces() with a really odd calling
convention.
Instead of basing it on skip_spaces(), it should have looked more like
proc_skip_char(), which really is the exact same function (except it
skips a particular character, rather than whitespace). So use that as
inspiration, odd coding and all.
Now the calling convention actually makes sense and works for the
intended purpose.
Reported-and-tested-by: Kyle Zeng <zengyhkyle@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6cfaf34be upstream.
proc_get_long() is passed a size_t, but then assigns it to an 'int'
variable for the length. Let's not do that, even if our IO paths are
limited to MAX_RW_COUNT (exactly because of these kinds of type errors).
So do the proper test in the rigth type.
Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
v5.11 changes the blkdev lookup mechanism completely since commit
22ae8ce8b8 ("block: simplify bdev/disk lookup in blkdev_get"),
and small part of the change is to unhash part bdev inode when
deleting partition. Turns out this kind of change does fix one
nasty issue in case of BLOCK_EXT_MAJOR:
1) when one partition is deleted & closed, disk_put_part() is always
called before bdput(bdev), see blkdev_put(); so the part's devt can
be freed & re-used before the inode is dropped
2) then new partition with same devt can be created just before the
inode in 1) is dropped, then the old inode/bdev structurein 1) is
re-used for this new partition, this way causes use-after-free and
kernel panic.
It isn't possible to backport the whole big patchset of "merge struct
block_device and struct hd_struct v4" for addressing this issue.
https://lore.kernel.org/linux-block/20201128161510.347752-1-hch@lst.de/
So fixes it by unhashing part bdev in delete_partition(), and this way
is actually aligned with v5.11+'s behavior.
Reported-by: Shiwei Cui <cuishw@inspur.com>
Tested-by: Shiwei Cui <cuishw@inspur.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
commit 23393c6461 upstream.
Currently tpm transactions are executed unconditionally in
tpm_pm_suspend() function, which may lead to races with other tpm
accessors in the system.
Specifically, the hw_random tpm driver makes use of tpm_get_random(),
and this function is called in a loop from a kthread, which means it's
not frozen alongside userspace, and so can race with the work done
during system suspend:
tpm tpm0: tpm_transmit: tpm_recv: error -52
tpm tpm0: invalid TPM_STS.x 0xff, dumping stack for forensics
CPU: 0 PID: 1 Comm: init Not tainted 6.1.0-rc5+ #135
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
Call Trace:
tpm_tis_status.cold+0x19/0x20
tpm_transmit+0x13b/0x390
tpm_transmit_cmd+0x20/0x80
tpm1_pm_suspend+0xa6/0x110
tpm_pm_suspend+0x53/0x80
__pnp_bus_suspend+0x35/0xe0
__device_suspend+0x10f/0x350
Fix this by calling tpm_try_get_ops(), which itself is a wrapper around
tpm_chip_start(), but takes the appropriate mutex.
Signed-off-by: Jan Dabros <jsd@semihalf.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/all/c5ba47ef-393f-1fba-30bd-1230d1b4b592@suse.cz/
Cc: stable@vger.kernel.org
Fixes: e891db1a18 ("tpm: turn on TPM on suspend for TPM 1.x")
[Jason: reworked commit message, added metadata]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d9f15a9de4 ]
This reverts commit 232ccac1bd.
On the subject of suspend, the RISC-V SBI spec states:
This does not cover whether any given events actually reach the hart or
not, just what the hart will do if it receives an event. On PolarFire
SoC, and potentially other SiFive based implementations, events from the
RISC-V timer do reach a hart during suspend. This is not the case for the
implementation on the Allwinner D1 - there timer events are not received
during suspend.
To fix this, the CLOCK_EVT_FEAT_C3STOP (mis)feature was enabled for the
timer driver - but this has broken both RCU stall detection and timers
generally on PolarFire SoC and potentially other SiFive based
implementations.
If an AXI read to the PCIe controller on PolarFire SoC times out, the
system will stall, however, with CLOCK_EVT_FEAT_C3STOP active, the system
just locks up without RCU stalling:
io scheduler mq-deadline registered
io scheduler kyber registered
microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges:
microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000
microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
microchip-pcie 2000000000.pcie: axi read request error
microchip-pcie 2000000000.pcie: axi read timeout
microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
Freeing initrd memory: 7332K
Similarly issues were reported with clock_nanosleep() - with a test app
that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed
commit in place, the sleep times are rounded up to the next jiffy:
== CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 ==
Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179
Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193
Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000
Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000
Samples: 521 Samples: 521 Samples: 521 Samples: 521
Fortunately, the D1 has a second timer, which is "currently used in
preference to the RISC-V/SBI timer driver" so a revert here does not
hurt operation of D1 in its current form.
Ultimately, a DeviceTree property (or node) will be added to encode the
behaviour of the timers, but until then revert the addition of
CLOCK_EVT_FEAT_C3STOP.
Fixes: 232ccac1bd ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/
Link: https://lore.kernel.org/linux-riscv/bf6d3b1f-f703-4a25-833e-972a44a04114@sholland.org/
Link: https://lore.kernel.org/r/20221122121620.3522431-1-conor.dooley@microchip.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 48d4180939 ]
In a system with a single initiator node, and one or more memory-only
'target' nodes, the memory-only node(s) would fail to register their
initiator node correctly. i.e. in sysfs:
# ls /sys/devices/system/node/node0/access0/targets/
node0
Where as the correct behavior should be:
# ls /sys/devices/system/node/node0/access0/targets/
node0 node1
This happened because hmat_register_target_initiators() uses list_sort()
to sort the initiator list, but the sort comparision function
(initiator_cmp()) is overloaded to also set the node mask's bits.
In a system with a single initiator, the list is singular, and list_sort
elides the comparision helper call. Thus the node mask never gets set,
and the subsequent search for the best initiator comes up empty.
Add a new helper to consume the sorted initiator list, and generate the
nodemask, decoupling it from the overloaded initiator_cmp() comparision
callback. This prevents the singular list corner case naturally, and
makes the code easier to follow as well.
Cc: <stable@vger.kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Liu Shixin <liushixin2@huawei.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Chris Piper <chris.d.piper@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20221116-acpi_hmat_fix-v2-2-3712569be691@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d36678f790 ]
Recent changes to the DMA code has resulting in the IMX driver failing
I2C transfers when the buffer has been vmalloc. Only perform DMA
transfers if the message has the I2C_M_DMA_SAFE flag set, indicating
the client is providing a buffer which is DMA safe.
This is a minimal fix for stable. The I2C core provides helpers to
allocate a bounce buffer. For a fuller fix the master should make use
of these helpers.
Fixes: 4544b9f25e ("dma-mapping: Add vmap checks to dma_map_single()")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 145900cf91 ]
A problem about i2c-npcm7xx create debugfs failed is triggered with the
following log given:
[ 173.827310] debugfs: Directory 'npcm_i2c' with parent '/' already present!
The reason is that npcm_i2c_init() returns platform_driver_register()
directly without checking its return value, if platform_driver_register()
failed, it returns without destroy the newly created debugfs, resulting
the debugfs of npcm_i2c can never be created later.
npcm_i2c_init()
debugfs_create_dir() # create debugfs directory
platform_driver_register()
driver_register()
bus_add_driver()
priv = kzalloc(...) # OOM happened
# return without destroy debugfs directory
Fix by removing debugfs when platform_driver_register() returns error.
Fixes: 56a1485b10 ("i2c: npcm7xx: Add Nuvoton NPCM I2C controller driver")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Tali Perry <tali.perry@nuvoton.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 50bcceb772 upstream.
pm_save_spec_msr() keeps a list of all the MSRs which _might_ need
to be saved and restored at hibernate and resume. However, it has
zero awareness of CPU support for these MSRs. It mostly works by
unconditionally attempting to manipulate these MSRs and relying on
rdmsrl_safe() being able to handle a #GP on CPUs where the support is
unavailable.
However, it's possible for reads (RDMSR) to be supported for a given MSR
while writes (WRMSR) are not. In this case, msr_build_context() sees
a successful read (RDMSR) and marks the MSR as valid. Then, later, a
write (WRMSR) fails, producing a nasty (but harmless) error message.
This causes restore_processor_state() to try and restore it, but writing
this MSR is not allowed on the Intel Atom N2600 leading to:
unchecked MSR access error: WRMSR to 0x122 (tried to write 0x0000000000000002) \
at rIP: 0xffffffff8b07a574 (native_write_msr+0x4/0x20)
Call Trace:
<TASK>
restore_processor_state
x86_acpi_suspend_lowlevel
acpi_suspend_enter
suspend_devices_and_enter
pm_suspend.cold
state_store
kernfs_fop_write_iter
vfs_write
ksys_write
do_syscall_64
? do_syscall_64
? up_read
? lock_is_held_type
? asm_exc_page_fault
? lockdep_hardirqs_on
entry_SYSCALL_64_after_hwframe
To fix this, add the corresponding X86_FEATURE bit for each MSR. Avoid
trying to manipulate the MSR when the feature bit is clear. This
required adding a X86_FEATURE bit for MSRs that do not have one already,
but it's a small price to pay.
[ bp: Move struct msr_enumeration inside the only function that uses it. ]
[Pawan: Resolve build issue in backport]
Fixes: 73924ec4d5 ("x86/pm: Save the MSR validity status at context setup")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/c24db75d69df6e66c0465e13676ad3f2837a2ed8.1668539735.git.pawan.kumar.gupta@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aaa65d17ee upstream.
Support for the TSX control MSR is enumerated in MSR_IA32_ARCH_CAPABILITIES.
This is different from how other CPU features are enumerated i.e. via
CPUID. Currently, a call to tsx_ctrl_is_supported() is required for
enumerating the feature. In the absence of a feature bit for TSX control,
any code that relies on checking feature bits directly will not work.
In preparation for adding a feature bit check in MSR save/restore
during suspend/resume, set a new feature bit X86_FEATURE_TSX_CTRL when
MSR_IA32_TSX_CTRL is present.
[ bp: Remove tsx_ctrl_is_supported()]
[Pawan: Resolved conflicts in backport; Removed parts of commit message
referring to removed function tsx_ctrl_is_supported()]
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/de619764e1d98afbb7a5fa58424f1278ede37b45.1668539735.git.pawan.kumar.gupta@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d5082d386e ]
When the kernel receives a route deletion request from user space it
tries to delete a route that matches the route attributes specified in
the request.
If only prefix information is specified in the request, the kernel
should delete the first matching FIB alias regardless of its associated
FIB info. However, an error is currently returned when the FIB info is
backed by a nexthop object:
# ip nexthop add id 1 via 192.0.2.2 dev dummy10
# ip route add 198.51.100.0/24 nhid 1
# ip route del 198.51.100.0/24
RTNETLINK answers: No such process
Fix by matching on such a FIB info when legacy nexthop attributes are
not specified in the request. An earlier check already covers the case
where a nexthop ID is specified in the request.
Add tests that cover these flows. Before the fix:
# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes [FAIL]
Tests passed: 11
Tests failed: 1
After the fix:
# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes [ OK ]
Tests passed: 12
Tests failed: 0
No regressions in other tests:
# ./fib_nexthops.sh
...
Tests passed: 228
Tests failed: 0
# ./fib_tests.sh
...
Tests passed: 186
Tests failed: 0
Cc: stable@vger.kernel.org
Reported-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Jonas Gorski <jonas.gorski@gmail.com>
Fixes: 493ced1ac4 ("ipv4: Allow routes to use nexthop objects")
Fixes: 6bf92d70e6 ("net: ipv4: fix route with nexthop object delete warning")
Fixes: 61b91eb33a ("ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20221124210932.2470010-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 61b91eb33a ]
Gwangun Jung reported a slab-out-of-bounds access in fib_nh_match:
fib_nh_match+0xf98/0x1130 linux-6.0-rc7/net/ipv4/fib_semantics.c:961
fib_table_delete+0x5f3/0xa40 linux-6.0-rc7/net/ipv4/fib_trie.c:1753
inet_rtm_delroute+0x2b3/0x380 linux-6.0-rc7/net/ipv4/fib_frontend.c:874
Separate nexthop objects are mutually exclusive with the legacy
multipath spec. Fix fib_nh_match to return if the config for the
to be deleted route contains a multipath spec while the fib_info
is using a nexthop object.
Fixes: 493ced1ac4 ("ipv4: Allow routes to use nexthop objects")
Fixes: 6bf92d70e6 ("net: ipv4: fix route with nexthop object delete warning")
Reported-by: Gwangun Jung <exsociety@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: d5082d386e ("ipv4: Fix route deletion when nexthop info is not specified")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 692930cc43 ]
I made a stupid typo when adding the nexthop route warning selftest and
added both $IP and ip after it (double ip) on the cleanup path. The
error doesn't show up when running the test, but obviously it doesn't
cleanup properly after it.
Fixes: 392baa339c ("selftests: net: add delete nexthop route warning test")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: d5082d386e ("ipv4: Fix route deletion when nexthop info is not specified")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 392baa339c ]
Add a test which causes a WARNING on kernels which treat a
nexthop route like a normal route when comparing for deletion and a
device is specified. That is, a route is found but we hit a warning while
matching it. The warning is from fib_info_nh() in include/net/nexthop.h
because we run it on a fib_info with nexthop object. The call chain is:
inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a
nexthop fib_info and also with fc_oif set thus calling fib_info_nh on
the fib_info and triggering the warning).
Repro steps:
$ ip nexthop add id 12 via 172.16.1.3 dev veth1
$ ip route add 172.16.101.1/32 nhid 12
$ ip route delete 172.16.101.1/32 dev veth1
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: d5082d386e ("ipv4: Fix route deletion when nexthop info is not specified")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d192bec53 ]
PA-RISC uses a much bigger frame size for functions than other
architectures. So increase it to 2048 for 32- and 64-bit kernels.
This fixes e.g. a warning in lib/xxhash.c.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Stable-dep-of: 152fe65f30 ("Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 55b70eed81 ]
parisc uses much bigger frames than other architectures, so increase the
stack frame check value to avoid compiler warnings.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Stable-dep-of: 152fe65f30 ("Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4bedbbd782 ]
for_each_pci_dev() is implemented by pci_get_device(). The comment of
pci_get_device() says that it will increase the reference count for the
returned pci_dev and also decrease the reference count for the input
pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call
pci_dev_put() to decrease the reference count. Add the missing
pci_dev_put() for the error path to avoid reference count leak.
Fixes: 2e45528930 ("iommu/vt-d: Unify the way to process DMAR device scope array")
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Link: https://lore.kernel.org/r/20221121113649.190393-3-wangxiongfeng2@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afca9e19cc ]
for_each_pci_dev() is implemented by pci_get_device(). The comment of
pci_get_device() says that it will increase the reference count for the
returned pci_dev and also decrease the reference count for the input
pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call
pci_dev_put() to decrease the reference count. Add the missing
pci_dev_put() before 'return true' to avoid reference count leak.
Fixes: 89a6079df7 ("iommu/vt-d: Force IOMMU on for platform opt in hint")
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Link: https://lore.kernel.org/r/20221121113649.190393-2-wangxiongfeng2@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64c150339e ]
There is a possibility of dividing by zero due to the pcs->bits_per_pin
if pcs->fmask() also has a value of zero and called fls
from asm-generic/bitops/builtin-fls.h or arch/x86/include/asm/bitops.h.
The function pcs_probe() has the branch that assigned to fmask 0 before
pcs_allocate_pin_table() was called
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 4e7e8017a8 ("pinctrl: pinctrl-single: enhance to configure multiple pins of different modules")
Signed-off-by: Maxim Korotkov <korotkov.maxim.s@gmail.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20221117123034.27383-1-korotkov.maxim.s@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>