The page pinning causes CMA allocation long latency until the process
held the refcont is scheduled in and then released the refcount, which
introduces CMA allocaiton failure.
To overcome the issue, add vendor hooks to migrate the target page of
GUP out of CMA area.
Bug: 218731671
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I5ebf491531d0bfee96ebee83919f22e34ee1d41b
Make the large_file_test check if there is at least 3GB of free disk
space and skip the test if there is not. This is to make the tests pass
on a VM with limited disk size, now all functional tests are passing.
TAP version 13
1..26
ok 1 basic_file_ops_test
ok 2 cant_touch_index_test
ok 3 dynamic_files_and_data_test
ok 4 concurrent_reads_and_writes_test
ok 5 attribute_test
ok 6 work_after_remount_test
ok 7 child_procs_waiting_for_data_test
ok 8 multiple_providers_test
ok 9 hash_tree_test
ok 10 read_log_test
ok 11 get_blocks_test
ok 12 get_hash_blocks_test
ok 13 large_file_test
ok 14 mapped_file_test
ok 15 compatibility_test
ok 16 data_block_count_test
ok 17 hash_block_count_test
ok 18 per_uid_read_timeouts_test
ok 19 inotify_test
ok 20 verity_test
ok 21 enable_verity_test
ok 22 mmap_test
ok 23 truncate_test
ok 24 stat_test
ok 25 sysfs_test
Error mounting fs.: File exists
Error mounting fs.: File exists
ok 26 sysfs_rename_test
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I2260e2b314429251070d0163c70173f237f86476
Syzbot recently found a number of issues related to incremental-fs
(see bug numbers below). All have to do with the fact that incr-fs
allows mounts of the same source and target multiple times.
This is a design decision and the user space component "Data Loader"
expects this to work for app re-install use case.
The mounting depth needs to be controlled, however, and only allowed
to be two levels deep. In case of more than two mount attempts the
driver needs to return an error.
In case of the issues listed below the common pattern is that the
reproducer calls:
mount("./file0", "./file0", "incremental-fs", 0, NULL)
many times and then invokes a file operation like chmod, setxattr,
or open on the ./file0. This causes a recursive call for all the
mounted instances, which eventually causes a stack overflow and
a kernel crash:
BUG: stack guard page was hit at ffffc90000c0fff8
kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN
This change also cleans up the mount error path to properly clean
allocated resources and call deactivate_locked_super(), which
causes the incfs_kill_sb() to be called, where the sb is freed.
Bug: 211066171
Bug: 213140206
Bug: 213215835
Bug: 211914587
Bug: 211213635
Bug: 213137376
Bug: 211161296
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I08d9b545a2715423296bf4beb67bdbbed78d1be1
Without it incfs/incfs_perf runtime fails in format_signature:
malloc(): invalid size (unsorted)
Aborted
When compiled with gcc version 11.2.0.
Also add check for NULL after the malloc, and remove unneeded
space for uint32_t in signing_section.
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I62b775140e4b89f75335cbd65665cf6a3e0fe964
The FROMLIST patches merged in aosp/1974918 that add vmalloc support to
KASAN now have a few fixes staged in linux-next/akpm. Sync the changes.
Bug: 217222520
Bug: 222221793
Change-Id: I33dd30e3834a4d1bb8eac611b350004afdb08a74
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Changes in 5.10.107
Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"
sctp: fix the processing for INIT chunk
xfrm: Check if_id in xfrm_migrate
xfrm: Fix xfrm migrate issues when address family changes
arm64: dts: rockchip: fix rk3399-puma eMMC HS400 signal integrity
arm64: dts: rockchip: reorder rk3399 hdmi clocks
arm64: dts: agilex: use the compatible "intel,socfpga-agilex-hsotg"
ARM: dts: rockchip: reorder rk322x hmdi clocks
ARM: dts: rockchip: fix a typo on rk3288 crypto-controller
mac80211: refuse aggregations sessions before authorized
MIPS: smp: fill in sibling and core maps earlier
ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when fully ready
atm: firestream: check the return value of ioremap() in fs_init()
iwlwifi: don't advertise TWT support
drm/vrr: Set VRR capable prop only if it is attached to connector
nl80211: Update bss channel on channel switch for P2P_CLIENT
tcp: make tcp_read_sock() more robust
sfc: extend the locking on mcdi->seqno
kselftest/vm: fix tests build with old libc
io_uring: return back safer resurrect
arm64: kvm: Fix copy-and-paste error in bhb templates for v5.10 stable
Linux 5.10.107
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib5977657dd66c90a01694f04ee85d72c3a22bebb
KVM's infrastructure for spectre mitigations in the vectors in v5.10 and
earlier is different, it uses templates which are used to build a set of
vectors at runtime.
There are two copy-and-paste errors in the templates: __spectre_bhb_loop_k24
should loop 24 times and __spectre_bhb_loop_k32 32.
Fix these.
Reported-by: Pavel Machek <pavel@denx.de>
Link: https://lore.kernel.org/all/20220310234858.GB16308@amd/
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f70865db5f upstream.
Revert of revert of "io_uring: wait potential ->release() on resurrect",
which adds a helper for resurrect not racing completion reinit, as was
removed because of a strange bug with no clear root or link to the
patch.
Was improved, instead of rcu_synchronize(), just wait_for_completion()
because we're at 0 refs and it will happen very shortly. Specifically
use non-interruptible version to ignore all pending signals that may
have ended prior interruptible wait.
This reverts commit cb5e1b8130.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7a080c20f686d026efade810b116b72f88abaff9.1618101759.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b773827e36 ]
The error message when I build vm tests on debian10 (GLIBC 2.28):
userfaultfd.c: In function `userfaultfd_pagemap_test':
userfaultfd.c:1393:37: error: `MADV_PAGEOUT' undeclared (first use
in this function); did you mean `MADV_RANDOM'?
if (madvise(area_dst, test_pgsize, MADV_PAGEOUT))
^~~~~~~~~~~~
MADV_RANDOM
This patch includes these newer definitions from UAPI linux/mman.h, is
useful to fix tests build on systems without these definitions in glibc
sys/mman.h.
Link: https://lkml.kernel.org/r/20220227055330.43087-2-zhouchengming@bytedance.com
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f1fb205efb ]
seqno could be read as a stale value outside of the lock. The lock is
already acquired to protect the modification of seqno against a possible
race condition. Place the reading of this value also inside this locking
to protect it against a possible race condition.
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e50b88c4f0 ]
The wdev channel information is updated post channel switch only for
the station mode and not for the other modes. Due to this, the P2P client
still points to the old value though it moved to the new channel
when the channel change is induced from the P2P GO.
Update the bss channel after CSA channel switch completion for P2P client
interface as well.
Signed-off-by: Sreeramya Soratkal <quic_ssramya@quicinc.com>
Link: https://lore.kernel.org/r/1646114600-31479-1-git-send-email-quic_ssramya@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11c57c3ba9 ]
Resending this to properly add it to the patch tracker - thanks for letting
me know, Arnd :)
When ARM is enabled, and BITREVERSE is disabled,
Kbuild gives the following warning:
WARNING: unmet direct dependencies detected for HAVE_ARCH_BITREVERSE
Depends on [n]: BITREVERSE [=n]
Selected by [y]:
- ARM [=y] && (CPU_32v7M [=n] || CPU_32v7 [=y]) && !CPU_32v6 [=n]
This is because ARM selects HAVE_ARCH_BITREVERSE
without selecting BITREVERSE, despite
HAVE_ARCH_BITREVERSE depending on BITREVERSE.
This unmet dependency bug was found by Kismet,
a static analysis tool for Kconfig. Please advise if this
is not the appropriate solution.
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2703def33 ]
After enabling CONFIG_SCHED_CORE (landed during 5.14 cycle),
2-core 2-thread-per-core interAptiv (CPS-driven) started emitting
the following:
[ 0.025698] CPU1 revision is: 0001a120 (MIPS interAptiv (multi))
[ 0.048183] ------------[ cut here ]------------
[ 0.048187] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6025 sched_core_cpu_starting+0x198/0x240
[ 0.048220] Modules linked in:
[ 0.048233] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc3+ #35 b7b319f24073fd9a3c2aa7ad15fb7993eec0b26f
[ 0.048247] Stack : 817f0000 00000004 327804c8 810eb050 00000000 00000004 00000000 c314fdd1
[ 0.048278] 830cbd64 819c0000 81800000 817f0000 83070bf4 00000001 830cbd08 00000000
[ 0.048307] 00000000 00000000 815fcbc4 00000000 00000000 00000000 00000000 00000000
[ 0.048334] 00000000 00000000 00000000 00000000 817f0000 00000000 00000000 817f6f34
[ 0.048361] 817f0000 818a3c00 817f0000 00000004 00000000 00000000 4dc33260 0018c933
[ 0.048389] ...
[ 0.048396] Call Trace:
[ 0.048399] [<8105a7bc>] show_stack+0x3c/0x140
[ 0.048424] [<8131c2a0>] dump_stack_lvl+0x60/0x80
[ 0.048440] [<8108b5c0>] __warn+0xc0/0xf4
[ 0.048454] [<8108b658>] warn_slowpath_fmt+0x64/0x10c
[ 0.048467] [<810bd418>] sched_core_cpu_starting+0x198/0x240
[ 0.048483] [<810c6514>] sched_cpu_starting+0x14/0x80
[ 0.048497] [<8108c0f8>] cpuhp_invoke_callback_range+0x78/0x140
[ 0.048510] [<8108d914>] notify_cpu_starting+0x94/0x140
[ 0.048523] [<8106593c>] start_secondary+0xbc/0x280
[ 0.048539]
[ 0.048543] ---[ end trace 0000000000000000 ]---
[ 0.048636] Synchronize counters for CPU 1: done.
...for each but CPU 0/boot.
Basic debug printks right before the mentioned line say:
[ 0.048170] CPU: 1, smt_mask:
So smt_mask, which is sibling mask obviously, is empty when entering
the function.
This is critical, as sched_core_cpu_starting() calculates
core-scheduling parameters only once per CPU start, and it's crucial
to have all the parameters filled in at that moment (at least it
uses cpu_smt_mask() which in fact is `&cpu_sibling_map[cpu]` on
MIPS).
A bit of debugging led me to that set_cpu_sibling_map() performing
the actual map calculation, was being invocated after
notify_cpu_start(), and exactly the latter function starts CPU HP
callback round (sched_core_cpu_starting() is basically a CPU HP
callback).
While the flow is same on ARM64 (maps after the notifier, although
before calling set_cpu_online()), x86 started calculating sibling
maps earlier than starting the CPU HP callbacks in Linux 4.14 (see
[0] for the reference). Neither me nor my brief tests couldn't find
any potential caveats in calculating the maps right after performing
delay calibration, but the WARN splat is now gone.
The very same debug prints now yield exactly what I expected from
them:
[ 0.048433] CPU: 1, smt_mask: 0-1
[0] https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=76ce7cfe35ef
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 268a491aeb ]
The DWC2 USB controller on the Agilex platform does not support clock
gating, so use the chip specific "intel,socfpga-agilex-hsotg"
compatible.
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e03c3bba35 ]
xfrm_migrate cannot handle address family change of an xfrm_state.
The symptons are the xfrm_state will be migrated to a wrong address,
and sending as well as receiving packets wil be broken.
This commit fixes it by breaking the original xfrm_state_clone
method into two steps so as to update the props.family before
running xfrm_init_state. As the result, xfrm_state's inner mode,
outer mode, type and IP header length in xfrm_state_migrate can
be updated with the new address family.
Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/1885354
Signed-off-by: Yan Yan <evitayan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c1aca3080e ]
This patch enables distinguishing SAs and SPs based on if_id during
the xfrm_migrate flow. This ensures support for xfrm interfaces
throughout the SA/SP lifecycle.
When there are multiple existing SPs with the same direction,
the same xfrm_selector and different endpoint addresses,
xfrm_migrate might fail with ENODATA.
Specifically, the code path for performing xfrm_migrate is:
Stage 1: find policy to migrate with
xfrm_migrate_policy_find(sel, dir, type, net)
Stage 2: find and update state(s) with
xfrm_migrate_state_find(mp, net)
Stage 3: update endpoint address(es) of template(s) with
xfrm_policy_migrate(pol, m, num_migrate)
Currently "Stage 1" always returns the first xfrm_policy that
matches, and "Stage 3" looks for the xfrm_tmpl that matches the
old endpoint address. Thus if there are multiple xfrm_policy
with same selector, direction, type and net, "Stage 1" might
rertun a wrong xfrm_policy and "Stage 3" will fail with ENODATA
because it cannot find a xfrm_tmpl with the matching endpoint
address.
The fix is to allow userspace to pass an if_id and add if_id
to the matching rule in Stage 1 and Stage 2 since if_id is a
unique ID for xfrm_policy and xfrm_state. For compatibility,
if_id will only be checked if the attribute is set.
Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/1668886
Signed-off-by: Yan Yan <evitayan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit eae5783908 upstream.
This patch fixes the problems below:
1. In non-shutdown_ack_sent states: in sctp_sf_do_5_1B_init() and
sctp_sf_do_5_2_2_dupinit():
chunk length check should be done before any checks that may cause
to send abort, as making packet for abort will access the init_tag
from init_hdr in sctp_ootb_pkt_new().
2. In shutdown_ack_sent state: in sctp_sf_do_9_2_reshutack():
The same checks as does in sctp_sf_do_5_2_2_dupinit() is needed
for sctp_sf_do_9_2_reshutack().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When memory is tight, system may start to compact memory for large
continuous memory demands. If one process tries to lock a memory page
that is being locked and isolated for compaction, it may wait a long time
or even forever. This is because compaction will perform non-atomic
PG_Isolated clear while holding page lock, this may overwrite PG_waiters
set by the process that can't obtain the page lock and add itself to the
waiting queue to wait for the lock to be unlocked.
CPU1 CPU2
lock_page(page); (successful)
lock_page(); (failed)
__ClearPageIsolated(page); SetPageWaiters(page) (may be overwritten)
unlock_page(page);
The solution is to not perform non-atomic operation on page flags while
holding page lock.
Link: https://lkml.kernel.org/r/20220315030515.20263-1-andrew.yang@mediatek.com
Signed-off-by: andrew.yang <andrew.yang@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Vlastimil Babka" <vbabka@suse.cz>
Cc: David Howells <dhowells@redhat.com>
Cc: "William Kucharski" <william.kucharski@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
(cherry picked from commit 48911e41ddee4fe113bf1e4303dda1a413b169c9
https://github.com/hnaz/linux-mm.git)
Bug: 225086204
Change-Id: Ia72bd22fdf2fc0476fda7abb769e6c879f427b2b
Changes in 5.10.106
ARM: boot: dts: bcm2711: Fix HVS register range
clk: qcom: gdsc: Add support to update GDSC transition delay
HID: vivaldi: fix sysfs attributes leak
arm64: dts: armada-3720-turris-mox: Add missing ethernet0 alias
tipc: fix kernel panic when enabling bearer
mISDN: Remove obsolete PIPELINE_DEBUG debugging information
mISDN: Fix memory leak in dsp_pipeline_build()
virtio-blk: Don't use MAX_DISCARD_SEGMENTS if max_discard_seg is zero
isdn: hfcpci: check the return value of dma_set_mask() in setup_hw()
net: qlogic: check the return value of dma_alloc_coherent() in qed_vf_hw_prepare()
esp: Fix BEET mode inter address family tunneling on GSO
qed: return status of qed_iov_get_link
drm/sun4i: mixer: Fix P010 and P210 format numbers
net: dsa: mt7530: fix incorrect test in mt753x_phylink_validate()
ARM: dts: aspeed: Fix AST2600 quad spi group
i40e: stop disabling VFs due to PF error responses
ice: stop disabling VFs due to PF error responses
ice: Align macro names to the specification
ice: Remove unnecessary checker loop
ice: Rename a couple of variables
ice: Fix curr_link_speed advertised speed
ethernet: Fix error handling in xemaclite_of_probe
tipc: fix incorrect order of state message data sanity check
net: ethernet: ti: cpts: Handle error for clk_enable
net: ethernet: lpc_eth: Handle error for clk_enable
ax25: Fix NULL pointer dereference in ax25_kill_by_device
net/mlx5: Fix size field in bufferx_reg struct
net/mlx5: Fix a race on command flush flow
net/mlx5e: Lag, Only handle events from highest priority multipath entry
NFC: port100: fix use-after-free in port100_send_complete
selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
gpio: ts4900: Do not set DAT and OE together
gianfar: ethtool: Fix refcount leak in gfar_get_ts_info
net: phy: DP83822: clear MISR2 register to disable interrupts
sctp: fix kernel-infoleak for SCTP sockets
net: bcmgenet: Don't claim WOL when its not available
selftests/bpf: Add test for bpf_timer overwriting crash
spi: rockchip: Fix error in getting num-cs property
spi: rockchip: terminate dma transmission when slave abort
net-sysfs: add check for netdevice being present to speed_show
hwmon: (pmbus) Clear pmbus fault/warning bits after read
gpio: Return EPROBE_DEFER if gc->to_irq is NULL
Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"
Revert "xen-netback: Check for hotplug-status existence before watching"
ipv6: prevent a possible race condition with lifetimes
tracing: Ensure trace buffer is at least 4096 bytes large
selftest/vm: fix map_fixed_noreplace test failure
selftests/memfd: clean up mapping in mfd_fail_write
ARM: Spectre-BHB: provide empty stub for non-config
fuse: fix pipe buffer lifetime for direct_io
staging: rtl8723bs: Fix access-point mode deadlock
staging: gdm724x: fix use after free in gdm_lte_rx()
net: macb: Fix lost RX packet wakeup race in NAPI receive
mmc: meson: Fix usage of meson_mmc_post_req()
riscv: Fix auipc+jalr relocation range checks
arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0
virtio: unexport virtio_finalize_features
virtio: acknowledge all features before access
watch_queue, pipe: Free watchqueue state after clearing pipe ring
watch_queue: Fix to release page in ->release()
watch_queue: Fix to always request a pow-of-2 pipe ring size
watch_queue: Fix the alloc bitmap size to reflect notes allocated
watch_queue: Free the alloc bitmap when the watch_queue is torn down
watch_queue: Fix lack of barrier/sync/lock between post and read
watch_queue: Make comment about setting ->defunct more accurate
x86/boot: Fix memremap of setup_indirect structures
x86/boot: Add setup_indirect support in early_memremap_is_setup_data()
x86/traps: Mark do_int3() NOKPROBE_SYMBOL
ext4: add check to prevent attempting to resize an fs with sparse_super2
ARM: fix Thumb2 regression with Spectre BHB
watch_queue: Fix filter limit check
Linux 5.10.106
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: If942a4fbbd980ec5a5275d2b66e983f70caa837c
Changes in 5.10.105
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
x86/speculation: Add eIBRS + Retpoline options
Documentation/hw-vuln: Update spectre doc
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
x86/speculation: Use generic retpoline by default on AMD
x86/speculation: Update link to AMD speculation whitepaper
x86/speculation: Warn about Spectre v2 LFENCE mitigation
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
ARM: report Spectre v2 status through sysfs
ARM: early traps initialisation
ARM: use LOADADDR() to get load address of sections
ARM: Spectre-BHB workaround
ARM: include unprivileged BPF status in Spectre V2 reporting
arm64: cputype: Add CPU implementor & types for the Apple M1 cores
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
arm64: Add Cortex-X2 CPU part definition
arm64: Add Cortex-A510 CPU part definition
arm64: Add HWCAP for self-synchronising virtual counter
arm64: add ID_AA64ISAR2_EL1 sys register
arm64: cpufeature: add HWCAP for FEAT_AFP
arm64: cpufeature: add HWCAP for FEAT_RPRES
arm64: entry.S: Add ventry overflow sanity checks
arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
arm64: entry: Make the trampoline cleanup optional
arm64: entry: Free up another register on kpti's tramp_exit path
arm64: entry: Move the trampoline data page before the text page
arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
arm64: entry: Don't assume tramp_vectors is the start of the vectors
arm64: entry: Move trampoline macros out of ifdef'd section
arm64: entry: Make the kpti trampoline's kpti sequence optional
arm64: entry: Allow the trampoline text to occupy multiple pages
arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
arm64: entry: Add vectors that have the bhb mitigation sequences
arm64: entry: Add macro for reading symbol addresses from the trampoline
arm64: Add percpu vectors for EL1
arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
arm64: Mitigate spectre style branch history side channels
KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
arm64: Use the clearbhb instruction in mitigations
arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
ARM: fix build error when BPF_SYSCALL is disabled
ARM: fix co-processor register typo
ARM: Do not use NOCROSSREFS directive with ld.lld
ARM: fix build warning in proc-v7-bugs.c
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
xen/grant-table: add gnttab_try_end_foreign_access()
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
xen/gntalloc: don't use gnttab_query_foreign_access()
xen: remove gnttab_query_foreign_access()
xen/9p: use alloc/free_pages_exact()
xen/pvcalls: use alloc/free_pages_exact()
xen/gnttab: fix gnttab_end_foreign_access() without page specified
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Revert "ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE"
Linux 5.10.105
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5aaa201c1d982930fbd7eaa8b035fbad4fcec983
commit 58c9a5060c upstream.
The mitigations for Spectre-BHB are only applied when an exception is
taken from user-space. The mitigation status is reported via the spectre_v2
sysfs vulnerabilities file.
When unprivileged eBPF is enabled the mitigation in the exception vectors
can be avoided by an eBPF program.
When unprivileged eBPF is enabled, print a warning and report vulnerable
via the sysfs vulnerabilities file.
Bug: 215557547
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib4a7b967d820684dbb18eafacdbaa22794469040
Future CPUs may implement a clearbhb instruction that is sufficient
to mitigate SpectreBHB. CPUs that implement this instruction, but
not CSV2.3 must be affected by Spectre-BHB.
Add support to use this instruction as the BHB mitigation on CPUs
that support it. The instruction is in the hint space, so it will
be treated by a NOP as older CPUs.
Bug: 215557547
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 228a26b912)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie233d92acfaed7b44ac255472ddb06367660384d
commit a5905d6af4 upstream.
KVM allows the guest to discover whether the ARCH_WORKAROUND SMCCC are
implemented, and to preserve that state during migration through its
firmware register interface.
Add the necessary boiler plate for SMCCC_ARCH_WORKAROUND_3.
Bug: 215557547
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9e0964b54727930bbda4f089c91075447b046c91
commit 558c303c97 upstream.
Speculation attacks against some high-performance processors can
make use of branch history to influence future speculation.
When taking an exception from user-space, a sequence of branches
or a firmware call overwrites or invalidates the branch history.
The sequence of branches is added to the vectors, and should appear
before the first indirect branch. For systems using KPTI the sequence
is added to the kpti trampoline where it has a free register as the exit
from the trampoline is via a 'ret'. For systems not using KPTI, the same
register tricks are used to free up a register in the vectors.
For the firmware call, arch-workaround-3 clobbers 4 registers, so
there is no choice but to save them to the EL1 stack. This only happens
for entry from EL0, so if we take an exception due to the stack access,
it will not become re-entrant.
For KVM, the existing branch-predictor-hardening vectors are used.
When a spectre version of these vectors is in use, the firmware call
is sufficient to mitigate against Spectre-BHB. For the non-spectre
versions, the sequence of branches is added to the indirect vector.
Bug: 215557547
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Change-Id: I7d1f5a9767d1dbc9e6ef363ca3bf7bffe91c402c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
When building arm64 defconfig + CONFIG_LTO_CLANG_{FULL,THIN}=y after
commit 558c303c97 ("arm64: Mitigate spectre style branch history side
channels"), the following error occurs:
<instantiation>:4:2: error: invalid fixup for movz/movk instruction
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3
^
Marc figured out that moving "#include <linux/init.h>" in
include/linux/arm-smccc.h into a !__ASSEMBLY__ block resolves it. The
full include chain with CONFIG_LTO=y from include/linux/arm-smccc.h:
include/linux/init.h
include/linux/compiler.h
arch/arm64/include/asm/rwonce.h
arch/arm64/include/asm/alternative-macros.h
arch/arm64/include/asm/assembler.h
The asm/alternative-macros.h include in asm/rwonce.h only happens when
CONFIG_LTO is set, which ultimately casues asm/assembler.h to be
included before the definition of ARM_SMCCC_ARCH_WORKAROUND_3. As a
result, the preprocessor does not expand ARM_SMCCC_ARCH_WORKAROUND_3 in
__mitigate_spectre_bhb_fw, which results in the error above.
Avoid this problem by just avoiding the CONFIG_LTO=y __READ_ONCE() block
in asm/rwonce.h with assembly files, as nothing in that block is useful
to assembly files, which allows ARM_SMCCC_ARCH_WORKAROUND_3 to be
properly expanded with CONFIG_LTO=y builds.
Bug: 215557547
Fixes: e35123d83e ("arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y")
Cc: <stable@vger.kernel.org> # 5.11.x
Link: https://lore.kernel.org/r/20220309155716.3988480-1-maz@kernel.org/
Reported-by: Marc Zyngier <maz@kernel.org>
Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220309191633.2307110-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 52c9f93a9c)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3beec784b16c9b094bc2e86466f2dadddc3ec6a7
CPUs vulnerable to Spectre-BHB either need to make an SMC-CC firmware
call from the vectors, or run a sequence of branches. This gets added
to the hyp vectors. If there is no support for arch-workaround-1 in
firmware, the indirect vector will be used.
kvm_init_vector_slots() only initialises the two indirect slots if
the platform is vulnerable to Spectre-v3a. pKVM's hyp_map_vectors()
only initialises __hyp_bp_vect_base if the platform is vulnerable to
Spectre-v3a.
As there are about to more users of the indirect vectors, ensure
their entries in hyp_spectre_vector_selector[] are always initialised,
and __hyp_bp_vect_base defaults to the regular VA mapping.
The Spectre-v3a check is moved to a helper
kvm_system_needs_idmapped_vectors(), and merged with the code
that creates the hyp mappings.
Bug: 215557547
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 5bdf343760)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6ad747d012e2d9965a9450e62bebd5bfe1605316
commit dee435be76 upstream.
Speculation attacks against some high-performance processors can
make use of branch history to influence future speculation as part of
a spectre-v2 attack. This is not mitigated by CSV2, meaning CPUs that
previously reported 'Not affected' are now moderately mitigated by CSV2.
Update the value in /sys/devices/system/cpu/vulnerabilities/spectre_v2
to also show the state of the BHB mitigation.
Bug: 215557547
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3a6aeb077869dc310d420c27f6752dfeacf438a6
commit bd09128d16 upstream.
The Spectre-BHB workaround adds a firmware call to the vectors. This
is needed on some CPUs, but not others. To avoid the unaffected CPU in
a big/little pair from making the firmware call, create per cpu vectors.
The per-cpu vectors only apply when returning from EL0.
Systems using KPTI can use the canonical 'full-fat' vectors directly at
EL1, the trampoline exit code will switch to this_cpu_vector on exit to
EL0. Systems not using KPTI should always use this_cpu_vector.
this_cpu_vector will point at a vector in tramp_vecs or
__bp_harden_el1_vectors, depending on whether KPTI is in use.
Bug: 215557547
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie96a6896e26d54d09143ec668a0bdfbf21fa2618
commit b28a8eebe8 upstream.
The trampoline code needs to use the address of symbols in the wider
kernel, e.g. vectors. PC-relative addressing wouldn't work as the
trampoline code doesn't run at the address the linker expected.
tramp_ventry uses a literal pool, unless CONFIG_RANDOMIZE_BASE is
set, in which case it uses the data page as a literal pool because
the data page can be unmapped when running in user-space, which is
required for CPUs vulnerable to meltdown.
Pull this logic out as a macro, instead of adding a third copy
of it.
Bug: 215557547
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1d11f54bb867a6312069bc3fae6f4ceb4ab1fd20
Properly handle the async case. The existing bpf operations will likely
need to be reworked. Given that they don't allow altering anything as
is, this change just incrementally moves us in the right direction.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Test: generic/467 and fuse_test
Bug: 217570523
Change-Id: I31c0b48bf3d674efecad4bff4ea8b482c4e7da45
Adds support for lseek via fuse-bpf
Bug: 224855060
Test: bpf_test_lseek
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: Ic282940d53b9bb44a291cb3a5dfe09847b4e5c9a