In order to simplify the injection of exceptions in the host in pkvm
context, let's factor out of enter_exception64() the code calculating
the exception offset from VBAR_EL1 and the cpsr.
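For illustration, the offset calculation can be sketched in plain C as
below. This is a minimal sketch that assumes the exception targets EL1h.
The helper name exception_entry() and the MODE_*/VEC_* macro names are
hypothetical and are not the identifiers used by this patch. Only the
vector table layout is taken from the Arm architecture, and the cpsr
calculation is left out.

```c
#include <stdint.h>

/* Vector groups within the table that VBAR_EL1 points to. */
#define VEC_CURRENT_EL_SP_EL0	0x000u
#define VEC_CURRENT_EL_SP_ELX	0x200u
#define VEC_LOWER_EL_AARCH64	0x400u
#define VEC_LOWER_EL_AARCH32	0x600u

/* Entry offsets inside each group. */
enum exc_type {
	EXC_SYNC   = 0x000,
	EXC_IRQ    = 0x080,
	EXC_FIQ    = 0x100,
	EXC_SERROR = 0x180,
};

/* PSTATE mode field values (names are assumptions for this sketch). */
#define MODE_EL1H	0x05u
#define MODE_SP_SEL	0x01u	/* set: SP_ELx, clear: SP_EL0 */
#define MODE_AARCH32	0x10u
#define MODE_FIELD	0x1fu

/*
 * Hypothetical helper: address of the vector entry for an exception of
 * the given type, taken to EL1h from the mode recorded in cpsr.
 */
static uint64_t exception_entry(uint64_t vbar, uint32_t cpsr,
				enum exc_type type)
{
	uint32_t mode = cpsr & MODE_FIELD;
	uint64_t group;

	if (mode == MODE_EL1H)
		group = VEC_CURRENT_EL_SP_ELX;	/* same EL, using SP_ELx */
	else if ((mode | MODE_SP_SEL) == MODE_EL1H)
		group = VEC_CURRENT_EL_SP_EL0;	/* same EL, using SP_EL0 */
	else if (!(mode & MODE_AARCH32))
		group = VEC_LOWER_EL_AARCH64;	/* lower EL, AArch64 */
	else
		group = VEC_LOWER_EL_AARCH32;	/* lower EL, AArch32 */

	return vbar + group + (uint64_t)type;
}
```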
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 215520143
Change-Id: I97b2431a79fdec87c95c2d1f691bd3a11635c29b
Add a helper to check whether the pkvm static key is enabled, to ease
the introduction of pkvm hooks in other parts of the code.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 215520143
Change-Id: Iae065b09bb33d42d73a408365c803727269d0de0
If a malicious/compromised host issues a PSCI SYSTEM_RESET call in the
presence of guest-owned pages then the contents of those pages may be
susceptible to cold-reboot attacks.
Use the PSCI MEM_PROTECT call to ensure that volatile memory is wiped by
the firmware if a SYSTEM_RESET occurs while unpoisoned guest pages exist
in the system. Since this call does not offer protection for a "warm"
reset initiated by SYSTEM_RESET2, detect this case in the PSCI relay and
repaint the call to a standard SYSTEM_RESET instead.
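The repainting step could look roughly like the sketch below. The
function name relay_reset_func_id() and the guest_pages_exist flag are
assumptions made for illustration; the real relay code is not shown in
this message. The function IDs are the SMCCC values that the PSCI
specification assigns to SYSTEM_RESET and SYSTEM_RESET2.

```c
#include <stdint.h>

/* PSCI function IDs as assigned by the PSCI specification. */
#define PSCI_FN_SYSTEM_RESET	0x84000009u
#define PSCI_FN_SYSTEM_RESET2	0x84000012u	/* SMC32 variant */
#define PSCI_FN64_SYSTEM_RESET2	0xC4000012u	/* SMC64 variant */

/*
 * Hypothetical relay hook: return the function ID that should be
 * forwarded to the firmware. guest_pages_exist is an assumed flag that
 * is non-zero while unpoisoned guest pages are present.
 */
static uint32_t relay_reset_func_id(uint32_t func_id, int guest_pages_exist)
{
	if (!guest_pages_exist)
		return func_id;

	switch (func_id) {
	case PSCI_FN_SYSTEM_RESET2:
	case PSCI_FN64_SYSTEM_RESET2:
		/* MEM_PROTECT does not cover SYSTEM_RESET2: downgrade it. */
		return PSCI_FN_SYSTEM_RESET;
	default:
		return func_id;
	}
}
```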
Bug: 196204410
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I5c3dd93bc83ebcd0b6cea2ec734f6e3a77f0064e
Let's check the return value of pin_user_pages() before blindly
dereferencing the struct page pointer as it may very well be NULL.
Bug: 223678931
Reported-by: Keir Fraser <keirf@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I49eb0eb14b88429cfeed3e7cc8a2a72404cfea97
A protected VM accessing ID_AA64ISAR2_EL1 gets punished with an UNDEF,
while it really should only get a zero back if the register is not
handled by the hypervisor emulation (as mandated by the architecture).
Introduce all the missing ID registers (including the unallocated ones),
and have them return 0.
Bug: 226913064
Reported-by: Will Deacon <willdeacon@google.com>
Signed-off-by: Marc Zyngier <mzyngier@google.com>
Change-Id: I1f8de324af8a47974e6ab6b0bf68c8e1b01c4baf
On a 32-bit Android system, the following CTS Verifier test case fails:
manualTests#com.android.cts.verifier.usb.accessory.UsbAccessoryTestActivity
The reason is that ioctls from 32-bit applications go through
compat_ioctl(), which is not provided.
So let's add compat_ioctl() for 32-bit applications to solve this issue.
Bug: 223101878
Change-Id: I6e1f797d919494d293184411041955c33ad08aef
Signed-off-by: Aran Dalton <arda@allwinnertech.com>
(cherry picked from commit 77bf53b486)
The anon_vma_name refcount of a deep process chain with many vmas could
grow really high. With
default sysctl_max_map_count (64k) and default pid_max (32k) the max
number of vmas in the system is 2147450880 and the refcounter has
headroom of 1073774592 before it reaches REFCOUNT_SATURATED
(3221225472).
Therefore it's unlikely that an anonymous name refcounter will overflow
with these defaults. Currently the max for pid_max is PID_MAX_LIMIT
(4194304) and for sysctl_max_map_count it's INT_MAX (2147483647). In
this configuration anon_vma_name refcount overflow becomes theoretically
possible (that would still require heavy sharing of that anon_vma_name
between processes).
The kref refcounting interface used in the anon_vma_name structure will
detect a counter overflow when it reaches the REFCOUNT_SATURATED value,
but will only generate a warning and freeze the ref counter. This would
lead to the refcounted object never being freed. A determined attacker
could leak memory like that, but it would be a rather expensive and
inefficient way to do so.
To ensure anon_vma_name refcount does not overflow, stop anon_vma_name
sharing when the refcount reaches REFCOUNT_MAX (2147483647), which still
leaves INT_MAX/2 (1073741823) values before the counter reaches
REFCOUNT_SATURATED. This should provide enough headroom for raising the
refcounts temporarily.
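The arithmetic above can be checked with a short user-space sketch. The
helper names are hypothetical, and the two constants are redefined
locally under the assumption that they match the values quoted in this
message. The inputs 65535 and 32768 are the ones that reproduce the vma
count quoted above.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the values quoted in the text (assumed, not included
 * from kernel headers). */
#define SKETCH_REFCOUNT_MAX		((uint64_t)INT_MAX)	/* 2147483647 */
#define SKETCH_REFCOUNT_SATURATED	3221225472ull

/* Upper bound on vmas: one full map per process, for every pid. */
static uint64_t max_system_vmas(uint64_t max_map_count, uint64_t pid_max)
{
	return max_map_count * pid_max;
}

/* Distance between a refcount value and the saturation point. */
static uint64_t saturation_headroom(uint64_t refs)
{
	return SKETCH_REFCOUNT_SATURATED - refs;
}

/* Sharing is only allowed while the count is below the cap. */
static bool may_share_name(uint64_t refs)
{
	return refs < SKETCH_REFCOUNT_MAX;
}
```

With these helpers, max_system_vmas(65535, 32768) gives 2147450880 and
saturation_headroom(2147450880) gives 1073774592, matching the figures
above.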
Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Chris Hyser <chris.hyser@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Colin Cross <ccross@google.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiaofeng Cao <caoxiaofeng@yulong.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 96403e1128)
Bug: 218352794
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ieaab58f6300d9aff3139eed1c1d3417237d81955
Alistair reports an ext4 splat when running a non-protected guest under
pKVM using Cuttlefish on a rockpi board:
| WARNING: CPU: 4 PID: 3125 at fs/ext4/inode.c:3592 ext4_set_page_dirty+0x6c/0x90
| sp : ffffffc00e1a39b0
| x29: ffffffc00e1a39b0 x28: ffffffc009ac3c18 x27: ffffffc009a80968
| x26: ffffff80c2753a00 x25: 0000000200000000 x24: ffffffc00a6dc000
| x23: 0000000000000000 x22: 0000000000000001 x21: fffffffe0314f640
| x20: ffffff8063a99890 x19: fffffffe0314f640 x18: ffffffc00dbf5090
| x17: 0000000000000020 x16: ffffffc00ab73080 x15: 0000000000000040
| x14: 0000000000000040 x13: 0000000000000040 x12: 0000000080200000
| x11: 0000000000000000 x10: fffffffe0314f640 x9 : 0000000000000016
| x8 : 0000000000000015 x7 : 0000000000000062 x6 : 0000000000000068
| x5 : 0000000080200015 x4 : ffffff80067c7500 x3 : 0000000080200016
| x2 : 0000000000000001 x1 : 0000000000000001 x0 : fffffffe0314f640
| Call trace:
| ext4_set_page_dirty+0x6c/0x90
| set_page_dirty+0xf0/0x264
| set_page_dirty_lock+0x94/0x164
| unpin_user_pages_dirty_lock+0xa0/0x15c
| kvm_shadow_destroy+0xd4/0x150
| kvm_arch_destroy_vm+0xa0/0xa4
| kvm_destroy_vm+0x634/0xa0c
| kvm_vcpu_release+0x44/0xc0
| __fput+0xf8/0x43c
| ____fput+0x14/0x24
| task_work_run+0x140/0x204
| do_exit+0x450/0x12b0
| do_group_exit+0xc8/0x17c
| get_signal+0x85c/0xa10
| do_signal+0x9c/0x268
| do_notify_resume+0x98/0x220
| el0_svc+0x5c/0x84
| el0t_64_sync_handler+0x88/0xec
| el0t_64_sync+0x1b4/0x1b8
This appears to be due to virtio-pmem mapping a host page-cache page
directly into the guest and pinning it with GUP. A later attempt to
wrprotect the page using page_mkclean() on the writeback path will not
find the guest mapping and consequently the filesystem becomes confused
when we later dirty the page without any page buffers having been
allocated.
Since the host cannot generally access the memory of protected VMs,
restrict ourselves to swap-backed pages for now and avoid attempting
writeback altogether, with the GUP pin preventing swapout.
Bug: 223678931
Reported-by: Alistair Delva <adelva@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Id8da126aac220df6eff44177a911dc4627e68c02
When a shadow VM is torn down, its VMID can be reallocated as soon as
the shadow table entry is cleared to NULL. Since tearing down the
stage-2 page-table does not imply TLB invalidation, the TLB could still
contain stale entries from the old VM and the new user of the VMID could
end up seeing erroneous translations.
Invalidate the TLB for the VMID of the VM being torn down prior to
clearing its entry in the shadow table.
Bug: 226312378
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ice44d030bf01a1b7612413ee32440f3f38cb3e4e
Let's try this to avoid lock contention, until we find a better solution.
Bug: 216636351
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: Ib7ae218cb4a2531fdb85679b8530e4eba755e06a
* aosp/upstream-f2fs-stable-linux-5.10.y:
fscrypt: update documentation for direct I/O support
f2fs: support direct I/O with fscrypt using blk-crypto
ext4: support direct I/O with fscrypt using blk-crypto
iomap: support direct I/O with fscrypt using blk-crypto
fscrypt: add functions for direct I/O support
f2fs: fix to do sanity check on .cp_pack_total_block_count
f2fs: make gc_urgent and gc_segment_mode sysfs node readable
f2fs: use aggressive GC policy during f2fs_disable_checkpoint()
f2fs: fix compressed file start atomic write may cause data corruption
f2fs: initialize sbi->gc_mode explicitly
f2fs: introduce gc_urgent_mid mode
f2fs: compress: fix to print raw data size in error path of lz4 decompression
f2fs: remove redundant parameter judgment
f2fs: use spin_lock to avoid hang
f2fs: don't get FREEZE lock in f2fs_evict_inode in frozen fs
f2fs: remove unnecessary read for F2FS_FITS_IN_INODE
f2fs: introduce F2FS_UNFAIR_RWSEM to support unfair rwsem
f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes
f2fs: fix to do sanity check on curseg->alloc_type
f2fs: fix to avoid potential deadlock
f2fs: quota: fix loop condition at f2fs_quota_sync()
f2fs: Restore rwsem lockdep support
f2fs: fix missing free nid in f2fs_handle_failed_inode
f2fs: add a way to limit roll forward recovery time
f2fs: introduce F2FS_IPU_HONOR_OPU_WRITE ipu policy
f2fs: adjust readahead block number during recovery
f2fs: fix to unlock page correctly in error path of is_alive()
f2fs: expose discard related parameters in sysfs
f2fs: move discard parameters into discard_cmd_control
f2fs: fix to enable ATGC correctly via gc_idle sysfs interface
f2fs: move f2fs to use reader-unfair rwsems
Bug: 216636351
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I53cc37765ba69df2a9b7b9c070e4938822354f05
Set KMI_GENERATION=2 for 3/23 KMI update
Leaf changes summary: 505 artifacts changed
Changed leaf types summary: 2 leaf types changed
Removed/Changed/Added functions summary: 0 Removed, 489 Changed, 6 Added functions
Removed/Changed/Added variables summary: 0 Removed, 3 Changed, 5 Added variables
6 Added functions:
[A] 'function int __traceiter_android_vh___get_user_pages_remote(void*, int*, unsigned int*)'
[A] 'function int __traceiter_android_vh_get_user_pages(void*, unsigned int*)'
[A] 'function int __traceiter_android_vh_internal_get_user_pages_fast(void*, unsigned int*)'
[A] 'function int __traceiter_android_vh_pin_user_pages(void*, unsigned int*)'
[A] 'function int __traceiter_android_vh_try_grab_compound_head(void*, page*, int, unsigned int, bool*)'
[A] 'function unsigned long int get_pfnblock_flags_mask(page*, unsigned long int, unsigned long int)'
489 functions with some sub-type change:
[C] 'function sk_buff* __alloc_skb(unsigned int, gfp_t, int, int)' at skbuff.c:183:1 has some sub-type changes:
CRC (modversions) changed from 0x42ee9964 to 0x7c77e5af
[C] 'function sk_buff* __cfg80211_alloc_event_skb(wiphy*, wireless_dev*, nl80211_commands, nl80211_attrs, unsigned int, int, int, gfp_t)' at nl80211.c:10277:1 has some sub-type changes:
CRC (modversions) changed from 0x55bb655c to 0x5f07fe5f
[C] 'function sk_buff* __cfg80211_alloc_reply_skb(wiphy*, nl80211_commands, nl80211_attrs, int)' at nl80211.c:13811:1 has some sub-type changes:
CRC (modversions) changed from 0x8854dc9d to 0x4d096973
... 486 omitted; 489 symbols have only CRC changes
5 Added variables:
[A] 'tracepoint __tracepoint_android_vh___get_user_pages_remote'
[A] 'tracepoint __tracepoint_android_vh_get_user_pages'
[A] 'tracepoint __tracepoint_android_vh_internal_get_user_pages_fast'
[A] 'tracepoint __tracepoint_android_vh_pin_user_pages'
[A] 'tracepoint __tracepoint_android_vh_try_grab_compound_head'
3 Changed variables:
[C] 'net init_net' was changed at net_namespace.c:47:1:
CRC (modversions) changed from 0xaff22d13 to 0x59ca894
[C] 'pid_namespace init_pid_ns' was changed at pid.c:75:1:
CRC (modversions) changed from 0x31a2d4d4 to 0x1ee0d04c
[C] 'softnet_data softnet_data' was changed at dev.c:403:1:
CRC (modversions) changed from 0x3f45ee4 to 0xad33d222
'struct net_device at netdevice.h:1898:1' changed:
type size hasn't changed
1 data member insertion:
'const macsec_ops* macsec_ops', at offset 19328 (in bits) at netdevice.h:2202:1
there are data member changes:
11 ('const udp_tunnel_nic_info* udp_tunnel_nic_info' .. 'u64 android_kabi_reserved8') offsets changed (by +64 bits)
2953 impacted interfaces
'struct phy_device at phy.h:541:1' changed:
type size changed from 12736 to 12800 (in bits)
1 data member insertion:
'const macsec_ops* macsec_ops', at offset 12480 (in bits) at phy.h:647:1
there are data member changes:
4 ('u64 android_kabi_reserved1' .. 'u64 android_kabi_reserved4') offsets changed (by +64 bits)
2953 impacted interfaces
Bug: 226384098
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: I128f3003dff88cee9e0dd4041e2f2cc467dac1ee
Page pinning causes long CMA allocation latency, because the allocation
cannot proceed until the process holding the refcount is scheduled in
and releases it, which leads to CMA allocation failures.
To overcome the issue, add vendor hooks to migrate the target page of
GUP out of CMA area.
Bug: 218731671
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I5ebf491531d0bfee96ebee83919f22e34ee1d41b
Make the large_file_test check whether at least 3GB of disk space is
free, and skip the test if it is not. This lets the tests pass on a VM
with limited disk size. All functional tests now pass:
TAP version 13
1..26
ok 1 basic_file_ops_test
ok 2 cant_touch_index_test
ok 3 dynamic_files_and_data_test
ok 4 concurrent_reads_and_writes_test
ok 5 attribute_test
ok 6 work_after_remount_test
ok 7 child_procs_waiting_for_data_test
ok 8 multiple_providers_test
ok 9 hash_tree_test
ok 10 read_log_test
ok 11 get_blocks_test
ok 12 get_hash_blocks_test
ok 13 large_file_test
ok 14 mapped_file_test
ok 15 compatibility_test
ok 16 data_block_count_test
ok 17 hash_block_count_test
ok 18 per_uid_read_timeouts_test
ok 19 inotify_test
ok 20 verity_test
ok 21 enable_verity_test
ok 22 mmap_test
ok 23 truncate_test
ok 24 stat_test
ok 25 sysfs_test
Error mounting fs.: File exists
Error mounting fs.: File exists
ok 26 sysfs_rename_test
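A free-space check of the kind described above can be written with
statvfs(3). The sketch below is an illustration only: the helper name
has_free_space() and the MIN_FREE_BYTES macro are hypothetical, and the
actual test may obtain the value differently.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/* 3GB threshold mentioned in the text. */
#define MIN_FREE_BYTES	(3ull * 1024 * 1024 * 1024)

/*
 * Hypothetical helper: returns 1 if the filesystem holding path has at
 * least min_bytes available to unprivileged users, 0 if it does not,
 * and -1 if statvfs() fails.
 */
static int has_free_space(const char *path, uint64_t min_bytes)
{
	struct statvfs st;
	uint64_t avail;

	if (statvfs(path, &st) != 0)
		return -1;

	/* f_bavail is counted in units of f_frsize. */
	avail = (uint64_t)st.f_bavail * (uint64_t)st.f_frsize;
	return avail >= min_bytes;
}
```

A test could then skip itself when has_free_space(mount_dir,
MIN_FREE_BYTES) returns anything other than 1, where mount_dir stands
for whatever directory the test writes to.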
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I2260e2b314429251070d0163c70173f237f86476
Syzbot recently found a number of issues related to incremental-fs
(see bug numbers below). All have to do with the fact that incr-fs
allows mounts of the same source and target multiple times.
This is a design decision and the user space component "Data Loader"
expects this to work for app re-install use case.
The mounting depth needs to be controlled, however, and only allowed
to be two levels deep. In case of more than two mount attempts the
driver needs to return an error.
In case of the issues listed below the common pattern is that the
reproducer calls:
mount("./file0", "./file0", "incremental-fs", 0, NULL)
many times and then invokes a file operation like chmod, setxattr,
or open on the ./file0. This causes a recursive call for all the
mounted instances, which eventually causes a stack overflow and
a kernel crash:
BUG: stack guard page was hit at ffffc90000c0fff8
kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN
This change also cleans up the mount error path to properly clean
allocated resources and call deactivate_locked_super(), which
causes the incfs_kill_sb() to be called, where the sb is freed.
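The depth limit can be pictured with the following user-space model. It
is a sketch under stated assumptions: struct layer, nesting_depth() and
stack_layer() are hypothetical names, and -EINVAL is a placeholder since
this message does not say which error the driver returns.

```c
#include <errno.h>
#include <stddef.h>

/* Two levels of nesting are allowed, as described in the text. */
#define MAX_NESTING	2

/* Hypothetical model of a mount instance stacked on another one. */
struct layer {
	const struct layer *backing;
};

/* Count how many instances are already stacked below (and including) l. */
static int nesting_depth(const struct layer *l)
{
	int depth = 0;

	for (; l != NULL; l = l->backing)
		depth++;
	return depth;
}

/* Refuse to stack a new instance once the limit has been reached. */
static int stack_layer(struct layer *new_layer, const struct layer *backing)
{
	if (nesting_depth(backing) >= MAX_NESTING)
		return -EINVAL;	/* placeholder error code */

	new_layer->backing = backing;
	return 0;
}
```

Bounding the depth at mount time means that a later file operation
recurses through at most two instances, instead of through an
attacker-controlled number of them.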
Bug: 211066171
Bug: 213140206
Bug: 213215835
Bug: 211914587
Bug: 211213635
Bug: 213137376
Bug: 211161296
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I08d9b545a2715423296bf4beb67bdbbed78d1be1
Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data. However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.
Therefore, make f2fs support DIO on files that are using inline
encryption. Since f2fs uses iomap for DIO, and fscrypt support was
already added to iomap DIO, this just requires two small changes:
- Let DIO proceed when supported, by checking fscrypt_dio_supported()
instead of assuming that encrypted files never support DIO.
- In f2fs_iomap_begin(), use fscrypt_limit_io_blocks() to limit the
length of the mapping in the rare case where a DUN discontiguity
occurs in the middle of an extent. The iomap DIO implementation
requires this, since it assumes that it can submit a bio covering (up
to) the whole mapping, without checking fscrypt constraints itself.
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Link: https://lore.kernel.org/r/20220128233940.79464-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data. However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.
Therefore, make ext4 support DIO on files that are using inline
encryption. Since ext4 uses iomap for DIO, and fscrypt support was
already added to iomap DIO, this just requires two small changes:
- Let DIO proceed when supported, by checking fscrypt_dio_supported()
instead of assuming that encrypted files never support DIO.
- In ext4_iomap_begin(), use fscrypt_limit_io_blocks() to limit the
length of the mapping in the rare case where a DUN discontiguity
occurs in the middle of an extent. The iomap DIO implementation
requires this, since it assumes that it can submit a bio covering (up
to) the whole mapping, without checking fscrypt constraints itself.
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Link: https://lore.kernel.org/r/20220128233940.79464-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data. However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.
Add support for this to the iomap DIO implementation by calling
fscrypt_set_bio_crypt_ctx() to set encryption contexts on the bios.
Don't check for the rare case where a DUN (crypto data unit number)
discontiguity creates a boundary that bios must not cross. Instead,
filesystems are expected to handle this in ->iomap_begin() by limiting
the length of the mapping so that iomap doesn't have to worry about it.
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220128233940.79464-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data. However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.
In preparation for supporting this, add the following functions:
- fscrypt_dio_supported() checks whether a DIO request is supported as
far as encryption is concerned. Encrypted files will only support DIO
when inline encryption is used and the I/O request is properly
aligned; this function checks these preconditions.
- fscrypt_limit_io_blocks() limits the length of a bio to avoid crossing
a place in the file that a bio with an encryption context cannot
cross due to a DUN discontiguity. This function is needed by
filesystems that use the iomap DIO implementation (which operates
directly on logical ranges, so it won't use fscrypt_mergeable_bio())
and that support FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32.
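To make the DUN discontiguity concrete, here is a minimal sketch of the
limiting step. It assumes that with FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32
the DUN is a 32-bit value formed from a hashed inode number plus the
logical block number, so the only discontiguity is the wraparound at
2^32. The function name limit_io_blocks() and its parameters are
hypothetical, not the kernel signature.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clamp nr_blocks so that an I/O starting at
 * logical block lblk does not cross the point where the 32-bit DUN
 * wraps around to zero.
 */
static uint64_t limit_io_blocks(uint32_t hashed_ino, uint64_t lblk,
				uint64_t nr_blocks)
{
	/* The DUN is computed modulo 2^32 (assumption of this sketch). */
	uint32_t dun = hashed_ino + (uint32_t)lblk;

	/* Number of blocks left before the DUN wraps. */
	uint64_t until_wrap = (uint64_t)UINT32_MAX + 1 - dun;

	return nr_blocks < until_wrap ? nr_blocks : until_wrap;
}
```

For example, with a DUN of 0xfffffff0 only 16 more blocks fit before the
wrap, so a 100-block request is shortened to 16 blocks and the rest has
to go into a separate mapping.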
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220128233940.79464-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Without it incfs/incfs_perf runtime fails in format_signature:
malloc(): invalid size (unsorted)
Aborted
This happens when compiled with gcc version 11.2.0.
Also add a check for NULL after the malloc, and remove the unneeded
space for uint32_t in signing_section.
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I62b775140e4b89f75335cbd65665cf6a3e0fe964
As bughunter reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215709
f2fs may hang when mounting a fuzzed image; dmesg shows:
__filemap_get_folio+0x3a9/0x590
pagecache_get_page+0x18/0x60
__get_meta_page+0x95/0x460 [f2fs]
get_checkpoint_version+0x2a/0x1e0 [f2fs]
validate_checkpoint+0x8e/0x2a0 [f2fs]
f2fs_get_valid_checkpoint+0xd0/0x620 [f2fs]
f2fs_fill_super+0xc01/0x1d40 [f2fs]
mount_bdev+0x18a/0x1c0
f2fs_mount+0x15/0x20 [f2fs]
legacy_get_tree+0x28/0x50
vfs_get_tree+0x27/0xc0
path_mount+0x480/0xaa0
do_mount+0x7c/0xa0
__x64_sys_mount+0x8b/0xe0
do_syscall_64+0x38/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is that the cp_pack_total_block_count field in the
checkpoint was fuzzed to one. As calculated, the two cp pack blocks are
then located at the same block address, so reading the latter cp pack
block blocks on the page lock, which is still held from reading the
previous cp pack block. Fix it by adding a sanity check for
cp_pack_total_block_count.
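The collision can be illustrated with a small model. This sketch assumes
that the second cp pack block sits at the start address plus
cp_pack_total_block_count minus one; the helper names and the exact
bounds are hypothetical, since the check added by the patch is not
spelled out in this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address of the last block of a checkpoint pack (assumed layout). */
static uint64_t last_cp_block(uint64_t cp_addr, uint32_t total_blocks)
{
	return cp_addr + total_blocks - 1;
}

/*
 * Hypothetical sanity check: the pack must span at least two distinct
 * blocks and must not exceed one segment.
 */
static bool cp_block_count_is_sane(uint32_t total_blocks,
				   uint32_t blocks_per_seg)
{
	return total_blocks >= 2 && total_blocks <= blocks_per_seg;
}
```

With a count of one, last_cp_block(addr, 1) equals addr, which is the
situation that leads to taking the same page lock twice.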
Cc: stable@vger.kernel.org
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The FROMLIST patches merged in aosp/1974918 that add vmalloc support to
KASAN now have a few fixes staged in linux-next/akpm. Sync the changes.
Bug: 217222520
Bug: 222221793
Change-Id: I33dd30e3834a4d1bb8eac611b350004afdb08a74
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Changes in 5.10.107
Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"
sctp: fix the processing for INIT chunk
xfrm: Check if_id in xfrm_migrate
xfrm: Fix xfrm migrate issues when address family changes
arm64: dts: rockchip: fix rk3399-puma eMMC HS400 signal integrity
arm64: dts: rockchip: reorder rk3399 hdmi clocks
arm64: dts: agilex: use the compatible "intel,socfpga-agilex-hsotg"
ARM: dts: rockchip: reorder rk322x hmdi clocks
ARM: dts: rockchip: fix a typo on rk3288 crypto-controller
mac80211: refuse aggregations sessions before authorized
MIPS: smp: fill in sibling and core maps earlier
ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when fully ready
atm: firestream: check the return value of ioremap() in fs_init()
iwlwifi: don't advertise TWT support
drm/vrr: Set VRR capable prop only if it is attached to connector
nl80211: Update bss channel on channel switch for P2P_CLIENT
tcp: make tcp_read_sock() more robust
sfc: extend the locking on mcdi->seqno
kselftest/vm: fix tests build with old libc
io_uring: return back safer resurrect
arm64: kvm: Fix copy-and-paste error in bhb templates for v5.10 stable
Linux 5.10.107
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib5977657dd66c90a01694f04ee85d72c3a22bebb
KVM's infrastructure for spectre mitigations in the vectors in v5.10 and
earlier is different: it uses templates which are used to build a set of
vectors at runtime.
There are two copy-and-paste errors in the templates: __spectre_bhb_loop_k24
should loop 24 times and __spectre_bhb_loop_k32 32.
Fix these.
Reported-by: Pavel Machek <pavel@denx.de>
Link: https://lore.kernel.org/all/20220310234858.GB16308@amd/
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f70865db5f upstream.
Revert of the revert of "io_uring: wait potential ->release() on
resurrect", which adds a helper so that resurrect does not race with
completion reinit. It was removed because of a strange bug with no clear
root cause or link to the patch.
It has been improved: instead of rcu_synchronize(), just
wait_for_completion(), because we're at 0 refs and it will happen very
shortly. Specifically, use the non-interruptible version to ignore all
pending signals that may have ended a prior interruptible wait.
This reverts commit cb5e1b8130.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7a080c20f686d026efade810b116b72f88abaff9.1618101759.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b773827e36 ]
Building the vm tests on debian10 (GLIBC 2.28) fails with:
userfaultfd.c: In function `userfaultfd_pagemap_test':
userfaultfd.c:1393:37: error: `MADV_PAGEOUT' undeclared (first use
in this function); did you mean `MADV_RANDOM'?
if (madvise(area_dst, test_pgsize, MADV_PAGEOUT))
^~~~~~~~~~~~
MADV_RANDOM
This patch includes these newer definitions from UAPI linux/mman.h,
which fixes the tests build on systems without these definitions in
glibc's sys/mman.h.
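For comparison, a different and commonly used workaround is to supply
fallback definitions when the libc header lacks them. This is not what
the patch does (it pulls the definitions from the UAPI header). The
numeric values below are the generic Linux ones and are an assumption of
this sketch, as is the helper name pageout_hint().

```c
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers (generic Linux values, assumed). */
#ifndef MADV_COLD
#define MADV_COLD	20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT	21
#endif

/*
 * Hypothetical helper: ask the kernel to reclaim the given range.
 * Returns the madvise() result; kernels that predate MADV_PAGEOUT are
 * expected to reject the advice.
 */
static int pageout_hint(void *addr, size_t len)
{
	return madvise(addr, len, MADV_PAGEOUT);
}
```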
Link: https://lkml.kernel.org/r/20220227055330.43087-2-zhouchengming@bytedance.com
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f1fb205efb ]
seqno could be read as a stale value outside of the lock. The lock is
already acquired to protect the modification of seqno against a possible
race condition. Also place the read of this value inside the locked
section to protect it against a possible race condition.
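The pattern can be shown in user space with a pthread mutex. This is an
analogy only: struct mcdi_sketch and next_seqno() are hypothetical
names, and the driver uses its own kernel locking primitive rather than
pthreads.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the state that owns the sequence number. */
struct mcdi_sketch {
	pthread_mutex_t lock;
	uint32_t seqno;
};

/*
 * Advance the sequence number and return the new value. Both the
 * update and the read happen inside the same critical section, so the
 * caller can never observe a value that another thread has already
 * replaced.
 */
static uint32_t next_seqno(struct mcdi_sketch *m)
{
	uint32_t seq;

	pthread_mutex_lock(&m->lock);
	m->seqno++;
	seq = m->seqno;		/* read while still holding the lock */
	pthread_mutex_unlock(&m->lock);

	return seq;
}
```

Reading m->seqno after pthread_mutex_unlock() instead would reintroduce
the stale-value window described above.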
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e50b88c4f0 ]
The wdev channel information is updated post channel switch only for
the station mode and not for the other modes. Due to this, the P2P client
still points to the old value though it moved to the new channel
when the channel change is induced from the P2P GO.
Update the bss channel after CSA channel switch completion for P2P client
interface as well.
Signed-off-by: Sreeramya Soratkal <quic_ssramya@quicinc.com>
Link: https://lore.kernel.org/r/1646114600-31479-1-git-send-email-quic_ssramya@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11c57c3ba9 ]
Resending this to properly add it to the patch tracker - thanks for letting
me know, Arnd :)
When ARM is enabled, and BITREVERSE is disabled,
Kbuild gives the following warning:
WARNING: unmet direct dependencies detected for HAVE_ARCH_BITREVERSE
Depends on [n]: BITREVERSE [=n]
Selected by [y]:
- ARM [=y] && (CPU_32v7M [=n] || CPU_32v7 [=y]) && !CPU_32v6 [=n]
This is because ARM selects HAVE_ARCH_BITREVERSE
without selecting BITREVERSE, despite
HAVE_ARCH_BITREVERSE depending on BITREVERSE.
This unmet dependency bug was found by Kismet,
a static analysis tool for Kconfig. Please advise if this
is not the appropriate solution.
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2703def33 ]
After enabling CONFIG_SCHED_CORE (landed during 5.14 cycle),
2-core 2-thread-per-core interAptiv (CPS-driven) started emitting
the following:
[ 0.025698] CPU1 revision is: 0001a120 (MIPS interAptiv (multi))
[ 0.048183] ------------[ cut here ]------------
[ 0.048187] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6025 sched_core_cpu_starting+0x198/0x240
[ 0.048220] Modules linked in:
[ 0.048233] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc3+ #35 b7b319f24073fd9a3c2aa7ad15fb7993eec0b26f
[ 0.048247] Stack : 817f0000 00000004 327804c8 810eb050 00000000 00000004 00000000 c314fdd1
[ 0.048278] 830cbd64 819c0000 81800000 817f0000 83070bf4 00000001 830cbd08 00000000
[ 0.048307] 00000000 00000000 815fcbc4 00000000 00000000 00000000 00000000 00000000
[ 0.048334] 00000000 00000000 00000000 00000000 817f0000 00000000 00000000 817f6f34
[ 0.048361] 817f0000 818a3c00 817f0000 00000004 00000000 00000000 4dc33260 0018c933
[ 0.048389] ...
[ 0.048396] Call Trace:
[ 0.048399] [<8105a7bc>] show_stack+0x3c/0x140
[ 0.048424] [<8131c2a0>] dump_stack_lvl+0x60/0x80
[ 0.048440] [<8108b5c0>] __warn+0xc0/0xf4
[ 0.048454] [<8108b658>] warn_slowpath_fmt+0x64/0x10c
[ 0.048467] [<810bd418>] sched_core_cpu_starting+0x198/0x240
[ 0.048483] [<810c6514>] sched_cpu_starting+0x14/0x80
[ 0.048497] [<8108c0f8>] cpuhp_invoke_callback_range+0x78/0x140
[ 0.048510] [<8108d914>] notify_cpu_starting+0x94/0x140
[ 0.048523] [<8106593c>] start_secondary+0xbc/0x280
[ 0.048539]
[ 0.048543] ---[ end trace 0000000000000000 ]---
[ 0.048636] Synchronize counters for CPU 1: done.
...for each but CPU 0/boot.
Basic debug printks right before the mentioned line say:
[ 0.048170] CPU: 1, smt_mask:
So smt_mask, which is obviously the sibling mask, is empty when entering
the function.
This is critical, as sched_core_cpu_starting() calculates
core-scheduling parameters only once per CPU start, and it's crucial
to have all the parameters filled in at that moment (at least it
uses cpu_smt_mask() which in fact is `&cpu_sibling_map[cpu]` on
MIPS).
A bit of debugging led me to find that set_cpu_sibling_map(), which
performs the actual map calculation, was being invoked after
notify_cpu_starting(), and it is exactly the latter function that
starts the CPU HP callback round (sched_core_cpu_starting() is
basically a CPU HP callback).
While the flow is the same on ARM64 (maps after the notifier, although
before calling set_cpu_online()), x86 started calculating sibling
maps earlier than starting the CPU HP callbacks in Linux 4.14 (see
[0] for the reference). Neither I nor my brief tests could find
any potential caveats in calculating the maps right after performing
delay calibration, and the WARN splat is now gone.
The very same debug prints now yield exactly what I expected from
them:
[ 0.048433] CPU: 1, smt_mask: 0-1
[0] https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=76ce7cfe35ef
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>