This reverts commit c22d8a19e9.
It needs to be added _after_ 5.15.13 is merged, not before, otherwise
the revert and merge break.
Fixes: c22d8a19e9 ("BACKPORT: vsock: each transport cycles only on its own sockets")
Signed-off-by: Jiyong Park <jiyong@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I513425a7346540c5b6a90e9e793a7fadf71b0a1d
This is a KMI-preserving implementation of commit
8e6ed96376 upstream.
When iterating over sockets using vsock_for_each_connected_socket, make
sure that a transport filters out sockets that don't belong to the
transport.
This caused a real issue: in a nested VM configuration, destroying the
nested VM (which often involves closing /dev/vhost-vsock if there were
h2g connections to the nested VM) kills not only the h2g connections,
but also all existing g2h connections to the (outermost) host, which
are totally unrelated.
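The filtering idea can be illustrated with a minimal userspace sketch. It
is not the kernel patch itself: `struct transport`, `struct vsock`,
`reset_socket()` and `for_each_connected_socket()` are hypothetical
stand-ins chosen for illustration, and the real kernel iterator walks a
hash table under a lock rather than an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct transport {
    const char *name;
};

struct vsock {
    const struct transport *transport; /* transport that owns this socket */
    int reset;                         /* set to 1 once the connection is reset */
};

/* Stand-in for a transport's per-socket reset handler. */
void reset_socket(struct vsock *vsk)
{
    vsk->reset = 1;
}

/*
 * Walk all connected sockets, but call fn only on those owned by
 * `transport`. Returns how many sockets fn was called on.
 */
size_t for_each_connected_socket(struct vsock *socks, size_t n,
                                 const struct transport *transport,
                                 void (*fn)(struct vsock *))
{
    size_t visited = 0;

    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        if (socks[i].transport != transport)
            continue; /* owned by another transport: leave it untouched */
        fn(&socks[i]);
        visited++;
    }
    return visited;
}
```

With one h2g and one g2h socket in the array, calling the iterator with the
h2g transport resets only the h2g socket, which is the behavior the nested
VM case needs.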
Tested: Executed the following steps on Cuttlefish (Android running on a
VM) [1]: (1) Enter into an `adb shell` session - to have a g2h
connection inside the VM, (2) open and then close /dev/vhost-vsock by
`exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb
session is not reset.
[1] https://android.googlesource.com/device/google/cuttlefish/
Fixes: c0cfa2d8a7 ("vsock: add multi-transports support")
Signed-off-by: Jiyong Park <jiyong@google.com>
(cherry picked from commit 8e6ed96376)
Change-Id: I271ddbf365d336269a78f603543b82a52306c7c4
* aosp/upstream-f2fs-stable-linux-5.15.y:
fscrypt: update documentation for direct I/O support
f2fs: support direct I/O with fscrypt using blk-crypto
ext4: support direct I/O with fscrypt using blk-crypto
iomap: support direct I/O with fscrypt using blk-crypto
fscrypt: add functions for direct I/O support
f2fs: fix to do sanity check on .cp_pack_total_block_count
f2fs: make gc_urgent and gc_segment_mode sysfs node readable
f2fs: use aggressive GC policy during f2fs_disable_checkpoint()
f2fs: fix compressed file start atomic write may cause data corruption
f2fs: initialize sbi->gc_mode explicitly
f2fs: introduce gc_urgent_mid mode
f2fs: compress: fix to print raw data size in error path of lz4 decompression
f2fs: remove redundant parameter judgment
f2fs: use spin_lock to avoid hang
f2fs: don't get FREEZE lock in f2fs_evict_inode in frozen fs
f2fs: remove unnecessary read for F2FS_FITS_IN_INODE
f2fs: introduce F2FS_UNFAIR_RWSEM to support unfair rwsem
f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes
f2fs: fix to do sanity check on curseg->alloc_type
f2fs: fix to avoid potential deadlock
f2fs: quota: fix loop condition at f2fs_quota_sync()
f2fs: Restore rwsem lockdep support
f2fs: fix missing free nid in f2fs_handle_failed_inode
f2fs: support idmapped mounts
f2fs: add a way to limit roll forward recovery time
f2fs: introduce F2FS_IPU_HONOR_OPU_WRITE ipu policy
f2fs: adjust readahead block number during recovery
f2fs: fix to unlock page correctly in error path of is_alive()
f2fs: expose discard related parameters in sysfs
f2fs: move discard parameters into discard_cmd_control
fs: handle circular mappings correctly
f2fs: fix to enable ATGC correctly via gc_idle sysfs interface
f2fs: move f2fs to use reader-unfair rwsems
Bug: 216636351
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I53cc37765ba69df2a9b7b9c070e4938822354f05
Set KMI_GENERATION=2 for 3/23 KMI update
Leaf changes summary: 3073 artifacts changed (1 filtered out)
Changed leaf types summary: 52 (1 filtered out) leaf types changed
Removed/Changed/Added functions summary: 1 Removed, 2959 Changed, 3 Added functions
Removed/Changed/Added variables summary: 0 Removed, 58 Changed, 0 Added variable
1 Removed function:
[D] 'function vm_area_struct* find_vma(mm_struct*, unsigned long int)'
3 Added functions:
[A] 'function vm_area_struct* __find_vma(mm_struct*, unsigned long int)'
[A] 'function long int dma_buf_set_name(dma_buf*, const char*)'
[A] 'function int reclaim_shmem_address_space(address_space*)'
2959 functions with some sub-type change:
[C] 'function void* PDE_DATA(const inode*)' at generic.c:794:1 has some sub-type changes:
CRC (modversions) changed from 0x1c3e2a86 to 0xedd5d462
[C] 'function void __ClearPageMovable(page*)' at compaction.c:138:1 has some sub-type changes:
CRC (modversions) changed from 0x734edab3 to 0x3aeae4f2
[C] 'function void __SetPageMovable(page*, address_space*)' at compaction.c:130:1 has some sub-type changes:
CRC (modversions) changed from 0x891f9c1d to 0x96ef33e3
... 2956 omitted; 2959 symbols have only CRC changes
58 Changed variables:
[C] 'rw_semaphore crypto_alg_sem' was changed at api.c:27:1:
size of symbol changed from 40 to 48
CRC (modversions) changed from 0x35d3dc46 to 0xf32f316e
type of variable changed:
type size changed from 320 to 384 (in bits)
1 data member insertion:
'u64 android_vendor_data1', at offset 320 (in bits) at rwsem.h:68:1
3276 impacted interfaces
[C] 'const vm_operations_struct drm_gem_cma_vm_ops' was changed at drm_gem_cma_helper.c:294:1:
size of symbol changed from 112 to 120
CRC (modversions) changed from 0x3bc32679 to 0x248b2833
type of variable changed:
[C] 'net init_net' was changed at net_namespace.c:47:1:
CRC (modversions) changed from 0xe32665c4 to 0x83c0a9ee
type of variable changed:
type size hasn't changed
there are data member changes:
type 'struct netns_nexthop' of 'net::nexthop' changed:
type size changed from 576 to 640 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'netns_nexthop::notifier_chain' changed:
type size changed from 384 to 448 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'blocking_notifier_head::rwsem' changed, as reported earlier
'notifier_block* head' offset changed (by +64 bits)
3268 impacted interfaces
3258 impacted interfaces
3258 impacted interfaces
[C] 'rq runqueues' was changed at core.c:49:1:
CRC (modversions) changed from 0x4ce2ba0f to 0x3be19baa
type of variable changed:
type size hasn't changed
1 data member insertion:
'u64 prev_steal_time_rq', at offset 24576 (in bits) at sched.h:1064:1
there are data member changes:
2 ('unsigned long int calc_load_update' .. 'long int calc_load_active') offsets changed (by +64 bits)
3258 impacted interfaces
[C] 'const vm_operations_struct vb2_common_vm_ops' was changed at videobuf2-memops.c:121:1:
size of symbol changed from 112 to 120
CRC (modversions) changed from 0x234a35c to 0x50ba9795
type of variable changed:
[C] 'vm_event_state vm_event_states' was changed at vmstat.c:107:1:
size of symbol changed from 704 to 720
CRC (modversions) changed from 0xbe72514d to 0x85d767b0
type of variable changed:
type size changed from 5632 to 5760 (in bits)
there are data member changes:
type 'unsigned long int[88]' of 'vm_event_state::event' changed:
type name changed from 'unsigned long int[88]' to 'unsigned long int[90]'
array type size changed from 5632 to 5760
array type subrange 1 changed length from 88 to 90
one impacted interface
[C] 'bus_type amba_bustype' was changed at bus.c:313:1:
CRC (modversions) changed from 0x517f2d17 to 0x69625ec
[C] 'const address_space_operations balloon_aops' was changed at balloon_compaction.c:253:1:
CRC (modversions) changed from 0x89a77b8c to 0xefa16792
[C] 'const clk_ops clk_divider_ops' was changed at clk-divider.c:522:1:
CRC (modversions) changed from 0x5a75cc1 to 0xcd0b5d59
... 49 omitted; 52 symbols have only CRC changes
'struct address_space at fs.h:460:1' changed (indirectly):
type size changed from 1536 to 1664 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'address_space::invalidate_lock' changed, as reported earlier
3 ('gfp_t gfp_mask' .. 'rb_root_cached i_mmap') offsets changed (by +64 bits)
type 'struct rw_semaphore' of 'address_space::i_mmap_rwsem' changed, as reported earlier
and offset changed from 704 to 768 (in bits) (by +64 bits)
8 ('unsigned long int nrpages' .. 'void* private_data') offsets changed (by +128 bits)
3258 impacted interfaces
'struct anon_vma at rmap.h:29:1' changed (indirectly):
type size changed from 640 to 704 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'anon_vma::rwsem' changed, as reported earlier
4 ('atomic_t refcount' .. 'rb_root_cached rb_root') offsets changed (by +64 bits)
3258 impacted interfaces
'struct backing_dev_info at backing-dev-defs.h:169:1' changed (indirectly):
type size changed from 9024 to 9088 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'backing_dev_info::wb_switch_rwsem' changed, as reported earlier
6 ('wait_queue_head_t wb_waitq' .. 'dentry* debug_dir') offsets changed (by +64 bits)
3258 impacted interfaces
'struct blk_keyslot_manager at keyslot-manager.h:52:1' changed (indirectly):
type size changed from 1408 to 1472 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'blk_keyslot_manager::lock' changed, as reported earlier
6 ('wait_queue_head_t idle_slots_wait_queue' .. 'blk_ksm_keyslot* slots') offsets changed (by +64 bits)
3258 impacted interfaces
'struct blocking_notifier_head at notifier.h:65:1' changed (indirectly):
details were reported earlier
'struct bpf_prog_stats at filter.h:556:1' changed:
type size hasn't changed
there are data member changes:
type 'typedef u64' of 'bpf_prog_stats::cnt' changed:
typedef name changed from u64 to u64_stats_t at u64_stats_sync.h:79:1
underlying type 'typedef __u64' at int-ll64.h:31:1 changed:
entity changed from 'typedef __u64' to 'struct {local64_t v;}' at u64_stats_sync.h:77:1
type size hasn't changed
type 'typedef u64' of 'bpf_prog_stats::nsecs' changed, as reported earlier
type 'typedef u64' of 'bpf_prog_stats::misses' changed, as reported earlier
3258 impacted interfaces
'struct cpufreq_policy at cpufreq.h:55:1' changed (indirectly):
type size changed from 5120 to 5312 (in bits)
there are data member changes:
type 'struct freq_constraints' of 'cpufreq_policy::constraints' changed:
type size changed from 1408 to 1536 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'freq_constraints::min_freq_notifiers' changed, as reported earlier
'pm_qos_constraints max_freq' offset changed (by +64 bits)
type 'struct blocking_notifier_head' of 'freq_constraints::max_freq_notifiers' changed, as reported earlier
and offset changed from 1024 to 1088 (in bits) (by +64 bits)
3261 impacted interfaces
7 ('freq_qos_request* min_freq_req' .. 'completion kobj_unregister') offsets changed (by +128 bits)
type 'struct rw_semaphore' of 'cpufreq_policy::rwsem' changed, as reported earlier
and offset changed from 3712 to 3840 (in bits) (by +128 bits)
16 ('bool fast_switch_possible' .. 'notifier_block nb_max') offsets changed (by +192 bits)
31 impacted interfaces
'struct dev_pm_qos at pm_qos.h:117:1' changed (indirectly):
type size changed from 2432 to 2560 (in bits)
there are data member changes:
type 'struct freq_constraints' of 'dev_pm_qos::freq' changed, as reported earlier
4 ('pm_qos_flags flags' .. 'dev_pm_qos_request* flags_req') offsets changed (by +128 bits)
3258 impacted interfaces
'struct freq_constraints at pm_qos.h:85:1' changed (indirectly):
details were reported earlier
'struct gpio_device at gpiolib.h:46:1' changed (indirectly):
type size changed from 8064 to 8128 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'gpio_device::notifier' changed, as reported earlier
'list_head pin_ranges' offset changed (by +64 bits)
3258 impacted interfaces
'struct i3c_bus at master.h:332:1' changed (indirectly):
type size changed from 1152 to 1216 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'i3c_bus::lock' changed, as reported earlier
14 impacted interfaces
'struct i3c_master_controller at master.h:483:1' changed (indirectly):
type size changed from 16128 to 16192 (in bits)
there are data member changes:
type 'struct i3c_bus' of 'i3c_master_controller::bus' changed, as reported earlier
'workqueue_struct* wq' offset changed (by +64 bits)
14 impacted interfaces
'struct inode at fs.h:624:1' changed (indirectly):
type size changed from 5056 to 5248 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'inode::i_rwsem' changed, as reported earlier
11 ('unsigned long int dirtied_when' .. 'list_head i_wb_list') offsets changed (by +64 bits)
anonymous data member 'union {hlist_head i_dentry; callback_head i_rcu;}' offset changed from 2496 to 2560 (in bits) (by +64 bits)
6 ('atomic64_t i_version' .. 'atomic_t i_readcount') offsets changed (by +64 bits)
anonymous data member 'union {const file_operations* i_fop; void (inode*)* free_inode;}' offset changed from 2880 to 2944 (in bits) (by +64 bits)
'file_lock_context* i_flctx' offset changed (by +64 bits)
type 'struct address_space' of 'inode::i_data' changed, as reported earlier
and offset changed from 3008 to 3072 (in bits) (by +64 bits)
'list_head i_devices' offset changed (by +192 bits)
anonymous data member 'union {pipe_inode_info* i_pipe; cdev* i_cdev; char* i_link; unsigned int i_dir_seq;}' offset changed from 4672 to 4864 (in bits) (by +192 bits)
6 ('__u32 i_generation' .. 'void* i_private') offsets changed (by +192 bits)
3258 impacted interfaces
'struct io_pgtable_ops at io-pgtable.h:155:1' changed:
type size changed from 320 to 384 (in bits)
1 data member insertion:
'int (io_pgtable_ops*, unsigned long int, scatterlist*, unsigned int, int, typedef gfp_t, size_t*)* map_sg', at offset 128 (in bits) at io-pgtable.h:164:1
there are data member changes:
3 ('typedef size_t (io_pgtable_ops*, unsigned long int, typedef size_t, iommu_iotlb_gather*)* unmap' .. 'typedef phys_addr_t (io_pgtable_ops*, unsigned long int)* iova_to_phys') offsets changed (by +64 bits)
2 impacted interfaces
'struct iommu_group at iommu.c:37:1' changed (indirectly):
type size changed from 1856 to 1920 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'iommu_group::notifier' changed, as reported earlier
7 ('void* iommu_data' .. 'list_head entry') offsets changed (by +64 bits)
3258 impacted interfaces
'struct iommu_ops at iommu.h:254:1' changed:
type size changed from 2624 to 2688 (in bits)
1 data member insertion:
'int (iommu_domain*, unsigned long int, scatterlist*, unsigned int, int, typedef gfp_t, size_t*)* map_sg', at offset 448 (in bits) at iommu.h:270:1
there are data member changes:
34 ('typedef size_t (iommu_domain*, unsigned long int, typedef size_t, iommu_iotlb_gather*)* unmap' .. 'module* owner') offsets changed (by +64 bits)
3258 impacted interfaces
'struct key at key.h:189:1' changed (indirectly):
type size changed from 1728 to 1792 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'key::sem' changed, as reported earlier
2 ('key_user* user' .. 'void* security') offsets changed (by +64 bits)
anonymous data member 'union {time64_t expiry; time64_t revoked_at;}' offset changed from 704 to 768 (in bits) (by +64 bits)
8 ('time64_t last_used_at' .. 'unsigned long int flags') offsets changed (by +64 bits)
anonymous data member 'union {keyring_index_key index_key; struct {unsigned long int hash; unsigned long int len_desc; key_type* type; key_tag* domain_tag; char* description;};}' offset changed from 1088 to 1152 (in bits) (by +64 bits)
anonymous data member 'union {key_payload payload; struct {list_head name_link; assoc_array keys;};}' offset changed from 1408 to 1472 (in bits) (by +64 bits)
'key_restriction* restrict_link' offset changed (by +64 bits)
3258 impacted interfaces
'struct led_classdev at leds.h:70:1' changed (indirectly):
type size changed from 2816 to 2880 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'led_classdev::trigger_lock' changed, as reported earlier
6 ('led_trigger* trigger' .. 'mutex led_access') offsets changed (by +64 bits)
3258 impacted interfaces
'struct led_classdev_flash at led-class-flash.h:65:1' changed (indirectly):
type size changed from 3456 to 3520 (in bits)
there are data member changes:
type 'struct led_classdev' of 'led_classdev_flash::led_cdev' changed, as reported earlier
4 ('const led_flash_ops* ops' .. 'const attribute_group* sysfs_groups[5]') offsets changed (by +64 bits)
2 impacted interfaces
'struct mem_cgroup at memcontrol.h:237:1' changed (indirectly):
type size hasn't changed
there are data member changes:
type 'struct memcg_vmstats' of 'mem_cgroup::vmstats' changed:
type size changed from 16768 to 17024 (in bits)
there are data member changes:
type 'unsigned long int[88]' of 'memcg_vmstats::events' changed:
type name changed from 'unsigned long int[88]' to 'unsigned long int[90]'
array type size changed from 5632 to 5760
array type subrange 1 changed length from 88 to 90
'long int state_pending[43]' offset changed (by +128 bits)
type 'unsigned long int[88]' of 'memcg_vmstats::events_pending' changed:
type name changed from 'unsigned long int[88]' to 'unsigned long int[90]'
array type size changed from 5632 to 5760
array type subrange 1 changed length from 88 to 90
and offset changed from 11136 to 11264 (in bits) (by +128 bits)
3258 impacted interfaces
9 ('atomic_long_t memory_events[8]' .. 'list_head objcg_list') offsets changed (by +256 bits)
3258 impacted interfaces
'struct memcg_vmstats at memcontrol.h:92:1' changed:
details were reported earlier
'struct memcg_vmstats_percpu at memcontrol.h:78:1' changed:
type size changed from 16960 to 17216 (in bits)
there are data member changes:
type 'unsigned long int[88]' of 'memcg_vmstats_percpu::events' changed:
type name changed from 'unsigned long int[88]' to 'unsigned long int[90]'
array type size changed from 5632 to 5760
array type subrange 1 changed length from 88 to 90
'long int state_prev[43]' offset changed (by +128 bits)
type 'unsigned long int[88]' of 'memcg_vmstats_percpu::events_prev' changed:
type name changed from 'unsigned long int[88]' to 'unsigned long int[90]'
array type size changed from 5632 to 5760
array type subrange 1 changed length from 88 to 90
and offset changed from 11136 to 11264 (in bits) (by +128 bits)
2 ('unsigned long int nr_page_events' .. 'unsigned long int targets[2]') offsets changed (by +256 bits)
3258 impacted interfaces
'struct mm_struct at mm_types.h:417:1' changed:
type size changed from 7168 to 7360 (in bits)
there are data member changes:
anonymous data member at offset 0 (in bits) changed from:
struct {vm_area_struct* mmap; rb_root mm_rb; u64 vmacache_seqnum; unsigned long int (file*, unsigned long int, unsigned long int, unsigned long int, unsigned long int)* get_unmapped_area; unsigned long int mmap_base; unsigned long int mmap_legacy_base; unsigned long int task_size; unsigned long int highest_vm_end; pgd_t* pgd; atomic_t membarrier_state; atomic_t mm_users; atomic_t mm_count; atomic_long_t pgtables_bytes; int map_count; spinlock_t page_table_lock; rw_semaphore mmap_lock; list_head mmlist; unsigned long int hiwater_rss; unsigned long int hiwater_vm; unsigned long int total_vm; unsigned long int locked_vm; atomic64_t pinned_vm; unsigned long int data_vm; unsigned long int exec_vm; unsigned long int stack_vm; unsigned long int def_flags; seqcount_t write_protect_seq; spinlock_t arg_lock; unsigned long int start_code; unsigned long int end_code; unsigned long int start_data; unsigned long int end_data; unsigned long int start_brk; unsigned long int brk; unsigned long int start_stack; unsigned long int arg_start; unsigned long int arg_end; unsigned long int env_start; unsigned long int env_end; unsigned long int saved_auxv[46]; mm_rss_stat rss_stat; linux_binfmt* binfmt; mm_context_t context; unsigned long int flags; core_state* core_state; spinlock_t ioctx_lock; kioctx_table* ioctx_table; task_struct* owner; user_namespace* user_ns; file* exe_file; mmu_notifier_subscriptions* notifier_subscriptions; atomic_t tlb_flush_pending; uprobes_state uprobes_state; work_struct async_put_work; u32 pasid;}
to:
struct {vm_area_struct* mmap; rb_root mm_rb; u64 vmacache_seqnum; unsigned long int (file*, unsigned long int, unsigned long int, unsigned long int, unsigned long int)* get_unmapped_area; unsigned long int mmap_base; unsigned long int mmap_legacy_base; unsigned long int task_size; unsigned long int highest_vm_end; pgd_t* pgd; atomic_t membarrier_state; atomic_t mm_users; atomic_t mm_count; atomic_long_t pgtables_bytes; int map_count; spinlock_t page_table_lock; rw_semaphore mmap_lock; unsigned long int mmap_seq; list_head mmlist; unsigned long int hiwater_rss; unsigned long int hiwater_vm; unsigned long int total_vm; unsigned long int locked_vm; atomic64_t pinned_vm; unsigned long int data_vm; unsigned long int exec_vm; unsigned long int stack_vm; unsigned long int def_flags; seqcount_t write_protect_seq; spinlock_t arg_lock; unsigned long int start_code; unsigned long int end_code; unsigned long int start_data; unsigned long int end_data; unsigned long int start_brk; unsigned long int brk; unsigned long int start_stack; unsigned long int arg_start; unsigned long int arg_end; unsigned long int env_start; unsigned long int env_end; unsigned long int saved_auxv[46]; mm_rss_stat rss_stat; linux_binfmt* binfmt; mm_context_t context; unsigned long int flags; core_state* core_state; spinlock_t ioctx_lock; kioctx_table* ioctx_table; task_struct* owner; user_namespace* user_ns; file* exe_file; mmu_notifier_subscriptions* notifier_subscriptions; percpu_rw_semaphore* mmu_notifier_lock; atomic_t tlb_flush_pending; uprobes_state uprobes_state; work_struct async_put_work; u32 pasid;}
and size changed from 7168 to 7360 (in bits) (by +192 bits)
'unsigned long int cpu_bitmap[]' offset changed (by +192 bits)
3258 impacted interfaces
'struct mmc_host at host.h:292:1' changed (indirectly):
type size hasn't changed
there are data member changes:
type 'struct blk_keyslot_manager' of 'mmc_host::ksm' changed, as reported earlier
'bool hsq_enabled' offset changed (by +64 bits)
32 impacted interfaces
'struct net at net_namespace.h:56:1' changed (indirectly):
details were reported earlier
'struct net_device at netdevice.h:1949:1' changed:
type size hasn't changed
1 data member insertion:
'const macsec_ops* macsec_ops', at offset 17984 (in bits) at netdevice.h:2262:1
there are data member changes:
3 ('const udp_tunnel_nic_info* udp_tunnel_nic_info' .. 'bpf_xdp_entity xdp_state[3]') offsets changed (by +64 bits)
3258 impacted interfaces
'struct netns_nexthop at nexthop.h:11:1' changed (indirectly):
details were reported earlier
'struct nvmem_config at nvmem-provider.h:78:1' changed:
type size hasn't changed
1 data member insertion:
'bool ignore_wp', at offset 592 (in bits) at nvmem-provider.h:92:1
one impacted interface
'struct opp_table at opp.h:173:1' changed (indirectly):
type size changed from 4928 to 4992 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'opp_table::head' changed, as reported earlier
32 ('list_head dev_list' .. 'char dentry_name[255]') offsets changed (by +64 bits)
72 impacted interfaces
'struct percpu_rw_semaphore at percpu-rwsem.h:12:1' changed (indirectly):
type size hasn't changed
3258 impacted interfaces
'struct phy_device at phy.h:563:1' changed:
type size changed from 10752 to 10816 (in bits)
1 data member insertion:
'const macsec_ops* macsec_ops', at offset 10752 (in bits) at phy.h:671:1
3258 impacted interfaces
'struct quota_info at quota.h:519:1' changed (indirectly):
type size changed from 2496 to 2560 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'quota_info::dqio_sem' changed, as reported earlier
3 ('inode* files[3]' .. 'const quota_format_ops* ops[3]') offsets changed (by +64 bits)
3258 impacted interfaces
'struct regulator_dev at driver.h:603:1' changed (indirectly):
type size changed from 9024 to 9088 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'regulator_dev::notifier' changed, as reported earlier
19 ('ww_mutex mutex' .. 'spinlock_t err_lock') offsets changed (by +64 bits)
556 impacted interfaces
'struct rndis_params at rndis.h:159:1' changed:
type size changed from 768 to 832 (in bits)
1 data member insertion:
'spinlock_t resp_lock', at offset 768 (in bits) at rndis.h:177:1
11 impacted interfaces
'struct rq at sched.h:931:1' changed:
details were reported earlier
'struct rw_semaphore at rwsem.h:48:1' changed:
details were reported earlier
'struct sdhci_host at sdhci.h:365:1' changed (indirectly):
type size hasn't changed
there are data member changes:
type 'struct led_classdev' of 'sdhci_host::led' changed, as reported earlier
64 ('char led_name[32]' .. 'u64 data_timeout') offsets changed (by +64 bits)
12 impacted interfaces
'struct signal_struct at signal.h:82:1' changed (indirectly):
type size changed from 8320 to 8384 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'signal_struct::exec_update_lock' changed, as reported earlier
3258 impacted interfaces
'struct snd_card at core.h:79:1' changed (indirectly):
type size changed from 18240 to 18304 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'snd_card::controls_rwsem' changed, as reported earlier
26 ('rwlock_t ctl_files_rwlock' .. 'wait_queue_head_t power_ref_sleep') offsets changed (by +64 bits)
120 impacted interfaces
'struct snd_soc_jack at soc-jack.h:82:1' changed (indirectly):
type size changed from 1088 to 1152 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'snd_soc_jack::notifier' changed, as reported earlier
'list_head jack_zones' offset changed (by +64 bits)
45 impacted interfaces
'struct sock at sock.h:355:1' changed:
type size hasn't changed
there are data member changes:
type 'typedef u32' of 'sock::sk_tskey' changed:
typedef name changed from u32 to atomic_t at types.h:168:1
underlying type 'typedef __u32' at int-ll64.h:27:1 changed:
entity changed from 'typedef __u32' to 'struct {int counter;}' at types.h:166:1
type size hasn't changed
3258 impacted interfaces
'struct subsys_private at base.h:40:1' changed (indirectly):
type size changed from 3264 to 3328 (in bits)
there are data member changes:
type 'struct blocking_notifier_head' of 'subsys_private::bus_notifier' changed, as reported earlier
4 ('unsigned int drivers_autoprobe' .. 'class* class') offsets changed (by +64 bits)
3258 impacted interfaces
'struct super_block at fs.h:1466:1' changed (indirectly):
type size changed from 11264 to 11776 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'super_block::s_umount' changed, as reported earlier
16 ('int s_count' .. 'unsigned int s_quota_types') offsets changed (by +64 bits)
type 'struct quota_info' of 'super_block::s_dquot' changed, as reported earlier
and offset changed from 2304 to 2368 (in bits) (by +64 bits)
29 ('sb_writers s_writers' .. 'int s_stack_depth') offsets changed (by +128 bits)
4 ('spinlock_t s_inode_list_lock' .. 'list_head s_inodes_wb') offsets changed (by +512 bits)
3258 impacted interfaces
'struct tcf_block at sch_generic.h:463:1' changed (indirectly):
type size changed from 10112 to 10176 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'tcf_block::cb_lock' changed, as reported earlier
10 ('flow_block flow_block' .. 'mutex proto_destroy_lock') offsets changed (by +64 bits)
3258 impacted interfaces
'struct tty_struct at tty.h:143:1' changed (indirectly):
type size changed from 5568 to 5632 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'tty_struct::termios_rwsem' changed, as reported earlier
26 ('mutex winsize_mutex' .. 'tty_port* port') offsets changed (by +64 bits)
3258 impacted interfaces
'struct ufs_hba at ufshcd.h:808:1' changed (indirectly):
type size changed from 36992 to 37120 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'ufs_hba::clk_scaling_lock' changed, as reported earlier
9 ('unsigned char desc_size[10]' .. 'u32 crypto_cfg_register') offsets changed (by +64 bits)
type 'struct blk_keyslot_manager' of 'ufs_hba::ksm' changed, as reported earlier
and offset changed from 34688 to 34752 (in bits) (by +64 bits)
5 ('dentry* debugfs_root' .. 'bool complete_put') offsets changed (by +128 bits)
28 impacted interfaces
'struct user_namespace at user_namespace.h:66:1' changed (indirectly):
type size changed from 4800 to 4864 (in bits)
there are data member changes:
type 'struct rw_semaphore' of 'user_namespace::keyring_sem' changed, as reported earlier
5 ('work_struct work' .. 'long int ucount_max[14]') offsets changed (by +64 bits)
3258 impacted interfaces
'struct vm_area_struct at mm_types.h:326:1' changed (indirectly):
type size hasn't changed
3258 impacted interfaces
'struct vm_event_state at vmstat.h:54:1' changed:
details were reported earlier
'struct vm_fault at mm.h:531:1' changed:
type size changed from 832 to 960 (in bits)
1 data member deletion:
'union {pte_t orig_pte; pmd_t orig_pmd;}', at offset 448 (in bits) at mm.h:545:1
3 data member insertions:
'unsigned long int seq', at offset 320 (in bits) at mm.h:544:1
'pmd_t orig_pmd', at offset 384 (in bits) at mm.h:545:1
'union {pte_t orig_pte;}', at offset 576 (in bits) at mm.h:552:1
there are data member changes:
6 ('pmd_t* pmd' .. 'spinlock_t* ptl') offsets changed (by +128 bits)
type 'typedef pgtable_t' of 'vm_fault::prealloc_pte' changed:
underlying type 'page*' changed:
and offset changed from 768 to 896 (in bits) (by +128 bits)
3258 impacted interfaces
'struct vm_operations_struct at mm.h:588:1' changed:
type size changed from 896 to 960 (in bits)
1 data member insertion:
'bool speculative', at offset 896 (in bits) at mm.h:672:1
3258 impacted interfaces
'struct vsock_sock at af_vsock.h:27:1' changed (indirectly):
type size hasn't changed
there are data member changes:
type 'struct sock' of 'vsock_sock::sk' changed, as reported earlier
33 impacted interfaces
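Many of the offset changes in the report above follow from one pattern:
`struct rw_semaphore` grew from 320 to 384 bits when `android_vendor_data1`
was added at offset 320, so every member placed after an embedded
semaphore moves by 64 bits. The sketch below reproduces that arithmetic
with hypothetical stand-in structs (`sem_old`, `sem_new`, `head_old`,
`head_new`); only the sizes and the `android_vendor_data1` name come from
the report, the remaining fields are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 320-bit stand-in for the old semaphore layout. */
struct sem_old {
    uint64_t words[5];              /* 5 * 64 = 320 bits */
};

/* Same layout with the vendor field appended at bit offset 320. */
struct sem_new {
    uint64_t words[5];
    uint64_t android_vendor_data1;
};

/* Containers that embed the semaphore ahead of another member. */
struct head_old {
    struct sem_old rwsem;
    void *head;
};

struct head_new {
    struct sem_new rwsem;
    void *head;
};

static_assert(sizeof(struct sem_old) * 8 == 320, "old size in bits");
static_assert(sizeof(struct sem_new) * 8 == 384, "new size in bits");

/* Shift, in bits, of the member that follows the embedded semaphore. */
size_t head_offset_shift_bits(void)
{
    return (offsetof(struct head_new, head) -
            offsetof(struct head_old, head)) * 8;
}
```

The 64-bit shift computed here corresponds to the "offset changed (by +64
bits)" entries reported for members that follow an embedded
`rw_semaphore`; structs embedding two semaphores accumulate 128 bits.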
Bug: 226384098
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: Id923f2a1b14e9e2abab7c3cfd93fadeedc24013d
This reverts commit 407543a2ff.
It is no longer needed as we can modify the KABI at this point in time.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9dc3ecfa72fe2244fbd8fa567a6618aff9536c80
This reverts commit 165953b352.
It is no longer needed as we can modify the KABI at this point in time.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9470b62f88f1099b028c3ac24c25eb5ac3fa0ba4
This reverts commit 0e189b0893.
It is no longer needed as we can modify the KABI at this point in time.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I361d61ed4282366eea314224870ff8d02ebb7311
Enable PARAVIRT_TIME_ACCOUNTING in gki_defconfig to
support fine-grained task steal time accounting.
Bug: 223353878
Change-Id: I26d515ba0a94deacbaddad8432000c64c2cc9187
Signed-off-by: Naina Mehta <quic_nainmeht@quicinc.com>
Add reclaim_shmem_address_space to the symbol list. It is used by
drivers that want to manage shmem pages on their own.
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable
1 Added function:
[A] 'function int reclaim_shmem_address_space(address_space*)'
Bug: 201263305
Change-Id: Ice5646f5a753bd8431f394644e19e9b31a49645a
Signed-off-by: Charan Teja Reddy <quic_charante@quicinc.com>
Add functionality that allows users of shmem to reclaim its pages
without going through the kswapd/direct reclaim path. An example use
case: a device allocates a large amount of shmem pages and shares them
with hardware. To reclaim such pages faster, drivers can register
shrinkers and call reclaim_shmem_address_space().
The implementation of this function is mostly borrowed from
reclaim_address_space() implemented for per process reclaim[1].
[1] https://lore.kernel.org/patchwork/cover/378056/
Bug: 201263305
Change-Id: I03d2c3b9610612af977f89ddeabb63b8e9e50918
Signed-off-by: Charan Teja Reddy <quic_charante@quicinc.com>
This reverts commit bb592b6898.
It is no longer needed as we can modify the KABI at this point in time.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1467db884a714b1379d31d65e650efccbc17ac5c
This reverts commit beb134d21a.
It is no longer needed as we can modify the KABI at this point in time.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ibabb6d2e2a1e00d18ad2e8c39b4459ba118c7002
This reverts commit 74d434ad67.
It is no longer needed as we can modify the KABI at this point in time.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8be50594477ec53edb4005e5227c2df26218afdd
This reverts commit fc94364a70.
It is no longer needed as we can modify the KABI at this point in time.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Icb196c5ca88a1bb2dca12e132f012be16b2be11e
Userspace can call DMA_BUF_SET_NAME to set dma_buf.name, but until now
the name could not be set from the kernel side, which makes kernel
dma_buf users difficult to debug.
Some kernel users of dma_heap at MTK also need this. The camera driver,
for example, has its own allocator for other camera components, and
unlike the usual case in userspace, that allocator lives in the kernel.
To identify buffer owners while debugging, add dma_buf_set_name() so
that kernel users can set a debug name on each dmabuf and the owner can
be determined from dma_buf.name.
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable
1 Added function:
[A] 'function long int dma_buf_set_name(dma_buf*, const char*)'
Bug: 223353875
Link: https://lore.kernel.org/patchwork/patch/1459719/
Change-Id: Iac5c6b8838b9b4d976f4525d000e17a3abab94f6
Signed-off-by: Guangming Cao <Guangming.Cao@mediatek.com>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Add support for IOMMU drivers to have their own map_sg() callbacks.
This completes the path for having iommu_map_sg() invoke an IOMMU
driver's map_sg() callback, which can then invoke the io-pgtable
map_sg() callback with the entire scatter-gather list, so that it
can be processed entirely in the io-pgtable layer.
For IOMMU drivers that do not provide a callback, the default
implementation is used, which iterates through the scatter-gather list
and calls iommu_map() for each element.
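The dispatch described above can be modelled in userspace as follows.
This is only a sketch under simplifying assumptions: struct sg_entry,
struct model_iommu_ops and model_iommu_map_sg() are hypothetical
stand-ins and do not match the kernel's types or signatures.

```c
#include <stddef.h>

/* Simplified stand-in for one scatter-gather element (hypothetical). */
struct sg_entry {
    unsigned long phys;
    size_t len;
};

/* Hypothetical driver ops: map_sg is optional, map is mandatory. */
struct model_iommu_ops {
    /* Maps one contiguous range; returns 0 on success. */
    int (*map)(unsigned long iova, unsigned long phys, size_t len);
    /* Maps a whole list in one call; returns the number of bytes mapped. */
    size_t (*map_sg)(unsigned long iova, const struct sg_entry *sg,
                     size_t nents);
};

/* Returns the number of bytes mapped, or 0 on failure. */
static size_t model_iommu_map_sg(const struct model_iommu_ops *ops,
                                 unsigned long iova,
                                 const struct sg_entry *sg, size_t nents)
{
    size_t mapped = 0;

    /* Driver callback present: hand over the entire list at once. */
    if (ops->map_sg)
        return ops->map_sg(iova, sg, nents);

    /* Default path: one map() call per element. */
    for (size_t i = 0; i < nents; i++) {
        if (ops->map(iova + mapped, sg[i].phys, sg[i].len))
            return 0; /* this model does not undo earlier mappings */
        mapped += sg[i].len;
    }
    return mapped;
}
```

With a three-element list, the default path makes three map() calls,
while a driver that sets map_sg is entered exactly once.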
Bug: 190544587
Link: https://lore.kernel.org/linux-iommu/1610376862-927-1-git-send-email-isaacm@codeaurora.org/T/#t
Change-Id: I3d5a8a9e8648649d8dcdda3fa1df41d72f87a528
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
While mapping a scatter-gather list, iommu_map_sg() calls
into the IOMMU driver through an indirect call, which can
call into the io-pgtable code through another indirect call.
This sequence of going through the IOMMU core code, the IOMMU
driver, and finally the io-pgtable code, occurs for every
element in the scatter-gather list in the worst case, which
is not optimal.
Introduce a map_sg callback in the io-pgtable ops so that
IOMMU drivers can invoke it with the complete scatter-gather
list, so that it can be processed within the io-pgtable
code entirely, reducing the number of indirect calls, and
boosting overall iommu_map_sg() performance.
Bug: 190544587
Link: https://lore.kernel.org/linux-iommu/1610376862-927-1-git-send-email-isaacm@codeaurora.org/T/#t
Change-Id: I4b2088dd08eb97dcd94a6c6968082a3c4395351a
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Add a hook that lets vendors apply performance tuning for the owner
of an rwsem.
Add a hook for the rwsem waiter list so that vendors can implement
wait queue enhancements.
Add ANDROID_VENDOR_DATA to rw_semaphore.
Bug: 222402411
Signed-off-by: JianMin Liu <jian-min.liu@mediatek.com>
Signed-off-by: Jino Hsu <jino.hsu@mediatek.com>
Change-Id: I007a5e26f3db2adaeaf4e5ccea414ce7abfa83b8
A previous change [1] inlined the find_vma function, resulting in its
removal from the exported kernel symbols and replacement with
__find_vma. find_vma is now implemented in the header file and is
still available to drivers, but the exported function is now
__find_vma. This causes ABI breakage with the following error:
ERROR: Differences between ksymtab and symbol list detected!
Symbols missing from ksymtab:
- find_vma
Replace find_vma with the new __find_vma in the symbol lists.
[1] https://lore.kernel.org/all/20220128131006.67712-13-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I23fdb68b7fd4d907354fc5902dca9ddec8060319
In the speculative fault path, the page table lookup obtains the offset
at each level, reads the value at that offset and performs checks on
it. Later, to get the next-level offset, the value at the previous-level
offset is read again. A concurrent page table reclamation could change
the value at this offset in between, and accessing it anyway would
result in reading an invalid entry.
Fix this by reading from the previous-level offset again and comparing
the result with the value that was checked before performing the
next-level access.
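A minimal userspace model of the re-read-and-compare step, assuming a
two-level table. All names (struct model_top, struct model_leaf,
model_spec_walk(), model_race_hook) are hypothetical, and the hook only
simulates a concurrent reclaim in a single thread. The model does not
cover how the kernel keeps the lower-level table alive after the
comparison.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ENTRIES 4

struct model_leaf {
    uint64_t entries[MODEL_ENTRIES];
};

struct model_top {
    _Atomic(struct model_leaf *) slots[MODEL_ENTRIES];
};

/*
 * Hypothetical hook that runs between the first read and the re-read.
 * It stands in for a concurrent page table reclaim and may be NULL.
 */
typedef void (*model_race_hook)(struct model_top *top);

/* Returns true and fills *out only if the upper-level entry was stable. */
static bool model_spec_walk(struct model_top *top, unsigned int i1,
                            unsigned int i2, model_race_hook hook,
                            uint64_t *out)
{
    /* First read: snapshot the upper-level entry and check it. */
    struct model_leaf *leaf = atomic_load(&top->slots[i1]);

    if (leaf == NULL)
        return false;

    if (hook)
        hook(top);

    /* Re-read and compare before touching the next level. */
    if (atomic_load(&top->slots[i1]) != leaf)
        return false;

    *out = leaf->entries[i2];
    return true;
}
```

If the hook clears the slot between the two reads, the walk returns
false without dereferencing the next level.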
Bug: 221005439
Change-Id: I66b3d24ae79c7ee5ccce4ba7a94f028f4cf3fda0
Signed-off-by: Vijayanand Jitta <quic_vjitta@quicinc.com>
We just need to make sure f2fs_filemap_fault() doesn't block in the
speculative case as it is called with an rcu read lock held.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-33-michel@lespinasse.org/
Conflicts:
fs/f2fs/file.c
1. The change in f2fs_filemap_fault is not needed since i_mmap_sem is not
used anymore.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: If7a46e131ee38ca02a4c5b8a76ab4eb742acbe95
We just need to make sure ext4_filemap_fault() doesn't block in the
speculative case as it is called with an rcu read lock held.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-32-michel@lespinasse.org/
Conflicts:
fs/ext4/inode.c
1. The change in fs/ext4/inode.c is not needed since i_mmap_sem is not
used anymore.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Idafc81074cf7f4b31985bdb24e0cc1597c91b875
Introduce vma_can_speculate(), which allows speculative handling for
VMAs mapping supported file types.
From do_handle_mm_fault(), speculative handling will follow through
__handle_mm_fault(), handle_pte_fault() and do_fault().
At this point, we expect speculative faults to continue through one of:
- do_read_fault(), fully implemented;
- do_cow_fault(), which might abort if missing anon vmas;
- do_shared_fault(), not implemented yet
  (would require ->page_mkwrite() changes).
vma_can_speculate() provides an early abort for the do_shared_fault() case,
limiting the time spent on trying that unimplemented case.
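The early-abort decision can be sketched as a small predicate. This is
a userspace model, not the kernel function: struct model_vma and
model_vma_can_speculate() are hypothetical, and the exact order of the
checks as well as the anonymous-vma shortcut are assumptions based on
the description above.

```c
#include <stdbool.h>

/* Hypothetical, flattened view of the vma properties that matter here. */
struct model_vma {
    bool anonymous;       /* no backing file */
    bool ops_speculative; /* file type supports speculative faults */
    bool shared;          /* shared mapping */
};

static bool model_vma_can_speculate(const struct model_vma *vma,
                                    bool write_fault)
{
    /* Assumption: anonymous vmas are handled by the existing anon path. */
    if (vma->anonymous)
        return true;

    /* The file type must support speculative faults. */
    if (!vma->ops_speculative)
        return false;

    /* Write fault on a shared mapping would reach do_shared_fault(). */
    if (write_fault && vma->shared)
        return false;

    /* Remaining cases map to do_read_fault() or do_cow_fault(). */
    return true;
}
```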
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-31-michel@lespinasse.org/
Conflicts:
include/linux/vm_event_item.h
mm/vmstat.c
1. SPF_ATTEMPT_FILE is taken from https://lore.kernel.org/all/20210407014502.24091-36-michel@lespinasse.org/
since the patch posted upstream at the time had a different structure
with stats for anonymous and file-backed pagefaults introduced in a
separate patch.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I3a28af63b41b649f02f8b73d53f6494ad114ee5a
Add a speculative field to the vm_operations_struct, which indicates if
the associated file type supports speculative faults.
Initially this is set for files that implement fault() with filemap_fault().
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-30-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ic92efdf13283c45e7da7bf703f4f85f8b392ba69
In the speculative case, we know the page table already exists, and it
must be locked with pte_map_lock(). In the case where no page is found
for the given address, return VM_FAULT_RETRY which will abort the
fault before we get into the vm_ops->fault() callback. This is fine
because if filemap_map_pages does not find the page in page cache,
vm_ops->fault() will not either.
Initialize addr and last_pgoff to correspond to the pte at the original
fault address (which was mapped with pte_map_lock()), rather than the
pte at start_pgoff. The choice of initial values doesn't matter as
they will all be adjusted together before use, so they just need to be
consistent with each other, and using the original fault address and
pte allows us to reuse pte_map_lock() without any changes to it.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-29-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I0acf4f9626ec0126cdc9a95a7ff1cd735c1af2ca
Call the vm_ops->map_pages method within an rcu read locked section.
In the speculative case, verify the mmap sequence lock at the start of
the section. A match guarantees that the original vma is still valid
at that time, and that the associated vma->vm_file stays valid while
the vm_ops->map_pages() method is running.
Do not test vmf->pmd in the speculative case - we only speculate when
a page table already exists, and this saves us from having to handle
synchronization around the vmf->pmd read.
Change xfs_filemap_map_pages() to account for the fact that it can not
block anymore, as it is now running within an rcu read lock.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-28-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Id771c1e6fa9b883595a48d4df63f448a05916eda
In the speculative case, we want to avoid direct pmd checks (which
would require some extra synchronization to be safe), and rely on
pte_map_lock which will both lock the page table and verify that the
pmd has not changed from its initial value.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-27-michel@lespinasse.org/
Conflicts:
mm/memory.c
1. Merge conflict due to new vmf->prealloc_pte usage in finish_fault.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: If6046592083eaf12caf5c51c3fbb287a4dfa1ace
Extend filemap_fault() to handle speculative faults.
In the speculative case, we will only be fishing existing pages out of
the page cache. The logic we use mirrors what is done in the
non-speculative case, assuming that pages are found in the page cache,
are up to date and not already locked, and that readahead is not
necessary at this time. In all other cases, the fault is aborted to be
handled non-speculatively.
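The conditions above can be summarized as a single decision function.
This is a userspace sketch only: struct model_page, the result enum and
model_filemap_fault_spec() are hypothetical names, and the real code
operates on page cache state rather than on plain flags.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_fault_result {
    MODEL_FAULT_DONE,  /* page can be used speculatively */
    MODEL_FAULT_RETRY  /* abort, handle non-speculatively */
};

/* Hypothetical snapshot of the page cache state for one page. */
struct model_page {
    bool uptodate;
    bool locked;
};

/* page is NULL when nothing was found in the page cache. */
static enum model_fault_result
model_filemap_fault_spec(const struct model_page *page, bool needs_readahead)
{
    if (page == NULL)
        return MODEL_FAULT_RETRY;
    if (!page->uptodate || page->locked)
        return MODEL_FAULT_RETRY;
    if (needs_readahead)
        return MODEL_FAULT_RETRY;
    return MODEL_FAULT_DONE;
}
```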
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-26-michel@lespinasse.org/
Conflicts:
mm/filemap.c
1. Added back file_ra_state variable used by SPF path.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I82eba7fcfc81876245c2e65bc5ae3d33ddfcc368
In the speculative case, call the vm_ops->fault() method from within
an rcu read locked section, and verify the mmap sequence lock at the
start of the section. A match guarantees that the original vma is still
valid at that time, and that the associated vma->vm_file stays valid
while the vm_ops->fault() method is running.
Note that this implies that speculative faults can not sleep within
the vm_ops->fault method. We will only attempt to fetch existing pages
from the page cache during speculative faults; any miss (or prefetch)
will be handled by falling back to non-speculative fault handling.
The speculative handling case also does not preallocate page tables,
as it is always called with a pre-existing page table.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-25-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I995ba94d8e96014ef83ac93fe5a4669afcde34b9
Attempt speculative mm fault handling first, and fall back to the
existing (non-speculative) code if that fails.
This follows the lines of the x86 speculative fault handling code,
but with some minor arch differences such as the way that the
access_pkey_error case is handled.
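The try-then-fall-back control flow can be sketched in userspace as
below. The names (model_fault_fn, model_handle_fault() and the
MODEL_FAULT_* values) are hypothetical; the sketch assumes that the
speculative handler signals an abort with a retry code and omits all
arch-specific error handling.

```c
/* Hypothetical result codes; RETRY means the speculative attempt aborted. */
enum model_arch_fault {
    MODEL_FAULT_OK = 0,
    MODEL_FAULT_ABORTED = 1,
    MODEL_FAULT_ERROR = 2
};

typedef enum model_arch_fault (*model_fault_fn)(unsigned long addr);

static enum model_arch_fault model_handle_fault(model_fault_fn speculative,
                                                model_fault_fn regular,
                                                unsigned long addr)
{
    /* Attempt the speculative handler first. */
    enum model_arch_fault ret = speculative(addr);

    /* Any result other than an abort is final. */
    if (ret != MODEL_FAULT_ABORTED)
        return ret;

    /* Fall back to the existing (non-speculative) handler. */
    return regular(addr);
}
```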
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-36-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ic12bc3d5070d1502fc5df182a19c92b4a8d59723
Attempt speculative mm fault handling first, and fall back to the
existing (non-speculative) code if that fails.
This follows the lines of the x86 speculative fault handling code,
but with some minor arch differences such as the way that the
VM_FAULT_BADACCESS case is handled.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-34-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Iccd87036b15eebf2ff28fbb8022b07c9f91d7353
Split off the definitions necessary to update event counters from vmstat.h
into a new vm_event.h file.
The rationale is to allow header files included from mm.h to update
counter events. vmstat.h can not be included from such header files,
because it refers to page_pgdat() which is only defined later down
in mm.h, and thus results in compile errors. vm_event.h does not refer
to page_pgdat() and thus does not result in such errors.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-31-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ie70dd435b3dcbad80a4a9bfc294b78a9107c1ac2
Performance tuning: as single threaded userspace does not use
speculative page faults, it does not require rcu safe vma freeing.
Turn this off to avoid the related (small) extra overheads.
For multi threaded userspace, we often see a performance benefit from
the rcu safe vma freeing - even in tests that do not have any frequent
concurrent page faults! This is because rcu safe vma freeing prevents
recently released vmas from being immediately reused in a new thread.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-30-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I81ef7ab43e2757f268c567d5bfe6ab02f1e43a1c
In handle_pte_fault(), allow speculative execution to proceed.
Use pte_spinlock() to validate the mmap sequence count when locking
the page table.
If speculative execution proceeds through do_wp_page(), ensure that we
end up in the wp_page_reuse() or wp_page_copy() paths, rather than
wp_pfn_shared() or wp_page_shared() (both unreachable as we only
handle anon vmas so far) or handle_userfault() (needs an explicit
abort to handle non-speculatively).
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-28-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ia45d095ec7b8e23f1c5d68b7a7f572a3f6f6df97
Change wp_page_copy() to handle the speculative case. This involves
aborting speculative faults if they have to allocate an anon_vma,
read-locking the mmu_notifier_lock to avoid races with
mmu_notifier_register(), and using pte_map_lock() instead of
pte_offset_map_lock() to complete the page fault.
Also change call sites to clear vmf->pte after unmapping the page table,
in order to satisfy pte_map_lock()'s preconditions.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-27-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Icd2188e9facf5a7fea42000a2808bcda1ad6f0fc
Introduce mmu_notifier_lock as a per-mm percpu_rw_semaphore,
as well as the code to initialize and destroy it together with the mm.
This lock will be used to prevent races between mmu_notifier_register()
and speculative fault handlers that need to fire MMU notifications
without holding any of the mmap or rmap locks.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-24-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I453ebe979c8b9dcc6159b41c5ec7a1ea17d85ee2
Change handle_pte_fault() to allow speculative fault execution to proceed
through do_numa_page().
do_swap_page() does not implement speculative execution yet, so it
needs to abort with VM_FAULT_RETRY in that case.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-22-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I0390331facc9ecd37534012abdd9f255ab5bbb12
In the x86 fault handler, only attempt spf if the vma is anonymous.
In do_handle_mm_fault(), let speculative page faults proceed as long
as they fall into anonymous vmas. This enables the speculative
handling code in __handle_mm_fault() and do_anonymous_page().
In handle_pte_fault(), if vmf->pte is set (the original pte was not
pte_none), catch speculative faults and return VM_FAULT_RETRY as
those cases are not implemented yet. Also assert that do_fault()
is not reached in the speculative case.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-20-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I875106fcfa1084f570c2bf8f24a129bdce55316b
Change do_anonymous_page() to handle the speculative case.
This involves aborting speculative faults if they have to allocate a new
anon_vma, and using pte_map_lock() instead of pte_offset_map_lock()
to complete the page fault.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-19-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I5ad955323faabc142c21f62415db039ac889066a
pte_map_lock() and pte_spinlock() are used by fault handlers to ensure
the pte is mapped and locked before they commit the faulted page to the
mm's address space at the end of the fault.
The functions differ in their preconditions; pte_map_lock() expects
the pte to be unmapped prior to the call, while pte_spinlock() expects
it to be already mapped.
In the speculative fault case, the functions verify, after locking the pte,
that the mmap sequence count has not changed since the start of the fault,
and thus that no mmap lock writers have been running concurrently with
the fault. After that point the page table lock serializes any further
races with concurrent mmap lock writers.
If the mmap sequence count check fails, both functions will return false
with the pte being left unmapped and unlocked.
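The lock-then-validate behavior can be modelled in userspace as below.
This is a sketch under stated assumptions: struct model_mm, struct
model_fault and model_pte_map_lock() are hypothetical, and the mapping
and the page table lock are reduced to boolean flags, so no real
locking or concurrency is involved.

```c
#include <stdbool.h>

struct model_mm {
    unsigned long mmap_seq; /* bumped by mmap lock writers */
};

struct model_fault {
    struct model_mm *mm;
    unsigned long seq;  /* mmap_seq snapshot taken at fault start */
    bool speculative;
    bool pte_mapped;    /* stands in for the pte mapping */
    bool ptl_locked;    /* stands in for the page table lock */
};

/* Precondition in this model: the pte is unmapped and unlocked. */
static bool model_pte_map_lock(struct model_fault *f)
{
    f->pte_mapped = true;
    f->ptl_locked = true;

    /* Speculative case only: validate the sequence count after locking. */
    if (f->speculative && f->mm->mmap_seq != f->seq) {
        /* On failure, leave the pte unmapped and unlocked. */
        f->ptl_locked = false;
        f->pte_mapped = false;
        return false;
    }
    return true;
}
```

A caller that gets false back would abort the speculative fault, which
matches the VM_FAULT_RETRY handling described for the fault handlers.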
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-18-michel@lespinasse.org/
Conflicts:
include/linux/mm.h
1. Fixed pte_map_lock and pte_spinlock macros not to fail when
CONFIG_SPECULATIVE_PAGE_FAULT=n
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ibd7ccc2ead4fdf29f28c7657b312b2f677ac8836