This reverts commit 06b6de69cf which is
commit a74f8d0aa9 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: If238e0b195375789db1828f9c3e940638f7ab2bf
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 3191a00dbe which is
commit 9355b60e40 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I885ce7e2307f14ecc49ec45c1e1ebd288ca70407
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 7ce0a888d6 which is
commit 1045f5f1ff upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I76bafe33cfdf2c51488aa4850b71bd26305ed35f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit d16ae91186 which is
commit 67df411db3 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I3be3b72b2133285590c3c005ecef2117c2f5fcb4
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit f1a68c6a41 which is
commit fd28941cff upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I28617f5a45e886bbb403d55f1eeedc294c00540d
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 80326ce1eb which is
commit 4a63e68a29 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: Ic490aa561e1cb9e974d0bda07106ddaeb35b2e4a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit f354086d1b which is
commit 7822baa844a87cbb93308c1032c3d47d4079bb8a upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: Ib0af4aaa39f5b3f52fddfa0916b06b6046299a6a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit a6f53df52b which is
commit 03a8b0df75 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I7ca9e21f3364783bb57fb574f67facb8d0984ec2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit bfd36b1d18 which is
commit 291e9da914 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I087bfd704d03c5aef4cf14d6130d0c310ea9313a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 36dba3f4cd which is
commit dfd5fe19db upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: Ia856f7ed9677df6e5b5db7d03470fea74b794a17
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit cad6da86ca which is
commit 668abe6dc7b61941fa5c724c06797efb0b87f070 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I6078d6603643beca8acd8e8330d0d5e2c725e41c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Changes in 5.15.152
mmc: mmci: stm32: use a buffer for unaligned DMA requests
mmc: mmci: stm32: fix DMA API overlapping mappings warning
net: lan78xx: fix runtime PM count underflow on link stop
ixgbe: {dis, en}able irqs in ixgbe_txrx_ring_{dis, en}able
i40e: disable NAPI right after disabling irqs when handling xsk_pool
tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string
geneve: make sure to pull inner header in geneve_rx()
net: sparx5: Fix use after free inside sparx5_del_mact_entry
net: ice: Fix potential NULL pointer dereference in ice_bridge_setlink()
net/ipv6: avoid possible UAF in ip6_route_mpath_notify()
cpumap: Zero-initialise xdp_rxq_info struct before running XDP program
net/rds: fix WARNING in rds_conn_connect_if_down
netfilter: nft_ct: fix l3num expectations with inet pseudo family
netfilter: nf_conntrack_h323: Add protection for bmp length out of range
erofs: apply proper VMA alignment for memory mapped files on THP
netrom: Fix a data-race around sysctl_netrom_default_path_quality
netrom: Fix a data-race around sysctl_netrom_obsolescence_count_initialiser
netrom: Fix data-races around sysctl_netrom_network_ttl_initialiser
netrom: Fix a data-race around sysctl_netrom_transport_timeout
netrom: Fix a data-race around sysctl_netrom_transport_maximum_tries
netrom: Fix a data-race around sysctl_netrom_transport_acknowledge_delay
netrom: Fix a data-race around sysctl_netrom_transport_busy_delay
netrom: Fix a data-race around sysctl_netrom_transport_requested_window_size
netrom: Fix a data-race around sysctl_netrom_transport_no_activity_timeout
netrom: Fix a data-race around sysctl_netrom_routing_control
netrom: Fix a data-race around sysctl_netrom_link_fails_count
netrom: Fix data-races around sysctl_net_busy_read
ALSA: usb-audio: Refcount multiple accesses on the single clock
ALSA: usb-audio: Clear fixed clock rate at closing EP
ALSA: usb-audio: Split endpoint setups for hw_params and prepare (take#2)
ALSA: usb-audio: Properly refcounting clock rate
ALSA: usb-audio: Apply mutex around snd_usb_endpoint_set_params()
ALSA: usb-audio: Correct the return code from snd_usb_endpoint_set_params()
ALSA: usb-audio: Avoid superfluous endpoint setup
ALSA: usb-audio: Add quirk for Tascam Model 12
ALSA: usb-audio: Add new quirk FIXED_RATE for JBL Quantum810 Wireless
ALSA: usb-audio: Fix microphone sound on Nexigo webcam.
ALSA: usb-audio: add quirk for RODE NT-USB+
drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions
nfp: flower: add goto_chain_index for ct entry
nfp: flower: add hardware offload check for post ct entry
selftests/mm: switch to bash from sh
selftests: mm: fix map_hugetlb failure on 64K page size systems
xhci: process isoc TD properly when there was a transaction error mid TD.
xhci: handle isoc Babble and Buffer Overrun events properly
serial: max310x: use regmap methods for SPI batch operations
serial: max310x: use a separate regmap for each port
serial: max310x: prevent infinite while() loop in port startup
drm/amd/pm: do not expose the API used internally only in kv_dpm.c
drm/amdgpu: Reset IH OVERFLOW_CLEAR bit
selftests: mptcp: decrease BW in simult flows
hv_netvsc: use netif_is_bond_master() instead of open code
hv_netvsc: Register VF in netvsc_probe if NET_DEVICE_REGISTER missed
drm/amd/display: Re-arrange FPU code structure for dcn2x
drm/amd/display: move calcs folder into DML
drm/amd/display: remove DML Makefile duplicate lines
drm/amd/display: Increase frame-larger-than for all display_mode_vba files
getrusage: add the "signal_struct *sig" local variable
getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()
getrusage: use __for_each_thread()
getrusage: use sig->stats_lock rather than lock_task_sighand()
proc: Use task_is_running() for wchan in /proc/$pid/stat
fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()
ALSA: usb-audio: Fix wrong kfree issue in snd_usb_endpoint_free_all
ALSA: usb-audio: Always initialize fixed_rate in snd_usb_find_implicit_fb_sync_format()
ALSA: usb-audio: Add FIXED_RATE quirk for JBL Quantum610 Wireless
ALSA: usb-audio: Sort quirk table entries
regmap: allow to define reg_update_bits for no bus configuration
regmap: Add bulk read/write callbacks into regmap_config
serial: max310x: make accessing revision id interface-agnostic
serial: max310x: fix IO data corruption in batched operations
Linux 5.15.152
Change-Id: Ie781753ba68d7fd0ed65ac7f274fc22bfc7da932
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Changes in 5.15.151
netfilter: nf_tables: disallow timeout for anonymous sets
mtd: spinand: gigadevice: Fix the get ecc status issue
netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter
net: ip_tunnel: prevent perpetual headroom growth
tun: Fix xdp_rxq_info's queue_index when detaching
cpufreq: intel_pstate: fix pstate limits enforcement for adjust_perf call back
net: veth: clear GRO when clearing XDP even when down
ipv6: fix potential "struct net" leak in inet6_rtm_getaddr()
lan78xx: enable auto speed configuration for LAN7850 if no EEPROM is detected
net: enable memcg accounting for veth queues
veth: try harder when allocating queue memory
net: usb: dm9601: fix wrong return value in dm9601_mdio_read
uapi: in6: replace temporary label with rfc9486
stmmac: Clear variable when destroying workqueue
Bluetooth: Avoid potential use-after-free in hci_error_reset
Bluetooth: hci_event: Fix wrongly recorded wakeup BD_ADDR
Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
Bluetooth: Enforce validation on max value of connection interval
netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
netfilter: nfnetlink_queue: silence bogus compiler warning
netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook
netfilter: make function op structures const
netfilter: let reset rules clean out conntrack entries
netfilter: bridge: confirm multicast packets before passing them up the stack
rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
igb: extend PTP timestamp adjustments to i211
tls: rx: don't store the record type in socket context
tls: rx: don't store the decryption status in socket context
tls: rx: don't issue wake ups when data is decrypted
tls: rx: refactor decrypt_skb_update()
tls: hw: rx: use return value of tls_device_decrypted() to carry status
tls: rx: drop unnecessary arguments from tls_setup_from_iter()
tls: rx: don't report text length from the bowels of decrypt
tls: rx: wrap decryption arguments in a structure
tls: rx: factor out writing ContentType to cmsg
tls: rx: don't track the async count
tls: rx: move counting TlsDecryptErrors for sync
tls: rx: assume crypto always calls our callback
tls: rx: use async as an in-out argument
tls: decrement decrypt_pending if no async completion will be called
efi/capsule-loader: fix incorrect allocation size
power: supply: bq27xxx-i2c: Do not free non existing IRQ
ALSA: Drop leftover snd-rtctimer stuff from Makefile
fbcon: always restore the old font data in fbcon_do_set_font()
afs: Fix endless loop in directory parsing
riscv: Sparse-Memory/vmemmap out-of-bounds fix
tomoyo: fix UAF write bug in tomoyo_write_control()
ALSA: firewire-lib: fix to check cycle continuity
gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
wifi: nl80211: reject iftype change with mesh ID change
btrfs: dev-replace: properly validate device names
dmaengine: fsl-qdma: fix SoC may hang on 16 byte unaligned read
dmaengine: ptdma: use consistent DMA masks
dmaengine: fsl-qdma: init irq after reg initialization
mmc: core: Fix eMMC initialization with 1-bit bus connection
mmc: sdhci-xenon: add timeout for PHY init complete
mmc: sdhci-xenon: fix PHY init clock stability
pmdomain: qcom: rpmhpd: Fix enabled_corner aggregation
x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers
mptcp: move __mptcp_error_report in protocol.c
mptcp: process pending subflow error on close
mptcp: rename timer related helper to less confusing names
selftests: mptcp: add missing kconfig for NF Filter
selftests: mptcp: add missing kconfig for NF Filter in v6
mptcp: clean up harmless false expressions
mptcp: add needs_id for netlink appending addr
mptcp: push at DSS boundaries
mptcp: fix possible deadlock in subflow diag
cachefiles: fix memory leak in cachefiles_add_cache()
fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super
Revert "drm/bridge: lt8912b: Register and attach our DSI device at probe"
af_unix: Drop oob_skb ref before purging queue in GC.
gpio: 74x164: Enable output pins after registers are reset
gpiolib: Fix the error path order in gpiochip_add_data_with_key()
gpio: fix resource unwinding order in error path
Revert "interconnect: Fix locking for runpm vs reclaim"
Revert "interconnect: Teach lockdep about icc_bw_lock order"
bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup
bpf: Add table ID to bpf_fib_lookup BPF helper
bpf: Derive source IP addr via bpf_*_fib_lookup()
net: tls: fix async vs NIC crypto offload
Revert "tls: rx: move counting TlsDecryptErrors for sync"
mptcp: fix double-free on socket dismantle
Linux 5.15.151
Change-Id: I1ed8819c9b7e60991bc7f8afa3e4017dc74560c8
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Catch up with changes made in android14-5.15, including new symbols to
track the ABI. Changes included here are the following:
* d83231efe4 ANDROID: 16K: Handle pad VMA splits and merges
* 19d6e7eb47 ANDROID: 16K: madvise_vma_pad_pages: Remove filemap_fault check
* ae44e8dac8 ANDROID: 16K: Only madvise padding from dynamic linker context
* ae67f18944 ANDROID: Enable CONFIG_LAZY_RCU in x86 gki_defconfig
* d38091b4ff ANDROID: Enable CONFIG_LAZY_RCU in arm64 gki_defconfig
* 37b02c190c FROMLIST: rcu: Provide a boot time parameter to control lazy RCU
* 4adb60810c ANDROID: rcu: Add a minimum time for marking boot as completed
* 16ea06fe44 UPSTREAM: rcu/kvfree: Move need_offload_krc() out of krcp->lock
* 5d1a3986c2 UPSTREAM: rcu/kfree: Fix kfree_rcu_shrink_count() return value
* 88587c1838 UPSTREAM: rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
* 5b47d8411d UPSTREAM: rcu/kvfree: Remove useless monitor_todo flag
* 84828604c7 UPSTREAM: scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu()
* a4124a21b1 ANDROID: rxrpc: Use call_rcu_hurry() instead of call_rcu()
* 930bdc0924 UPSTREAM: net: devinet: Reduce refcount before grace period
* 706e751b33 UPSTREAM: rcu: Disable laziness if lazy-tracking says so
* 8568593719 UPSTREAM: rcu: Track laziness during boot and suspend
* f12c162eac UPSTREAM: net: Use call_rcu_hurry() for dst_release()
* ff22b562f0 UPSTREAM: percpu-refcount: Use call_rcu_hurry() for atomic switch
* a4cc1aa22d UPSTREAM: rcu/sync: Use call_rcu_hurry() instead of call_rcu
* 222a4cd66c UPSTREAM: rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
* f4abe7bb5f BACKPORT: rcu: Shrinker for lazy rcu
* e0297c38a5 BACKPORT: rcu: Make call_rcu() lazy to save power
* 276d33f21a UPSTREAM: rcu: Fix late wakeup when flush of bypass cblist happens
* 24e6758060 BACKPORT: rcu: Fix missing nocb gp wake on rcu_barrier()
* fb310d468a UPSTREAM: netfilter: nft_set_pipapo: do not free live element
* 444a497469 ANDROID: GKI: Update lenovo symbol list
* 978f805a2d ANDROID: GKI: Export css_task_iter_start()
* 0ae4f32634 FROMGIT: coresight: etm4x: Fix access to resource selector registers
* 8ba1802287 BACKPORT: FROMGIT: coresight: etm4x: Safe access for TRCQCLTR
* 6a08c9fb9d FROMGIT: coresight: etm4x: Do not save/restore Data trace control registers
* a02278f990 FROMGIT: coresight: etm4x: Do not hardcode IOMEM access for register restore
* e8e652b8c8 UPSTREAM: af_unix: Fix garbage collector racing against connect()
* 65e0a92c6d UPSTREAM: af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
* 5725caa296 FROMLIST: scsi: ufs: Check for completion from the timeout handler
* 8563ce5895 BACKPORT: FROMLIST: scsi: ufs: Make the polling code report which command has been completed
* 0fcd7a1c7c BACKPORT: FROMLIST: scsi: ufs: Make ufshcd_poll() complain about unsupported arguments
* aa07d6b28d ANDROID: scsi: ufs: Unexport ufshcd_mcq_poll_cqe_nolock()
* 25ebc09178 ANDROID: mm: fix incorrect unlock mmap_lock for speculative swap fault
* 264477e0d8 ANDROID: Update the ABI symbol list
* 084d22016c ANDROID: 16K: Separate padding from ELF LOAD segment mappings
* 37ea0e8485 ANDROID: 16K: Exclude ELF padding for fault around range
* e7bff50b22 ANDROID: 16K: Use MADV_DONTNEED to save VMA padding pages.
* 38cccb9154 ANDROID: 16K: Introduce ELF padding representation for VMAs
* 9274c308d8 ANDROID: 16K: Introduce /sys/kernel/mm/pgsize_migration/enabled
* ceb8c595f8 UPSTREAM: netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
* ea419cda5c UPSTREAM: netfilter: nf_tables: release batch on table validation from abort path
* 6b883cdac2 UPSTREAM: netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout
* f395ea0980 ANDROID: GKI: update mtktv symbol
* a5d03f57d6 UPSTREAM: netfilter: nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
* 0cf6fdfb0a UPSTREAM: HID: playstation: support updated DualSense rumble mode.
* e3da19b218 UPSTREAM: HID: playstation: stop DualSense output work on remove.
* 62085a0e6d UPSTREAM: HID: playstation: convert to use dev_groups
* adce8aae67 UPSTREAM: HID: playstation: fix return from dualsense_player_led_set_brightness()
* c996cb50e2 UPSTREAM: HID: playstation: expose DualSense player LEDs through LED class.
* f011142fea UPSTREAM: leds: add new LED_FUNCTION_PLAYER for player LEDs for game controllers.
* 19cbe31642 UPSTREAM: HID: playstation: expose DualSense lightbar through a multi-color LED.
* 3507c287a6 UPSTREAM: mm: update mark_victim tracepoints fields
* cd4da4b748 Revert "FROMGIT: mm: update mark_victim tracepoints fields"
* 948f42ca2b UPSTREAM: netfilter: nft_set_pipapo: release elements in clone only from destroy path
* 6a45518094 ANDROID: GKI: Update symbol list for Amlogic
* 3de9177e81 ANDROID: GKI: Update symbol list for lenovo
* 668dfb812d FROMLIST: binder: check offset alignment in binder_get_object()
* 3b3c1c80e8 ANDROID: GKI: Update the ABI symbol list
* f600c62d25 ANDROID: GKI: Update symbol list for Amlogic
* d154026d33 ANDROID: GKI: Update the ABI symbol list
* 5f12c91ab0 Merge tag 'android14-5.15.148_r00' into android14-5.15
* ec86765bae ANDROID: KVM: arm64: Fix TLB invalidation when coalescing into a block
* 5854f4c2af ANDROID: KVM: arm64: Fix missing trace event for nVHE dyn HVCs
* 865e6d9df1 UPSTREAM: netfilter: nf_tables: disallow timeout for anonymous sets
* 537e133918 UPSTREAM: arm64: Apply dynamic shadow call stack patching in two passes
* 96305e30e9 ANDROID: userfaultfd: abort uffdio ops if mmap_lock is contended
* 3673533a09 ANDROID: userfaultfd: add MMAP_TRYLOCK mode for COPY/ZEROPAGE
* 3fd32dc171 ANDROID: fix isolate_migratepages_range return value
* 483395b445 Revert "ANDROID: Add CONFIG_BLK_DEV_NULL_BLK=m to gki_defconfig"
* 7b301c7079 ANDROID: fips140 - fix integrity check by unapplying dynamic SCS
* b1f8c25026 ANDROID: fips140 - add option for debugging the integrity check
* 1225d7ed6c ANDROID: fuse-bpf: Fix readdir for getdents
* 37b83a89de BACKPORT: f2fs: split initial and dynamic conditions for extent_cache
* ac4797cea5 UPSTREAM: usb: typec: altmodes/displayport: create sysfs nodes as driver's default device attribute group
* 5aed5c3435 ANDROID: uid_sys_stat: fix data-error of cputime and io
* c3b70e94f1 UPSTREAM: usb: typec: class: fix typec_altmode_put_partner to put plugs
* 282bfc6c30 UPSTREAM: Revert "usb: typec: class: fix typec_altmode_put_partner to put plugs"
* 2390d58862 ANDROID: GKI: Update the ABI symbol list
* 0d0784d6b2 ANDROID: Update ABI for userfaultfd_ctx
* ee9964b308 ANDROID: userfaultfd: allow SPF for UFFD_FEATURE_SIGBUS on private+anon
* 9cef46f39e ANDROID: remove LTO check from build.config.gki.aarch64.fips140
* b74b4cbe62 Revert "interconnect: Fix locking for runpm vs reclaim"
* f115661832 Revert "interconnect: Teach lockdep about icc_bw_lock order"
* d96725ec1a BACKPORT: FROMGIT: PM: runtime: add tracepoint for runtime_status changes
* 4403e2517a UPSTREAM: netfilter: nft_set_rbtree: skip end interval element from gc
* 288abb8b19 ANDROID: PCI: dwc: Wait for the link only if it has been started
* ff1e211db6 ANDROID: null_blk: Support configuring the maximum segment size
* 0ffd03e67d ANDROID: scsi_debug: Support configuring the maximum segment size
* 3ef8e9009c ANDROID: block: Make sub_page_limit_queues available in debugfs
* bed88e7c4f ANDROID: block: Add support for filesystem requests and small segments
* e99e7de8a6 ANDROID: block: Support submitting passthrough requests with small segments
* 3f6018f1b6 ANDROID: block: Support configuring limits below the page size
* 025c278e84 ANDROID: block: Prepare for supporting sub-page limits
* f56ddffe05 ANDROID: block: Use pr_info() instead of printk(KERN_INFO ...)
Change-Id: I6834aac2be94f461b9f59baa696d5d130fc295d9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
In commit 2a52590ac5 ("virtio-blk: Ensure no requests in virtqueues
before deleting vqs.") the virtio_blk driver adds calls to
blk_mq_freeze_queue and blk_mq_unfreeze_queue, so add them to the
virtual device symbol list so that that target will build properly.
Fixes: 2a52590ac5 ("virtio-blk: Ensure no requests in virtqueues before deleting vqs.")
Change-Id: Iaf7ef825414a5bc3db36cd9479acb0c1a7435e11
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
In some cases a VMA with padding representation may be split, and
therefore the padding flags must be updated accordingly.
There are 3 cases to handle:
Given:
| DDDDPPPP |
where:
- D represents 1 page of data;
- P represents 1 page of padding;
- | represents the boundaries (start/end) of the VMA
1) Split exactly at the padding boundary
| DDDDPPPP | --> | DDDD | PPPP |
- Remove padding flags from the first VMA.
- The second VMA is all padding
2) Split within the padding area
| DDDDPPPP | --> | DDDDPP | PP |
- Subtract the length of the second VMA from the first VMA's
padding.
- The second VMA is all padding, adjust its padding length (flags)
3) Split within the data area
| DDDDPPPP | --> | DD | DDPPPP |
- Remove padding flags from the first VMA.
- The second VMA has the same padding as before the split.
To simplify the semantics, merging of padding VMAs is not allowed.
If a split produces a VMA that is entirely padding, show_[s]maps()
only outputs the padding VMA entry (as the data entry is of length 0).
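The three split cases above can be sketched as a small userspace model. This is only an illustration under stated assumptions: the struct, its page-count fields, and split_vma_model() are hypothetical names, and the kernel tracks padding in VMA flags rather than in a separate field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a VMA measured in pages, with trailing padding. */
struct vma_model {
	size_t total_pages;	/* data pages + padding pages */
	size_t pad_pages;	/* trailing padding pages, 0 if none */
};

/*
 * Split `orig` at page offset `at` (0 < at < total_pages) and fill in
 * the padding of the two resulting VMAs following the three cases.
 */
static void split_vma_model(const struct vma_model *orig, size_t at,
			    struct vma_model *first,
			    struct vma_model *second)
{
	size_t data_pages;

	assert(orig->pad_pages <= orig->total_pages);
	assert(at > 0 && at < orig->total_pages);

	data_pages = orig->total_pages - orig->pad_pages;
	first->total_pages = at;
	second->total_pages = orig->total_pages - at;

	if (at >= data_pages) {
		/*
		 * Cases 1 and 2: the split is at or inside the padding.
		 * The second VMA is all padding; the first keeps what is
		 * left (zero when the split is exactly at the boundary).
		 */
		second->pad_pages = second->total_pages;
		first->pad_pages = orig->pad_pages - second->total_pages;
	} else {
		/* Case 3: the split is inside the data area. */
		first->pad_pages = 0;
		second->pad_pages = orig->pad_pages;
	}
}
```

For | DDDDPPPP | (8 pages, 4 of padding), splitting at page 6 leaves the first VMA with 2 padding pages and the second VMA as 2 pages of padding.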
Bug: 330117029
Bug: 327600007
Bug: 330767927
Bug: 328266487
Bug: 329803029
Change-Id: Ie2628ced5512e2c7f8af25fabae1f38730c8bb1a
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Some file systems like F2FS use a custom filemap_fault ops. Remove this
check, as checking vm_file is sufficient.
Bug: 330117029
Bug: 327600007
Bug: 330767927
Bug: 328266487
Bug: 329803029
Change-Id: Id6a584d934f06650c0a95afd1823669fc77ba2c2
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Only perform padding advice from the execution context of bionic's
dynamic linker. This ensures that madvise() doesn't have unwanted
side effects.
Also reorder the failure checks in madvise_vma_pad_pages()
by ascending cost.
Bug: 330117029
Bug: 327600007
Bug: 330767927
Bug: 328266487
Bug: 329803029
Change-Id: I3e05b8780c6eda78007f86b613f8c11dd18ac28f
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
It is still disabled by default. Both rcutree.android_enable_rcu_lazy
and rcu_nocbs=all must be specified as boot time parameters to actually
enable it.
Bug: 258241771
Change-Id: Ic9e15b846d58ffa3d5dd81842c568da79352ff2d
Signed-off-by: Qais Yousef <qyousef@google.com>
It is still disabled by default. Both rcutree.android_enable_rcu_lazy
and rcu_nocbs=all must be specified as boot time parameters to actually
enable it.
Bug: 258241771
Change-Id: I11c920aa5edde2fc42ab54245cd198eb8cb47616
Signed-off-by: Qais Yousef <qyousef@google.com>
To allow more flexible arrangements while still providing a single kernel
for distros, provide a boot time parameter to enable/disable lazy RCU.
Specify:
rcutree.enable_rcu_lazy=[y|1|n|0]
Which also requires
rcu_nocbs=all
at boot time to enable/disable lazy RCU.
To disable it by default at build time when CONFIG_RCU_LAZY=y, the new
CONFIG_RCU_LAZY_DEFAULT_OFF can be used.
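How the switch, the build-time default and rcu_nocbs=all might combine can be sketched in userspace C. This is a simplified model, not the kernel's parameter handling: parse_lazy_switch() and lazy_rcu_enabled() are hypothetical names, and only the four values listed above are accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical parser for the [y|1|n|0] switch. Returns 0 and stores
 * the result on success; returns -1 and leaves *enabled untouched
 * for any other value.
 */
static int parse_lazy_switch(const char *val, bool *enabled)
{
	if (val == NULL || val[0] == '\0' || val[1] != '\0')
		return -1;

	switch (val[0]) {
	case 'y':
	case '1':
		*enabled = true;
		return 0;
	case 'n':
	case '0':
		*enabled = false;
		return 0;
	default:
		return -1;
	}
}

/*
 * Assumed combination: start from the build-time default, let the
 * boot parameter (if given and valid) override it, and require
 * rcu_nocbs=all for laziness to take effect.
 */
static bool lazy_rcu_enabled(const char *param, bool default_off,
			     bool nocbs_all)
{
	bool enabled = !default_off;

	if (param != NULL)
		(void)parse_lazy_switch(param, &enabled);

	return enabled && nocbs_all;
}
```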
Bug: 258241771
Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/lkml/20231203011252.233748-1-qyousef@layalina.io/
[Fix trivial conflicts rejecting newer code that doesn't exist on 5.15]
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ib5585ae717a2ba7749f2802101b785c4e5de8a90
On many systems, a great deal of boot (in userspace) happens after the
kernel thinks the boot has completed. It is difficult to determine if
the system has really booted from the kernel side. Some features like
lazy-RCU can risk slowing down boot time if, say, a callback has been
added that the boot synchronously depends on. Furthermore, expedited
callbacks can get unexpedited far earlier than they should be, thus
slowing down boot (as shown in the data below).
For these reasons, this commit adds a config option
'CONFIG_RCU_BOOT_END_DELAY' and a boot parameter rcupdate.boot_end_delay.
Userspace can also make RCU view the system as booted by writing the
time in milliseconds to: /sys/module/rcupdate/parameters/android_rcu_boot_end_delay
Or even just writing a value of 0 to this sysfs node.
However, under no circumstance will the boot be allowed to end earlier
than just before init is launched.
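The rule described above can be summarised with a simplified model. It is a sketch under assumptions, not the patch's implementation: rcu_boot_ended_model() and its arguments are hypothetical, and the real code presumably works with timers rather than a polled predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model: RCU treats boot as ended only once the kernel has
 * reached the point just before init is launched AND the configured
 * delay (in milliseconds since boot) has elapsed. Writing 0 as the
 * delay therefore ends boot as soon as init has been reached, but
 * never earlier.
 */
static bool rcu_boot_ended_model(uint64_t uptime_ms, uint64_t delay_ms,
				 bool reached_init)
{
	if (!reached_init)
		return false;

	return uptime_ms >= delay_ms;
}
```

With the default of 15 seconds, rcu_boot_ended_model(8000, 15000, true) is false, and it becomes true once userspace lowers the delay to 0.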
The default value of CONFIG_RCU_BOOT_END_DELAY is chosen as 15s. This
suits ChromeOS and also a PREEMPT_RT system below very well, which need
no config or parameter changes, and just a simple application of this
patch. A system designer can also choose a specific value here to keep
RCU from marking boot completion. As noted earlier, RCU will not treat
the system as booted until at least
android_rcu_boot_end_delay milliseconds have passed or an update is made
via writing a small value (or 0) in milliseconds to:
/sys/module/rcupdate/parameters/android_rcu_boot_end_delay.
One side effect of this patch is a risk that a real-time workload
launched just after the kernel boots will suffer interruptions due to expedited
RCU, which previously ended just before init was launched. However, to mitigate
such an issue (however unlikely), the user should either tune
CONFIG_RCU_BOOT_END_DELAY to a smaller value than 15 seconds or write a value
of 0 to /sys/module/rcupdate/parameters/android_rcu_boot_end_delay, once userspace
boots, and before launching the real-time workload.
Qiuxu also noted impressive boot-time improvements with an earlier version
of the patch. An excerpt from the data he shared:
1) Testing environment:
OS : CentOS Stream 8 (non-RT OS)
Kernel : v6.2
Machine : Intel Cascade Lake server (2 sockets, each with 44 logical threads)
Qemu args : -cpu host -enable-kvm, -smp 88,threads=2,sockets=2, …
2) OS boot time definition:
The time from the start of the kernel boot to the shell command line
prompt is shown from the console. [ Different people may have
different OS boot time definitions. ]
3) Measurement method (very rough method):
A timer in the kernel periodically prints the boot time every 100ms.
As soon as the shell command line prompt is shown from the console,
we record the boot time printed by the timer, then the printed boot
time is the OS boot time.
4) Measured OS boot time (in seconds)
a) Measured 10 times w/o this patch:
8.7s, 8.4s, 8.6s, 8.2s, 9.0s, 8.7s, 8.8s, 9.3s, 8.8s, 8.3s
The average OS boot time was: ~8.7s
b) Measured 10 times w/ this patch:
8.5s, 8.2s, 7.6s, 8.2s, 8.7s, 8.2s, 7.8s, 8.2s, 9.3s, 8.4s
The average OS boot time was: ~8.3s.
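The two averages can be rechecked from the listed samples with a short C sketch. The array and function names are illustrative only; the values are copied from the measurements above.

```c
#include <assert.h>
#include <stddef.h>

/* Boot times in seconds, copied from the measurements above. */
static const double without_patch[10] = {
	8.7, 8.4, 8.6, 8.2, 9.0, 8.7, 8.8, 9.3, 8.8, 8.3
};
static const double with_patch[10] = {
	8.5, 8.2, 7.6, 8.2, 8.7, 8.2, 7.8, 8.2, 9.3, 8.4
};

/* Arithmetic mean of n samples; n must be non-zero. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];

	return sum / (double)n;
}
```

The samples sum to 86.8s and 83.1s, so the means are roughly 8.68s and 8.31s, matching the rounded ~8.7s and ~8.3s quoted above.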
(CHROMIUM tag rationale: Submitted upstream but got lots of pushback as
it may harm a PREEMPT_RT system -- the concern is VERY theoretical and
this improves things for ChromeOS. Plus we are not a PREEMPT_RT system.
So I am strongly suggesting this mostly simple change for ChromeOS.)
Bug: 258241771
Bug: 268129466
Test: boot
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Change-Id: Ibd262189d7f92dbcc57f1508efe90fcfba95a6cc
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4350228
Commit-Queue: Joel Fernandes <joelaf@google.com>
Commit-Queue: Vineeth Pillai <vineethrp@google.com>
Tested-by: Vineeth Pillai <vineethrp@google.com>
Tested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
(cherry picked from commit 7968079ec77b320ee9d4115fe13048a8f7afbc02)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style. Prefix boot param with android_]
Signed-off-by: Qais Yousef <qyousef@google.com>
The need_offload_krc() function currently holds the krcp->lock in order
to safely check krcp->head. This commit removes the need for this lock
in that function by updating the krcp->head pointer using WRITE_ONCE()
macro so that readers can carry out lockless loads of that pointer.
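The pattern can be illustrated in userspace with C11 relaxed atomics standing in for WRITE_ONCE() and READ_ONCE(). This is an analogy under assumptions, not the kernel code: struct krc_model, krc_push() and krc_need_offload() are hypothetical names, and only the head pointer is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
};

/* Hypothetical stand-in for the per-CPU structure holding ->head. */
struct krc_model {
	_Atomic(struct node *) head;
};

/*
 * Writer side. The caller is assumed to hold the lock that serialises
 * updates; the relaxed atomic store plays the role of WRITE_ONCE(),
 * so concurrent lockless readers never observe a torn pointer.
 */
static void krc_push(struct krc_model *k, struct node *n)
{
	n->next = atomic_load_explicit(&k->head, memory_order_relaxed);
	atomic_store_explicit(&k->head, n, memory_order_relaxed);
}

/*
 * Reader side, no lock taken. The relaxed load plays the role of
 * READ_ONCE(); the result is only a hint that work may be pending.
 */
static bool krc_need_offload(struct krc_model *k)
{
	return atomic_load_explicit(&k->head, memory_order_relaxed) != NULL;
}
```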
Bug: 258241771
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 8fc5494ad5)
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Iddde5ec15e8574216abc95d8c64efa5c66868508
As per the comments in include/linux/shrinker.h, .count_objects callback
should return the number of freeable items, but if there are no objects
to free, SHRINK_EMPTY should be returned. The only time 0 is returned
should be when we are unable to determine the number of objects, or the
cache should be skipped for another reason.
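A minimal sketch of the return convention described above, written as plain userspace C. SHRINK_EMPTY_MODEL is defined locally and is assumed to mirror the kernel's SHRINK_EMPTY; count_objects_model() is a hypothetical name, not the function touched by this commit.

```c
#include <assert.h>

/* Local stand-in, assumed to mirror SHRINK_EMPTY in shrinker.h. */
#define SHRINK_EMPTY_MODEL (~0UL - 1)

/*
 * Model of a .count_objects callback: report the number of freeable
 * objects, or the "empty" sentinel when there is nothing to free.
 * Returning 0 is reserved for "count unknown" or "skip this cache".
 */
static unsigned long count_objects_model(unsigned long nr_freeable)
{
	return nr_freeable == 0 ? SHRINK_EMPTY_MODEL : nr_freeable;
}
```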
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 3826909635)
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: I5cb380fceaccc85971a47773d9058f0ea044c6dd
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4332178
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
(cherry picked from commit 3243f1e22bf915c9b805a96cc4a8cbc03ed5d7a8)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
Currently the monitor work is scheduled with a fixed interval of HZ/20,
which is roughly 50 milliseconds. The drawback of this approach is
low utilization of the 512 page slots in scenarios with infrequent
kvfree_rcu() calls. For example on an Android system:
<snip>
kworker/3:3-507 [003] .... 470.286305: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000d0f0dde5 nr_records=6
kworker/6:1-76 [006] .... 470.416613: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000ea0d6556 nr_records=1
kworker/6:1-76 [006] .... 470.416625: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x000000003e025849 nr_records=9
kworker/3:3-507 [003] .... 471.390000: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000815a8713 nr_records=48
kworker/1:1-73 [001] .... 471.725785: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000fda9bf20 nr_records=3
kworker/1:1-73 [001] .... 471.725833: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000a425b67b nr_records=76
kworker/0:4-1411 [000] .... 472.085673: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x000000007996be9d nr_records=1
kworker/0:4-1411 [000] .... 472.085728: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000d0f0dde5 nr_records=5
kworker/6:1-76 [006] .... 472.260340: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x0000000065630ee4 nr_records=102
<snip>
In many cases, out of 512 slots, fewer than 10 were actually used.
In order to improve batching and make utilization more efficient this
commit sets a drain interval to a fixed 5-second interval. Floods are
detected when a page fills quickly, and in that case, the reclaim work
is re-scheduled for the next scheduling-clock tick (jiffy).
After this change:
<snip>
kworker/7:1-371 [007] .... 5630.725708: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x000000005ab0ffb3 nr_records=121
kworker/7:1-371 [007] .... 5630.989702: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x0000000060c84761 nr_records=47
kworker/7:1-371 [007] .... 5630.989714: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x000000000babf308 nr_records=510
kworker/7:1-371 [007] .... 5631.553790: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000bb7bd0ef nr_records=169
kworker/7:1-371 [007] .... 5631.553808: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x0000000044c78753 nr_records=510
kworker/5:6-9428 [005] .... 5631.746102: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000d98519aa nr_records=123
kworker/4:7-9434 [004] .... 5632.001758: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x00000000526c9d44 nr_records=322
kworker/4:7-9434 [004] .... 5632.002073: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x000000002c6a8afa nr_records=185
kworker/7:1-371 [007] .... 5632.277515: rcu_invoke_kfree_bulk_callback: rcu_preempt bulk=0x000000007f4a962f nr_records=510
<snip>
Here, in all but one of the cases, more than one hundred slots were used,
representing an order-of-magnitude improvement.
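The delay selection described above can be sketched as a small user-space C
model. This is only an illustration under stated assumptions: the names
drain_delay(), HZ, DRAIN_JIFFIES and PAGE_SLOTS are hypothetical, the HZ
value is arbitrary, and "a page fills quickly" is approximated here as "the
page is full".

```c
/* Illustrative constants; assumed values, not taken from the kernel source. */
#define HZ            250          /* scheduling-clock ticks per second (assumed) */
#define DRAIN_JIFFIES (5 * HZ)     /* fixed 5-second drain interval */
#define PAGE_SLOTS    512          /* pointer slots per page, per the text above */

/* Pick the delay, in jiffies, before the reclaim work should run. */
static unsigned long drain_delay(unsigned int queued)
{
	/* A full page is treated as a flood: drain on the next tick. */
	return queued >= PAGE_SLOTS ? 1 : DRAIN_JIFFIES;
}
```

With these assumed values, a lightly used page waits the full five seconds
(1250 ticks), while a full page is drained one tick later.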
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 51824b780b)
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: I4635ba0dbece4e029d5271ef3950b8eaa1ae5e81
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4332177
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
(cherry picked from commit b1bf359877e084383be107bf0008d58d0a6b15e3)
[Conflict due to 71cf9c9835 adding a new
function in the same location.
Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
monitor_todo is not needed because the work struct already tracks
whether work is pending. Just rely on that by using the
schedule_delayed_work() helper.
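The idea can be illustrated with a user-space C model in which the work item
itself carries the pending state, so no separate flag is needed. The
struct and function names below are hypothetical stand-ins, not the
kernel's workqueue API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item; not the kernel's struct. */
struct work_model {
	bool pending;            /* set while the work is queued */
	unsigned long expires;   /* tick at which it should run */
};

/* Queue the work unless it is already pending; report whether it was queued. */
static bool schedule_work_model(struct work_model *w, unsigned long now,
				unsigned long delay)
{
	if (w->pending)
		return false;	/* already queued: nothing to do */
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/* Run the work if it is due, clearing the pending state. */
static bool run_work_model(struct work_model *w, unsigned long now)
{
	if (!w->pending || now < w->expires)
		return false;
	w->pending = false;
	return true;
}
```

A second schedule attempt while the work is pending is simply a no-op, which
is the property that makes an extra "todo" flag redundant in this model.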
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
(cherry picked from commit 82d26c36cc)
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: I4c13f89da735a628a5030ab55a13e338b97da4b8
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4332176
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
(cherry picked from commit bb867be28d6a70b36ff1d6563f794c489072ab7e)
[Minor conflict with 71cf9c9835 where it
added a new function in the same location.
Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing. This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.
This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.
However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.
And another call_rcu() instance that cannot be lazy is the one in the
scsi_eh_scmd_add() function. Leaving this instance lazy results in
unacceptably slow boot times.
Therefore, make scsi_eh_scmd_add() use call_rcu_hurry() in order to
revert to the old behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Bug: 258241771
Bug: 222463781
Test: CQ
Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Change-Id: I95bba865e582b0a12b1c09ba1f0bd4f897401c07
Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: <linux-scsi@vger.kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 54d87b0a0c)
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318056
Commit-Queue: Joel Fernandes <joelaf@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Tested-by: Joel Fernandes <joelaf@google.com>
(cherry picked from commit 5578f9ac27d25e3e57a5b9c4cf0346cfc5162994)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
Currently, the inetdev_destroy() function waits for an RCU grace period
before decrementing the refcount and freeing memory. This causes a delay
with a new RCU configuration that tries to save power, which results in the
network interface disappearing later than expected. The resulting delay
causes test failures on ChromeOS.
Refactor the code such that the refcount is dropped before the grace period
and memory is freed after. With this change, a ChromeOS network test that
does 'ip netns del' and polls for an interface disappearing now passes.
Bug: 258241771
Bug: 222463781
Test: CQ
Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Change-Id: I98b13c5a8fb9696c1111219d774cf91c8b14b4c5
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: <netdev@vger.kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 9d40c84cf5)
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318054
Tested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Commit-Queue: Joel Fernandes <joelaf@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
(cherry picked from commit 3c0f4bb182d6b0be5424947b53019e92bea8b38c)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
During suspend, we see failures to suspend 1 in 300-500 suspends.
Looking closer, it appears that asynchronous RCU callbacks are being
queued as lazy even though synchronous callbacks are expedited. These
delays appear not to be tolerated well by the suspend/resume code, as
evidenced by these occasional suspend failures.
This commit modifies call_rcu() to check if rcu_async_should_hurry(),
which will return true if we are in suspend or in-kernel boot.
[ paulmck: Alphabetize local variables. ]
Ignoring the lazy hint makes the 3000 suspend/resume cycles pass
reliably on a 12th gen 12-core Intel CPU, and there is some evidence
that it also slightly speeds up boot performance.
Bug: 258241771
Bug: 222463781
Test: CQ
Fixes: 3cb278e73b ("rcu: Make call_rcu() lazy to save power")
Change-Id: I4cfe6f43de8bae9a6c034831c79d9773199d6d29
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit cf7066b97e)
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318052
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Tested-by: Joel Fernandes <joelaf@google.com>
Commit-Queue: Joel Fernandes <joelaf@google.com>
(cherry picked from commit e59686da91b689d3771a09f3eae37db5f40d3f75)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
Boot and suspend/resume should not be slowed down in kernels built with
CONFIG_RCU_LAZY=y. In particular, suspend can sometimes fail in such
kernels.
This commit therefore adds rcu_async_hurry(), rcu_async_relax(), and
rcu_async_should_hurry() functions that track whether or not either
a boot or a suspend/resume operation is in progress. This will
enable a later commit to refrain from laziness during those times.
Export rcu_async_should_hurry(), rcu_async_hurry(), and rcu_async_relax()
for later use by rcutorture.
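One way such tracking could work is a nesting counter, sketched here as a
user-space C model. The names are hypothetical (the rcu_ prefix is dropped on
purpose), and both the counter-based design and the initial value of 1
(meaning "boot in progress") are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed to start at 1 so that callbacks are hurried until boot completes. */
static atomic_int async_hurry_nesting = 1;

/* Enter a phase (for example suspend/resume) where laziness is unwanted. */
static void async_hurry(void)
{
	atomic_fetch_add(&async_hurry_nesting, 1);
}

/* Leave such a phase; also called once when boot is done in this model. */
static void async_relax(void)
{
	atomic_fetch_sub(&async_hurry_nesting, 1);
}

/* True while any hurry phase is still in progress. */
static bool async_should_hurry(void)
{
	return atomic_load(&async_hurry_nesting) > 0;
}
```

A counter rather than a boolean lets overlapping phases nest: laziness
resumes only after every async_hurry() has been matched by an async_relax().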
[ paulmck: Apply feedback from Steve Rostedt. ]
Bug: 258241771
Bug: 222463781
Test: CQ
Fixes: 3cb278e73b ("rcu: Make call_rcu() lazy to save power")
Change-Id: Ieb2f2d484a33cfbd71f71c8e3dbcfc05cd7efe8c
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 6efdda8bec)
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318051
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Tested-by: Joel Fernandes <joelaf@google.com>
Commit-Queue: Joel Fernandes <joelaf@google.com>
(cherry picked from commit 8bc7efc64c84da753f2174a7071c8f1a7823d2bb)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
On ChromeOS, kernels built with the new CONFIG_RCU_LAZY=y Kconfig option
fail a networking test in the teardown phase.
This failure may be reproduced as follows: ip netns del <name>
The CONFIG_RCU_LAZY=y Kconfig option was introduced by earlier commits
in this series for the benefit of certain battery-powered systems.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing. This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.
This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.
However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.
Returning to the test failure, use of ftrace showed that this failure
was caused by the added delays due to this new lazy behavior of
call_rcu() in kernels built with CONFIG_RCU_LAZY=y.
Therefore, make dst_release() use call_rcu_hurry() in order to revert
to the old test-failure-free behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: Ifd64083bd210a9dfe94c179152f27d310c179507
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: <netdev@vger.kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 483c26ff63)
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318050
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
(cherry picked from commit e0886387489fed8a60e7e0f107b95fb9c0241930)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order to
batch them. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing. This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.
This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.
However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.
And another call_rcu() instance that cannot be lazy is the one on the
percpu refcounter's "per-CPU to atomic switch" code path, which
uses RCU when switching to atomic mode. The enqueued callback
wakes up waiters waiting in the percpu_ref_switch_waitq. Allowing
this callback to be lazy would result in unacceptable slowdowns for
users of per-CPU refcounts, such as blk_pre_runtime_suspend().
Therefore, make __percpu_ref_switch_to_atomic() use call_rcu_hurry()
in order to revert to the old behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: Icc325f69d0df1a37b6f1de02a284e1fabf20e366
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: <linux-mm@kvack.org>
(cherry picked from commit 343a72e5e3)
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318049
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Tested-by: Joel Fernandes <joelaf@google.com>
Commit-Queue: Joel Fernandes <joelaf@google.com>
(cherry picked from commit dfd536f499642cd18679cc64c79a8fb275137f45)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
call_rcu() changes to save power will slow down rcu sync. Use the
call_rcu_hurry() API instead which reverts to the old behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: I5123ba52f47676305dbcfa1233bf3b41f140766c
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 7651d6b250)
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318048
Reviewed-by: Sean Paul <sean@poorly.run>
Commit-Queue: Joel Fernandes <joelaf@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Tested-by: Joel Fernandes <joelaf@google.com>
(cherry picked from commit 183fce4e1bfbbae1266ec90c6bb871b51d7af81c)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
The shrinker is used to speed up the freeing of memory potentially held
by RCU lazy callbacks. RCU kernel module test cases show this to be
effective. The test is introduced in a later patch.
[Joel: register_shrinker() argument list change.]
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: I6a73a9dae79ff35feca37abe2663e55a0f46dda8
Signed-off-by: Vineeth Pillai <vineeth@bitbyteword.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit c945b4da7a)
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318046
Tested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Commit-Queue: Joel Fernandes <joelaf@google.com>
(cherry picked from commit 2cf50ca2e7c3bc08f5182fc517a89a65e8dca7e3)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
Implement timer-based RCU callback batching (also known as lazy
callbacks). With this we save about 5-10% of power consumed due
to RCU requests that happen when the system is lightly loaded or idle.
By default, all async callbacks (queued via call_rcu) are marked
lazy. An alternate API call_rcu_hurry() is provided for the few users,
for example synchronize_rcu(), that need the old behavior.
The batch is flushed whenever a certain amount of time has passed, or
the batch on a particular CPU grows too big. A future patch will also
flush it under memory pressure.
To handle several corner cases automagically (such as rcu_barrier() and
hotplug), we re-use bypass lists which were originally introduced to
address lock contention, to handle lazy CBs as well. The bypass list
length has the lazy CB length included in it. A separate lazy CB length
counter is also introduced to keep track of the number of lazy CBs.
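The batching rules above can be illustrated with a simplified user-space C
model. Everything here is a sketch under assumptions: the struct, the
function names, and the two limits are hypothetical, and the real code deals
with locking, timers and kthread wakeups that are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define LAZY_FLUSH_JIFFIES 10000UL  /* hypothetical timeout, in ticks */
#define LAZY_FLUSH_LIMIT   64       /* hypothetical per-CPU size limit */

struct cpu_queue {
	size_t bypass_len;         /* all bypassed callbacks, lazy ones included */
	size_t lazy_len;           /* how many of those are lazy */
	unsigned long first_jiffy; /* arrival time of the oldest queued callback */
	size_t flushed;            /* callbacks handed on for grace-period work */
};

static void flush(struct cpu_queue *q)
{
	q->flushed += q->bypass_len;
	q->bypass_len = 0;
	q->lazy_len = 0;
}

/* Queue one callback; returns true if this call triggered a flush. */
static bool enqueue(struct cpu_queue *q, unsigned long now, bool lazy)
{
	if (q->bypass_len == 0)
		q->first_jiffy = now;
	q->bypass_len++;
	if (lazy)
		q->lazy_len++;

	/* A hurried callback, or an oversized batch, flushes everything. */
	if (!lazy || q->bypass_len >= LAZY_FLUSH_LIMIT) {
		flush(q);
		return true;
	}
	return false;
}

/* Timer path: flush once the oldest callback has waited long enough. */
static bool timer_tick(struct cpu_queue *q, unsigned long now)
{
	if (q->bypass_len && now - q->first_jiffy >= LAZY_FLUSH_JIFFIES) {
		flush(q);
		return true;
	}
	return false;
}
```

In this model, lazy callbacks accumulate until the timeout or size limit is
reached, while a single non-lazy callback pulls every queued lazy callback
along with it.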
[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ]
[ paulmck: Apply Zqiang feedback. ]
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
[ joelaf: Small changes for 5.15 backport. ]
Suggested-by: Paul McKenney <paulmck@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Bug: 258241771
Bug: 222463781
Test: CQ
(cherry picked from commit 3cb278e73b https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master)
Change-Id: I557d5af2a5d317bd66e9ec55ed40822bb5c54390
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4318045
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Commit-Queue: Joel Fernandes <joelaf@google.com>
Tested-by: Joel Fernandes <joelaf@google.com>
(cherry picked from commit b30e520b9da88a5de115ed5b2c1b2aa89de9e214)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
When the bypass cblist gets too big or its timeout has occurred, it is
flushed into the main cblist. However, the bypass timer is still running
and the behavior is that it would eventually expire and wake the GP
thread.
Since we are going to use the bypass cblist for lazy CBs, do the wakeup
as soon as the flush for a "too big or too long" bypass list happens.
Otherwise, long delays can happen for callbacks which get promoted from
lazy to non-lazy.
This is a good thing to do anyway (regardless of future lazy patches),
since it makes the behavior consistent with behavior of other code paths
where flushing into the ->cblist quickly puts the GP kthread into a
non-sleeping state.
[ Frederic Weisbecker: Changes to avoid unnecessary GP-thread wakeups plus
comment changes. ]
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit b50606f35f)
Bug: 258241771
Bug: 222463781
Test: powerIdle lab tests.
Change-Id: If8da96d7ba6ed90a2a70f7d56f7bb03af44fd649
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4065239
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
(cherry picked from commit 75db04e1eed1756a4ee5fb87ef8dd494d19bf53f)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
In preparation for RCU lazy changes, wake up the RCU nocb gp thread if
needed after an entrain. This change prevents the RCU barrier callback
from waiting in the queue for several seconds before the lazy callbacks
in front of it are serviced.
Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit b8f7aca3f0 https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next)
(Backport:
Conflicts:
kernel/rcu/tree.c
Due to missing 'rcu: Rework rcu_barrier() and callback-migration logic'
Chose not to backport that.)
Bug: 258241771
Bug: 222463781
Test: CQ
Change-Id: Ib55c5886764b74df22531eca35f076ef7acc08dd
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4062165
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
(cherry picked from commit fc6e55ea65dca9cc52bda6081341f3fcc87f6ee7)
[Cherry picked from chromeos-5.15 tree. Minor tweaks to commit message
to match Android style]
Signed-off-by: Qais Yousef <qyousef@google.com>
[ Upstream commit 3cfc9ec039af60dbd8965ae085b2c2ccdcfbe1cc ]
Pablo reports a crash with large batches of elements with a
back-to-back add/remove pattern. Quoting Pablo:
add_elem("00000000") timeout 100 ms
...
add_elem("0000000X") timeout 100 ms
del_elem("0000000X") <---------------- delete one that was just added
...
add_elem("00005000") timeout 100 ms
1) nft_pipapo_remove() removes element 0000000X
Then, KASAN shows a splat.
Looking at the remove function there is a chance that we will drop a
rule that maps to a non-deactivated element.
Removal happens in two steps, first we do a lookup for key k and return the
to-be-removed element and mark it as inactive in the next generation.
Then, in a second step, the element gets removed from the set/map.
The _remove function does not work correctly if we have more than one
element that share the same key.
This can happen if we insert an element into a set when the set already
holds an element with same key, but the element mapping to the existing
key has timed out or is not active in the next generation.
In such a case it's possible that removal will unmap the wrong element.
If this happens, we will leak the non-deactivated element, which becomes
unreachable.
The element that got deactivated (and will be freed later) will
remain reachable in the set data structure, this can result in
a crash when such an element is retrieved during lookup (stale
pointer).
Add a check that the fully matching key does in fact map to the element
that we have marked as inactive in the deactivation step.
If not, we need to continue searching.
Add a bug/warn trap at the end of the function as well, the remove
function must not ever be called with an invisible/unreachable/non-existent
element.
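The added check can be illustrated with a simplified user-space C model in
which several rules may carry the same key. The data layout and names are
hypothetical and much simpler than the real set implementation; the point is
only that a key match alone is not enough and the search must continue until
the rule maps to the element that was deactivated.

```c
#include <stdbool.h>
#include <stddef.h>

struct elem { int id; };

struct rule {
	unsigned int key;
	struct elem *e;   /* element this rule maps to */
	bool used;
};

/*
 * Unmap the rule for 'key' that maps to 'victim', skipping rules with the
 * same key that map to a different element. Returns false if no such rule
 * exists, which a caller should treat as a bug.
 */
static bool remove_rule(struct rule *rules, size_t n, unsigned int key,
			const struct elem *victim)
{
	for (size_t i = 0; i < n; i++) {
		if (!rules[i].used || rules[i].key != key)
			continue;
		if (rules[i].e != victim)
			continue;	/* same key, wrong element: keep searching */
		rules[i].used = false;
		return true;
	}
	return false;
}
```

Without the pointer comparison, the first rule with a matching key would be
unmapped, which in this model is exactly the "wrong element" case described
above.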
v2: avoid unneeded temporary variable (Stefano)
Bug: 336735501
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit ebf7c9746f)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ic9a48ac9ac0f9960fea9e066d9a0a9fb93f7b633
Export css_task_iter_start(), css_task_iter_next(), and
css_task_iter_end() in order to support task iteration in a cgroup in
vendor modules.
Bug: 336967294
Change-Id: Id93963ddd30ab02c7a4d5086f19d15310e4eda14
Signed-off-by: seanwang1 <seanwang1@lenovo.com>
This reverts commit 4d3b2bd995 which is
commit b5f0de6df6 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed. It's not a "real" break, and
we can work around it, but this really does not affect Android systems,
so it's safe to drop for now.
Bug: 161946584
Change-Id: Id2666dca715b44594f71e291a4c01e5b5a0e88d9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 97eaa2955d which is
commit a7d6027790acea24446ddd6632d394096c0f4667 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I05947b1018c5e28cdcb891edddf72163a2a0666a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit ef982fc410 which is
commit 1c9be13846c0b2abc2480602f8ef421360e1ad9e upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed. It's not a "real" break, and
we can work around it, but this really does not affect Android systems,
so it's safe to drop for now.
Bug: 161946584
Change-Id: Ica8f15560c09d1077c4177fb7710c5a24a563360
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 256c3e6192 which is
commit b787a3e781759026a6212736ef8e52cf83d1821a upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed. It's not a "real" break, and
we can work around it, but this really does not affect Android systems,
so it's safe to drop for now.
Bug: 161946584
Change-Id: I46a8368cbf844a05ee18cfdfa33b1b8f50b529ef
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>