To enable out of tree drivers that are based on top of this
functionality.
Bug: 316903397
Change-Id: I96bd84b805b984ebbc3fe0ac4badcd62bb00418b
Signed-off-by: Qais Yousef <qyousef@google.com>
This config will cause the 'CtsHdmiCecHostTestCases' test case to fail.
According to the discussion in bug 309377116, it is recommended to remove this config.
Bug: 322143898
Change-Id: Ied37a6c55f4198dbb9dbb9b6c3156a8a7a0bd945
Signed-off-by: James Tai <james.tai@realtek.com>
There is a 120ms delay implemented for allowing the XHCI host controller to
detect a U3 wakeup pulse. The intention is to wait for the device to retry
the wakeup event if the USB3 PORTSC doesn't reflect the RESUME link status
by the time it is checked. As per the USB3 specification:
tU3WakeupRetryDelay ("Table 7-12. LTSSM State Transition Timeouts")
This would allow the XHCI resume sequence to determine if the root hub
needs to be also resumed. However, in case there is no device connected,
or if there is only a HSUSB device connected, this delay would still affect
the overall resume timing.
Since this delay is solely for detecting U3 wake events (USB3 specific)
then ignore this delay for the disconnected case and the HSUSB connected
only case.
[skip helper function, rename usb3_connected variable -Mathias ]
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20231019102924.2797346-20-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 200589374
(cherry picked from commit 6add6dd345cb754ce18ff992c7264cabf31e59f6 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-next)
[wcheng: removed the need to check for resume type]
Change-Id: I242a426ab0de40fd77705aaef57d228b8721d701
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Adding the following symbols:
- gpiod_set_debounce
Bug: 316820336
Change-Id: I5f89b5ac4f52a05d1e29e3ff90abf6506551ef23
Signed-off-by: Jacky Liu <qsliu@google.com>
It was pointed out that since commit b6115e140102 ("ANDROID:
uid_sys_stat: split the global lock uid_lock to the fine-grained locks
for each hlist in hash_table") taking a spin_lock in uid_lock()
causes a scheduling while atomic error if CONFIG_UID_SYS_STATS_DEBUG
is enabled, as get_full_task_comm() takes the mmap_write_lock()
which is a semaphore, breaking the proper ordering.
In the GKI CONFIG_UID_SYS_STATS_DEBUG is disabled, so this went
unnoticed.
The uid_sys_stats logic isn't ever going to go upstream (it depends
on reverting upstream logic) and will hopefully be replaced eventually.
So there's not much reason to drag around this debug logic that is
unused.
So drop it. Less code to schlep forward.
Bug: 320184870
Change-Id: I2cfce79d5a25a3eba11a5509444c07b4642ef2de
Signed-off-by: John Stultz <jstultz@google.com>
When handling error status from uvcg_video_usb_req_queue,
uvc_video_complete currently calls uvcg_queue_cancel with
video->req_lock held. uvcg_queue_cancel internally locks
queue->irqlock, which nests queue->irqlock inside
video->req_lock. This isn't a functional bug at the
moment, but does open up possibilities for ABBA
deadlocks in the future.
This patch fixes the accidental nesting by dropping
video->req_lock before calling uvcg_queue_cancel.
Fixes: 6acba0345b68 ("usb:gadget:uvc Do not use worker thread to pump isoc usb requests")
Signed-off-by: Avichal Rakesh <arakesh@google.com>
Link: https://lore.kernel.org/r/20240104215009.2252452-2-arakesh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 314338409
(cherry picked from commit 9866dc4314c6c858e451933f965d64532aec00a9)
Change-Id: If25fba6661d55cd972d76068750f3b445c8360aa
Signed-off-by: Avichal Rakesh <arakesh@google.com>
Amit Pundir at Linaro reported seeing crashes in uid_sys_stats
driver when building with GCC.
Looking into it, it seems the uid_entry_tmp value is used
while only partially initialized, causing potential out of bound
access on the uid_entry io arrays.
This likely has gone unnoticed with clang as I believe we're
using the zero initialization for stack variables security
feature.
So change the logic to fully initialize the uid_entry_tmp
value.
Fixes: f68d4f3c3b53 ("ANDROID: uid_sys_stat: instead update_io_stats_uid_locked to update_io_stats_uid")
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: John Stultz <jstultz@google.com>
Change-Id: I78de245e80ef60aabec78a615c7ba582ab5a2242
There is a path that may lead to freed memory being referenced,
causing kernel panics.
The kernel panic has the following stack trace:
Workqueue: uvcgadget uvcg_video_pump.c51fb85fece46625450f86adbf92c56c.cfi_jt
pstate: 60c00085 (nZCv daIf +PAN +UAO -TCO BTYPE=--)
pc : __list_del_entry_valid+0xc0/0xd4
lr : __list_del_entry_valid+0xc0/0xd4
Call trace:
__list_del_entry_valid+0xc0/0xd4
uvc_video_free_request+0x60/0x98
uvcg_video_pump+0x1cc/0x204
process_one_work+0x21c/0x4b8
worker_thread+0x29c/0x574
kthread+0x158/0x1b0
ret_from_fork+0x10/0x30
The root cause is that uvcg_video_usb_req_queue frees the uvc_request
if is_enabled is false and returns an error status. video_pump also
frees the associated request if uvcg_video_usb_req_queue returns an
error status, leading to double free and accessing garbage memory.
To fix the issue, this patch removes freeing logic from
uvcg_video_usb_req_queue, and lets the callers to the function handle
queueing errors as they see fit.
Fixes: 6acba0345b68 ("usb:gadget:uvc Do not use worker thread to pump isoc usb requests")
Tested-by: Avichal Rakesh <arakesh@google.com>
Signed-off-by: Avichal Rakesh <arakesh@google.com>
Link: https://lore.kernel.org/r/20240104215009.2252452-1-arakesh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 314338409
(cherry picked from commit fe814b5b0f3042f1a583734497e726ee53783cc1)
Change-Id: Id13dea3a37e37a79cff3719ced449f0d1902ebd6
Signed-off-by: Avichal Rakesh <arakesh@google.com>
Add symbol list for Nothing at the first time
2 function symbol(s) added
'struct file_system_type* get_fs_type(const char*)'
'void iterate_supers_type(struct file_system_type*, void(*)(struct super_block*, void*), void*)'
Bug: 321604034
Change-Id: I3cdf16cf21bf04df2c0ab10358e7e7da4e99ccd3
Signed-off-by: Dylan Chang <dylan.chang@nothing.tech>
It is still disabled by default. Must specify
rcutree.android_enable_rcu_lazy and rcu_nocbs=all in boot time parameter
to actually enable it.
Bug: 258241771
Change-Id: Ic9e15b846d58ffa3d5dd81842c568da79352ff2d
Signed-off-by: Qais Yousef <qyousef@google.com>
fuse_lseek_backing was returning the offset as an int, which would then
be treated as an ERR if in the range 4G-4096 and 4G.
Although the call would appear to work correctly, the file position
would be incorrect according to a subsequent fseek with SEEK_CUR.
Based on a change by chenyuwen <chenyuwen1@meizu.com> who found and
fixed this issue.
Bug: 319219307
Change-Id: I3aef5fb22751a72ce2bd7674ee081956a89fc752
Signed-off-by: chenyuwen <chenyuwen1@meizu.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
It is reported that in low-memory situations the system-wide resume core
code deadlocks, because async_schedule_dev() executes its argument
function synchronously if it cannot allocate memory (and not only in
that case) and that function attempts to acquire a mutex that is already
held. Executing the argument function synchronously from within
dpm_async_fn() may also be problematic for ordering reasons (it may
cause a consumer device's resume callback to be invoked before a
requisite supplier device's one, for example).
Address this by changing the code in question to use
async_schedule_dev_nocall() for scheduling the asynchronous
execution of device suspend and resume functions and to directly
run them synchronously if async_schedule_dev_nocall() returns false.
Link: https://lore.kernel.org/linux-pm/ZYvjiqX6EsL15moe@perf/
Reported-by: Youngmin Nam <youngmin.nam@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: 5.7+ <stable@vger.kernel.org> # 5.7+: 6aa09a5bccd8 async: Split async_schedule_node_domain()
Cc: 5.7+ <stable@vger.kernel.org> # 5.7+: 7d4b5d7a37bd async: Introduce async_schedule_dev_nocall()
Cc: 5.7+ <stable@vger.kernel.org> # 5.7+
Bug: 319759660
Change-Id: I1164a6a0b9899ab2f01d5efb413827b9d0983d98
(cherry picked from commit 7839d0078e0d5e6cc2fa0b0dfbee71de74f1e557)
[Youngmin: Resolved minor conflict in drivers/base/power/main.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
In preparation for subsequent changes, introduce a specialized variant
of async_schedule_dev() that will not invoke the argument function
synchronously when it cannot be scheduled for asynchronous execution.
The new function, async_schedule_dev_nocall(), will be used for fixing
possible deadlocks in the system-wide power management core code.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> for the series.
Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Bug: 319759660
Change-Id: I497f1a9655d80c2d9710c3c814f6a99a31bcf019
(cherry picked from commit 7d4b5d7a37bdd63a5a3371b988744b060d5bb86f)
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
In preparation for subsequent changes, split async_schedule_node_domain()
in two pieces so as to allow the bottom part of it to be called from a
somewhat different code path.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Bug: 319759660
Change-Id: I6405b388d9a0286208b48f7a321b0042d85abb4b
(cherry picked from commit 6aa09a5bccd8e224d917afdb4c278fc66aacde4d)
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
The current implementation of the mark_victim tracepoint provides only the
process ID (pid) of the victim process. This limitation poses challenges
for userspace tools that need additional information about the OOM victim.
The association between pid and the additional data may be lost after the
kill, making it difficult for userspace to correlate the OOM event with
the specific process.
In order to mitigate this limitation, add the following fields:
- UID
In Android each installed application has a unique UID. Including
the `uid` assists in correlating OOM events with specific apps.
- Process Name (comm)
Enables identification of the affected process.
- OOM Score
Allows userspace to get additional insights of the relative kill
priority of the OOM victim.
Link: https://lkml.kernel.org/r/20240111210539.636607-1-carlosgalo@google.com
Change-Id: Icc3ed013a9dfff9bb09f1d7588757e6028c17069
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 649ffb4cbb90a7f60f17dd74e57d814e762ea01d mm-unstable)
[ carlosgalo: Manually added struct cred change in mark_oom_victim function ]
Bug: 315560026
Change-Id: I81fb6f3447f432100ad4cd25e22db23768003388
Signed-off-by: Carlos Galo <carlosgalo@google.com>
It is still disabled by default. Must specify
rcutree.android_enable_rcu_lazy and rcu_nocbs=all in boot time parameter
to actually enable it.
Bug: 258241771
Change-Id: I11c920aa5edde2fc42ab54245cd198eb8cb47616
Signed-off-by: Qais Yousef <qyousef@google.com>
To allow more flexible arrangements while still provide a single kernel
for distros, provide a boot time parameter to enable/disable lazy RCU.
Specify:
rcutree.enable_rcu_lazy=[y|1|n|0]
Which also requires
rcu_nocbs=all
at boot time to enable/disable lazy RCU.
To disable it by default at build time when CONFIG_RCU_LAZY=y, the new
CONFIG_RCU_LAZY_DEFAULT_OFF can be used.
Bug: 258241771
Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/lkml/20231203011252.233748-1-qyousef@layalina.io/
[Fix trivial conflicts rejecting newer code that doesn't exist on 6.1]
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ib5585ae717a2ba7749f2802101b785c4e5de8a90
On many systems, a great deal of boot (in userspace) happens after the
kernel thinks the boot has completed. It is difficult to determine if
the system has really booted from the kernel side. Some features like
lazy-RCU can risk slowing down boot time if, say, a callback has been
added that the boot synchronously depends on. Further expedited callbacks
can get unexpedited way earlier than it should be, thus slowing down
boot (as shown in the data below).
For these reasons, this commit adds a config option
'CONFIG_RCU_BOOT_END_DELAY' and a boot parameter rcupdate.boot_end_delay.
Userspace can also make RCU's view of the system as booted, by writing the
time in milliseconds to: /sys/module/rcupdate/parameters/rcu_boot_end_delay
Or even just writing a value of 0 to this sysfs node.
However, under no circumstance will the boot be allowed to end earlier
than just before init is launched.
The default value of CONFIG_RCU_BOOT_END_DELAY is chosen as 15s. This
suites ChromeOS and also a PREEMPT_RT system below very well, which need
no config or parameter changes, and just a simple application of this patch. A
system designer can also choose a specific value here to keep RCU from marking
boot completion. As noted earlier, RCU's perspective of the system as booted
will not be marker until at least rcu_boot_end_delay milliseconds have passed
or an update is made via writing a small value (or 0) in milliseconds to:
/sys/module/rcupdate/parameters/rcu_boot_end_delay.
One side-effect of this patch is, there is a risk that a real-time workload
launched just after the kernel boots will suffer interruptions due to expedited
RCU, which previous ended just before init was launched. However, to mitigate
such an issue (however unlikely), the user should either tune
CONFIG_RCU_BOOT_END_DELAY to a smaller value than 15 seconds or write a value
of 0 to /sys/module/rcupdate/parameters/rcu_boot_end_delay, once userspace
boots, and before launching the real-time workload.
Qiuxu also noted impressive boot-time improvements with earlier version
of patch. An excerpt from the data he shared:
1) Testing environment:
OS : CentOS Stream 8 (non-RT OS)
Kernel : v6.2
Machine : Intel Cascade Lake server (2 sockets, each with 44 logical threads)
Qemu args : -cpu host -enable-kvm, -smp 88,threads=2,sockets=2, …
2) OS boot time definition:
The time from the start of the kernel boot to the shell command line
prompt is shown from the console. [ Different people may have
different OS boot time definitions. ]
3) Measurement method (very rough method):
A timer in the kernel periodically prints the boot time every 100ms.
As soon as the shell command line prompt is shown from the console,
we record the boot time printed by the timer, then the printed boot
time is the OS boot time.
4) Measured OS boot time (in seconds)
a) Measured 10 times w/o this patch:
8.7s, 8.4s, 8.6s, 8.2s, 9.0s, 8.7s, 8.8s, 9.3s, 8.8s, 8.3s
The average OS boot time was: ~8.7s
b) Measure 10 times w/ this patch:
8.5s, 8.2s, 7.6s, 8.2s, 8.7s, 8.2s, 7.8s, 8.2s, 9.3s, 8.4s
The average OS boot time was: ~8.3s.
(CHROMIUM tag rationale: Submitted upstream but got lots of pushback as
it may harm a PREEMPT_RT system -- the concern is VERY theoretical and
this improves things for ChromeOS. Plus we are not a PREEMPT_RT system.
So I am strongly suggesting this mostly simple change for ChromeOS.)
Bug: 258241771
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4350228
Commit-Queue: Joel Fernandes <joelaf@google.com>
Commit-Queue: Vineeth Pillai <vineethrp@google.com>
Tested-by: Vineeth Pillai <vineethrp@google.com>
Tested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909180
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ibd262189d7f92dbcc57f1508efe90fcfba95a6cc
During suspend, we see failures to suspend 1 in 300-500 suspends.
Looking closer, it appears that asynchronous RCU callbacks are being
queued as lazy even though synchronous callbacks are expedited. These
delays appear to not be very welcome by the suspend/resume code as
evidenced by these occasional suspend failures.
This commit modifies call_rcu() to check if rcu_async_should_hurry(),
which will return true if we are in suspend or in-kernel boot.
[ paulmck: Alphabetize local variables. ]
Ignoring the lazy hint makes the 3000 suspend/resume cycles pass
reliably on a 12th gen 12-core Intel CPU, and there is some evidence
that it also slightly speeds up boot performance.
Fixes: 3cb278e73b ("rcu: Make call_rcu() lazy to save power")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit cf7066b97e)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909179
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I4cfe6f43de8bae9a6c034831c79d9773199d6d29
Boot and suspend/resume should not be slowed down in kernels built with
CONFIG_RCU_LAZY=y. In particular, suspend can sometimes fail in such
kernels.
This commit therefore adds rcu_async_hurry(), rcu_async_relax(), and
rcu_async_should_hurry() functions that track whether or not either
a boot or a suspend/resume operation is in progress. This will
enable a later commit to refrain from laziness during those times.
Export rcu_async_should_hurry(), rcu_async_hurry(), and rcu_async_relax()
for later use by rcutorture.
[ paulmck: Apply feedback from Steve Rostedt. ]
Fixes: 3cb278e73b ("rcu: Make call_rcu() lazy to save power")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 6efdda8bec)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909178
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ieb2f2d484a33cfbd71f71c8e3dbcfc05cd7efe8c
In a networking test on ChromeOS, kernels built with the new
CONFIG_RCU_LAZY=y Kconfig option fail a networking test in the teardown
phase.
This failure may be reproduced as follows: ip netns del <name>
The CONFIG_RCU_LAZY=y Kconfig option was introduced by earlier commits
in this series for the benefit of certain battery-powered systems.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing. This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.
This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.
However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.
Returning to the test failure, use of ftrace showed that this failure
cause caused by the aadded delays due to this new lazy behavior of
call_rcu() in kernels built with CONFIG_RCU_LAZY=y.
Therefore, make dst_release() use call_rcu_hurry() in order to revert
to the old test-failure-free behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: <netdev@vger.kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 483c26ff63)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909041
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ifd64083bd210a9dfe94c179152f27d310c179507
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing. This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.
This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.
However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.
And another call_rcu() instance that cannot be lazy is the one
in queue_rcu_work(), given that callers to queue_rcu_work() are
not necessarily OK with long delays.
Therefore, make queue_rcu_work() use call_rcu_hurry() in order to revert
to the old behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit a7e30c0e9a)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909040
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I1dd4cedd1fb02626fa47f88a7fbaa7cacfa95d11
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order to
batch callbacks. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing. This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.
This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.
However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.
And another call_rcu() instance that cannot be lazy is the one on the
percpu refcounter's "per-CPU to atomic switch" code path, which
uses RCU when switching to atomic mode. The enqueued callback
wakes up waiters waiting in the percpu_ref_switch_waitq. Allowing
this callback to be lazy would result in unacceptable slowdowns for
users of per-CPU refcounts, such as blk_pre_runtime_suspend().
Therefore, make __percpu_ref_switch_to_atomic() use call_rcu_hurry()
in order to revert to the old behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: <linux-mm@kvack.org>
(cherry picked from commit 343a72e5e3)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909039
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Icc325f69d0df1a37b6f1de02a284e1fabf20e366
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing. This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.
This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.
However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.
And another call_rcu() instance that cannot be lazy is the one in the
scsi_eh_scmd_add() function. Leaving this instance lazy results in
unacceptably slow boot times.
Therefore, make scsi_eh_scmd_add() use call_rcu_hurry() in order to
revert to the old behavior.
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: <linux-scsi@vger.kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 54d87b0a0c)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909036
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I95bba865e582b0a12b1c09ba1f0bd4f897401c07
Implement timer-based RCU callback batching (also known as lazy
callbacks). With this we save about 5-10% of power consumed due
to RCU requests that happen when system is lightly loaded or idle.
By default, all async callbacks (queued via call_rcu) are marked
lazy. An alternate API call_rcu_hurry() is provided for the few users,
for example synchronize_rcu(), that need the old behavior.
The batch is flushed whenever a certain amount of time has passed, or
the batch on a particular CPU grows too big. Also memory pressure will
flush it in a future patch.
To handle several corner cases automagically (such as rcu_barrier() and
hotplug), we re-use bypass lists which were originally introduced to
address lock contention, to handle lazy CBs as well. The bypass list
length has the lazy CB length included in it. A separate lazy CB length
counter is also introduced to keep track of the number of lazy CBs.
[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ]
[ paulmck: Apply Zqiang feedback. ]
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Suggested-by: Paul McKenney <paulmck@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 3cb278e73b)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909030
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I557d5af2a5d317bd66e9ec55ed40822bb5c54390
When the bypass cblist gets too big or its timeout has occurred, it is
flushed into the main cblist. However, the bypass timer is still running
and the behavior is that it would eventually expire and wake the GP
thread.
Since we are going to use the bypass cblist for lazy CBs, do the wakeup
soon as the flush for "too big or too long" bypass list happens.
Otherwise, long delays can happen for callbacks which get promoted from
lazy to non-lazy.
This is a good thing to do anyway (regardless of future lazy patches),
since it makes the behavior consistent with behavior of other code paths
where flushing into the ->cblist makes the GP kthread into a
non-sleeping state quickly.
[ Frederic Weisbecker: Changes to avoid unnecessary GP-thread wakeups plus
comment changes. ]
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit b50606f35f)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909028
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: If8da96d7ba6ed90a2a70f7d56f7bb03af44fd649
This patch adds GKI symbol list for Exynosauto Soc.
We need to add 1 function(flush_signals) symbol to send buffer
to other domains.
1 function symbol(s) added
'void flush_signals(struct task_struct*)'
Bug: 320368458
Signed-off-by: Daehyun Seo <dae.seo@samsung.com>
Change-Id: I66a9264b70dc24f30029b413077363996b3339cd
[ Upstream commit 3701cd390fd731ee7ae8b8006246c8db82c72bea ]
If dynset expressions provided by userspace is larger than the declared
set expressions, then bail out.
Bug: 316085841
Fixes: 48b0ae046e ("netfilter: nftables: netlink support for several set element expressions")
Reported-by: Xingyuan Mo <hdthky0@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit cf5f113c41)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I4bd3f7e9148d4bc12bbc67ecdd605c2957eb8010
Add hooks to support oem's binder feature of improving certain
scenarios sched priority by moving these scenarios' work to
a fixed binder thread.
Add the following new vendor hooks to drivers/android/binder.c:
1 trace_android_vh_binder_spawn_new_thread
in our os, some binder_transaction will be marked as vip flag,
it can be named vip transaction, the binder_work within the vip
transaction can be named vip work.
here will force a thread (named vip thread) to be spawned and
skip the normal conditions to spawn a thread. vip thread will
just select vip transaction to process.
2 trace_android_vh_binder_ioctl_end
in our os, in binder_proc, about binder threads,special thread
(called vip thread) will work for special binder_transaction
(called vip transaction).
here will expand one ioctl cmd for binder driver, it will
set the max count about special thread (called vip thread) in
binder_proc, and if it has vip thread in the binder_proc.
3 trace_android_vh_binder_looper_exited
while BC_REGISTER_LOOPER cmd, will set special thread as vip
thread. the flag saved in binder_thread:looper
here will unset the vip flag saved in binder_thread,
while BC_EXIT_LOOPER cmd, if the thread is vip thread(reference
above about vip thread)
4 trace_android_vh_binder_has_special_work_ilocked
for special binder thread(called vip thread), it will deal with
special binder_work (called vip work) within special transaction
(call vip transaction),
here, if the thread is vip thread, it will check the vip work if
exist or not.
5 trace_android_vh_binder_select_special_worklist
for special binder thread(called vip thread), it will select the
worklist for special binder_work(called vip work) within special
binder_transaction(called vip transaction)
here, it will make sure the selected worklist, the head of it is
vip work within vip transaction
Bug: 318782978
Change-Id: I8e544d9be2644a6144a9cfbd477e087d46b0073f
Signed-off-by: songfeng <songfeng@oppo.com>