Mirror of https://github.com/hardkernel/linux.git (synced 2026-06-06 19:08:57 +09:00)
Merge 1e57930e9f ("Merge tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu") into android-mainline
Steps on the way to 5.19-rc1

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0b3c706909f2309bcfe1d4748e6bcb67df0337a0
51
Documentation/ABI/testing/securityfs-secrets-coco
Normal file
@@ -0,0 +1,51 @@
+What: security/secrets/coco
+Date: February 2022
+Contact: Dov Murik <dovmurik@linux.ibm.com>
+Description:
+Exposes confidential computing (coco) EFI secrets to
+userspace via securityfs.
+
+EFI can declare memory area used by confidential computing
+platforms (such as AMD SEV and SEV-ES) for secret injection by
+the Guest Owner during VM's launch. The secrets are encrypted
+by the Guest Owner and decrypted inside the trusted enclave,
+and therefore are not readable by the untrusted host.
+
+The efi_secret module exposes the secrets to userspace. Each
+secret appears as a file under <securityfs>/secrets/coco,
+where the filename is the GUID of the entry in the secrets
+table. This module is loaded automatically by the EFI driver
+if the EFI secret area is populated.
+
+Two operations are supported for the files: read and unlink.
+Reading the file returns the content of secret entry.
+Unlinking the file overwrites the secret data with zeroes and
+removes the entry from the filesystem. A secret cannot be read
+after it has been unlinked.
+
+For example, listing the available secrets::
+
+# modprobe efi_secret
+# ls -l /sys/kernel/security/secrets/coco
+-r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b
+-r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6
+-r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2
+-r--r----- 1 root root 0 Jun 28 11:54 e6f5a162-d67f-4750-a67c-5d065f2a9910
+
+Reading the secret data by reading a file::
+
+# cat /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910
+the-content-of-the-secret-data
+
+Wiping a secret by unlinking a file::
+
+# rm /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910
+# ls -l /sys/kernel/security/secrets/coco
+-r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b
+-r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6
+-r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2
+
+Note: The binary format of the secrets table injected by the
+Guest Owner is described in
+drivers/virt/coco/efi_secret/efi_secret.c under "Structure of
+the EFI secret area".
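The read and unlink operations described in the ABI text above can also be driven from a userspace program. The following is a minimal sketch, assuming securityfs is mounted at /sys/kernel/security as in the shell examples; the helper names `list_secrets`, `read_secret`, and `wipe_secret` are hypothetical and not part of any kernel or library API.

```python
from pathlib import Path

# Assumed mount point, taken from the shell examples in the ABI text.
COCO_DIR = Path("/sys/kernel/security/secrets/coco")


def list_secrets(directory=COCO_DIR):
    """Return the sorted file names (entry GUIDs) found in the directory."""
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


def read_secret(guid, directory=COCO_DIR):
    """Return the raw bytes of one secret entry."""
    return (Path(directory) / guid).read_bytes()


def wipe_secret(guid, directory=COCO_DIR):
    """Unlink the entry. On the real interface the ABI text says this also
    overwrites the secret data with zeroes; here it is a plain unlink."""
    (Path(directory) / guid).unlink()
```

The `directory` parameter exists only so the helpers can be exercised against an ordinary directory on a machine that has no EFI secret area.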
@@ -973,7 +973,7 @@ The ``->dynticks`` field counts the corresponding CPU's transitions to
 and from either dyntick-idle or user mode, so that this counter has an
 even value when the CPU is in dyntick-idle mode or user mode and an odd
 value otherwise. The transitions to/from user mode need to be counted
-for user mode adaptive-ticks support (see timers/NO_HZ.txt).
+for user mode adaptive-ticks support (see Documentation/timers/no_hz.rst).
 
 The ``->rcu_need_heavy_qs`` field is used to record the fact that the
 RCU core code would really like to see a quiescent state from the
@@ -406,7 +406,7 @@ In earlier implementations, the task requesting the expedited grace
 period also drove it to completion. This straightforward approach had
 the disadvantage of needing to account for POSIX signals sent to user
 tasks, so more recent implemementations use the Linux kernel's
-`workqueues <https://www.kernel.org/doc/Documentation/core-api/workqueue.rst>`__.
+workqueues (see Documentation/core-api/workqueue.rst).
 
 The requesting task still does counter snapshotting and funnel-lock
 processing, but the task reaching the top of the funnel lock does a
@@ -370,8 +370,8 @@ pointer fetched by rcu_dereference() may not be used outside of the
 outermost RCU read-side critical section containing that
 rcu_dereference(), unless protection of the corresponding data
 element has been passed from RCU to some other synchronization
-mechanism, most commonly locking or `reference
-counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__.
+mechanism, most commonly locking or reference counting
+(see ../../rcuref.rst).
 
 .. |high-quality implementation of C11 memory_order_consume [PDF]| replace:: high-quality implementation of C11 ``memory_order_consume`` [PDF]
 .. _high-quality implementation of C11 memory_order_consume [PDF]: http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf
@@ -2654,6 +2654,38 @@ synchronize_rcu(), and rcu_barrier(), respectively. In
 three APIs are therefore implemented by separate functions that check
 for voluntary context switches.
 
+Tasks Rude RCU
+~~~~~~~~~~~~~~
+
+Some forms of tracing need to wait for all preemption-disabled regions
+of code running on any online CPU, including those executed when RCU is
+not watching. This means that synchronize_rcu() is insufficient, and
+Tasks Rude RCU must be used instead. This flavor of RCU does its work by
+forcing a workqueue to be scheduled on each online CPU, hence the "Rude"
+moniker. And this operation is considered to be quite rude by real-time
+workloads that don't want their ``nohz_full`` CPUs receiving IPIs and
+by battery-powered systems that don't want their idle CPUs to be awakened.
+
+The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
+consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(),
+and rcu_barrier_tasks_rude().
+
+Tasks Trace RCU
+~~~~~~~~~~~~~~~
+
+Some forms of tracing need to sleep in readers, but cannot tolerate
+SRCU's read-side overhead, which includes a full memory barrier in both
+srcu_read_lock() and srcu_read_unlock(). This need is handled by a
+Tasks Trace RCU that uses scheduler locking and IPIs to synchronize with
+readers. Real-time systems that cannot tolerate IPIs may build their
+kernels with ``CONFIG_TASKS_TRACE_RCU_READ_MB=y``, which avoids the IPIs at
+the expense of adding full memory barriers to the read-side primitives.
+
+The tasks-trace-RCU API is also reasonably compact,
+consisting of rcu_read_lock_trace(), rcu_read_unlock_trace(),
+rcu_read_lock_trace_held(), call_rcu_tasks_trace(),
+synchronize_rcu_tasks_trace(), and rcu_barrier_tasks_trace().
+
 Possible Future Changes
 -----------------------
 
@@ -33,8 +33,8 @@ Situation 1: Hash Tables
 
 Hash tables are often implemented as an array, where each array entry
 has a linked-list hash chain. Each hash chain can be protected by RCU
-as described in the listRCU.txt document. This approach also applies
-to other array-of-list situations, such as radix trees.
+as described in listRCU.rst. This approach also applies to other
+array-of-list situations, such as radix trees.
 
 .. _static_arrays:
 
@@ -140,8 +140,7 @@ over a rather long period of time, but improvements are always welcome!
 prevents destructive compiler optimizations. However,
 with a bit of devious creativity, it is possible to
 mishandle the return value from rcu_dereference().
-Please see rcu_dereference.txt in this directory for
-more information.
+Please see rcu_dereference.rst for more information.
 
 The rcu_dereference() primitive is used by the
 various "_rcu()" list-traversal primitives, such
@@ -151,7 +150,7 @@ over a rather long period of time, but improvements are always welcome!
 primitives. This is particularly useful in code that
 is common to readers and updaters. However, lockdep
 will complain if you access rcu_dereference() outside
-of an RCU read-side critical section. See lockdep.txt
+of an RCU read-side critical section. See lockdep.rst
 to learn what to do about this.
 
 Of course, neither rcu_dereference() nor the "_rcu()"
@@ -323,7 +322,7 @@ over a rather long period of time, but improvements are always welcome!
 primitives when the update-side lock is held is that doing so
 can be quite helpful in reducing code bloat when common code is
 shared between readers and updaters. Additional primitives
-are provided for this case, as discussed in lockdep.txt.
+are provided for this case, as discussed in lockdep.rst.
 
 One exception to this rule is when data is only ever added to
 the linked data structure, and is never removed during any
@@ -480,4 +479,4 @@ over a rather long period of time, but improvements are always welcome!
 both rcu_barrier() and synchronize_rcu(), if necessary, using
 something like workqueues to to execute them concurrently.
 
-See rcubarrier.txt for more information.
+See rcubarrier.rst for more information.
@@ -10,9 +10,8 @@ A "grace period" must elapse between the two parts, and this grace period
 must be long enough that any readers accessing the item being deleted have
 since dropped their references. For example, an RCU-protected deletion
 from a linked list would first remove the item from the list, wait for
-a grace period to elapse, then free the element. See the
-:ref:`Documentation/RCU/listRCU.rst <list_rcu_doc>` for more information on
-using RCU with linked lists.
+a grace period to elapse, then free the element. See listRCU.rst for more
+information on using RCU with linked lists.
 
 Frequently Asked Questions
 --------------------------
@@ -50,7 +49,7 @@ Frequently Asked Questions
 - If I am running on a uniprocessor kernel, which can only do one
 thing at a time, why should I wait for a grace period?
 
-See :ref:`Documentation/RCU/UP.rst <up_doc>` for more information.
+See UP.rst for more information.
 
 - How can I see where RCU is currently used in the Linux kernel?
 
@@ -64,13 +63,13 @@ Frequently Asked Questions
 
 - What guidelines should I follow when writing code that uses RCU?
 
-See the checklist.txt file in this directory.
+See checklist.rst.
 
 - Why the name "RCU"?
 
 "RCU" stands for "read-copy update".
-:ref:`Documentation/RCU/listRCU.rst <list_rcu_doc>` has more information on where
-this name came from, search for "read-copy update" to find it.
+listRCU.rst has more information on where this name came from, search
+for "read-copy update" to find it.
 
 - I hear that RCU is patented? What is with that?
 
@@ -8,7 +8,7 @@ This section describes how to use hlist_nulls to
 protect read-mostly linked lists and
 objects using SLAB_TYPESAFE_BY_RCU allocations.
 
-Please read the basics in Documentation/RCU/listRCU.rst
+Please read the basics in listRCU.rst.
 
 Using 'nulls'
 =============
@@ -162,6 +162,26 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
 Stall-warning messages may be enabled and disabled completely via
 /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress.
 
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
+--------------------------------
+
+Same as the CONFIG_RCU_CPU_STALL_TIMEOUT parameter but only for
+the expedited grace period. This parameter defines the period
+of time that RCU will wait from the beginning of an expedited
+grace period until it issues an RCU CPU stall warning. This time
+period is normally 20 milliseconds on Android devices. A zero
+value causes the CONFIG_RCU_CPU_STALL_TIMEOUT value to be used,
+after conversion to milliseconds.
+
+This configuration parameter may be changed at runtime via the
+/sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout, however
+this parameter is checked only at the beginning of a cycle. If you
+are in a current stall cycle, setting it to a new value will change
+the timeout for the -next- stall.
+
+Stall-warning messages may be enabled and disabled completely via
+/sys/module/rcupdate/parameters/rcu_cpu_stall_suppress.
+
 RCU_STALL_DELAY_DELTA
 ---------------------
 
@@ -224,7 +224,7 @@ synchronize_rcu()
 be delayed. This property results in system resilience in face
 of denial-of-service attacks. Code using call_rcu() should limit
 update rate in order to gain this same sort of resilience. See
-checklist.txt for some approaches to limiting the update rate.
+checklist.rst for some approaches to limiting the update rate.
 
 rcu_assign_pointer()
 ^^^^^^^^^^^^^^^^^^^^
@@ -318,7 +318,7 @@ rcu_dereference()
 must prohibit. The rcu_dereference_protected() variant takes
 a lockdep expression to indicate which locks must be acquired
 by the caller. If the indicated protection is not provided,
-a lockdep splat is emitted. See Documentation/RCU/Design/Requirements/Requirements.rst
+a lockdep splat is emitted. See Design/Requirements/Requirements.rst
 and the API's code comments for more details and example usage.
 
 .. [2] If the list_for_each_entry_rcu() instance might be used by
@@ -399,8 +399,7 @@ for specialized uses, but are relatively uncommon.
 
 This section shows a simple use of the core RCU API to protect a
 global pointer to a dynamically allocated structure. More-typical
-uses of RCU may be found in :ref:`listRCU.rst <list_rcu_doc>`,
-:ref:`arrayRCU.rst <array_rcu_doc>`, and :ref:`NMI-RCU.rst <NMI_rcu_doc>`.
+uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst.
 ::
 
 struct foo {
@@ -482,10 +481,9 @@ So, to sum up:
 RCU read-side critical sections that might be referencing that
 data item.
 
-See checklist.txt for additional rules to follow when using RCU.
-And again, more-typical uses of RCU may be found in :ref:`listRCU.rst
-<list_rcu_doc>`, :ref:`arrayRCU.rst <array_rcu_doc>`, and :ref:`NMI-RCU.rst
-<NMI_rcu_doc>`.
+See checklist.rst for additional rules to follow when using RCU.
+And again, more-typical uses of RCU may be found in listRCU.rst,
+arrayRCU.rst, and NMI-RCU.rst.
 
 .. _4_whatisRCU:
 
@@ -579,7 +577,7 @@ to avoid having to write your own callback::
 
 kfree_rcu(old_fp, rcu);
 
-Again, see checklist.txt for additional rules governing the use of RCU.
+Again, see checklist.rst for additional rules governing the use of RCU.
 
 .. _5_whatisRCU:
 
@@ -663,7 +661,7 @@ been able to write-acquire the lock otherwise. The smp_mb__after_spinlock()
 promotes synchronize_rcu() to a full memory barrier in compliance with
 the "Memory-Barrier Guarantees" listed in:
 
-Documentation/RCU/Design/Requirements/Requirements.rst
+Design/Requirements/Requirements.rst
 
 It is possible to nest rcu_read_lock(), since reader-writer locks may
 be recursively acquired. Note also that rcu_read_lock() is immune
@@ -4901,6 +4901,18 @@
 
 rcupdate.rcu_cpu_stall_timeout= [KNL]
 Set timeout for RCU CPU stall warning messages.
+The value is in seconds and the maximum allowed
+value is 300 seconds.
+
+rcupdate.rcu_exp_cpu_stall_timeout= [KNL]
+Set timeout for expedited RCU CPU stall warning
+messages. The value is in milliseconds
+and the maximum allowed value is 21000
+milliseconds. Please note that this value is
+adjusted to an arch timer tick resolution.
+Setting this to zero causes the value from
+rcupdate.rcu_cpu_stall_timeout to be used (after
+conversion from seconds to milliseconds).
 
 rcupdate.rcu_expedited= [KNL]
 Use expedited grace-period primitives, for
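The zero-means-fallback rule for rcupdate.rcu_exp_cpu_stall_timeout can be illustrated with a small calculation. This is only a sketch of the documented rule, not the kernel's code: the function name is hypothetical, tick-resolution rounding is ignored, and applying the 21000 ms maximum to the fallback value as well is an assumption that the text above does not state.

```python
EXP_MAX_MS = 21000  # documented maximum for the expedited timeout


def effective_exp_stall_timeout_ms(exp_ms, cpu_stall_timeout_s):
    """Return the expedited stall timeout in milliseconds.

    exp_ms: value of rcupdate.rcu_exp_cpu_stall_timeout (milliseconds).
    cpu_stall_timeout_s: value of rcupdate.rcu_cpu_stall_timeout (seconds).
    """
    if exp_ms == 0:
        # Zero selects the non-expedited timeout, converted to milliseconds.
        exp_ms = cpu_stall_timeout_s * 1000
    # Assumption: the cap is applied after the fallback as well.
    return min(exp_ms, EXP_MAX_MS)
```

For example, an explicit value of 20 stays at 20 ms, while a value of zero combined with a 21-second rcu_cpu_stall_timeout yields 21000 ms.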
@@ -4963,10 +4975,34 @@
 number avoids disturbing real-time workloads,
 but lengthens grace periods.
 
+rcupdate.rcu_task_stall_info= [KNL]
+Set initial timeout in jiffies for RCU task stall
+informational messages, which give some indication
+of the problem for those not patient enough to
+wait for ten minutes. Informational messages are
+only printed prior to the stall-warning message
+for a given grace period. Disable with a value
+less than or equal to zero. Defaults to ten
+seconds. A change in value does not take effect
+until the beginning of the next grace period.
+
+rcupdate.rcu_task_stall_info_mult= [KNL]
+Multiplier for time interval between successive
+RCU task stall informational messages for a given
+RCU tasks grace period. This value is clamped
+to one through ten, inclusive. It defaults to
+the value three, so that the first informational
+message is printed 10 seconds into the grace
+period, the second at 40 seconds, the third at
+160 seconds, and then the stall warning at 600
+seconds would prevent a fourth at 640 seconds.
+
 rcupdate.rcu_task_stall_timeout= [KNL]
-Set timeout in jiffies for RCU task stall warning
-messages. Disable with a value less than or equal
-to zero.
+Set timeout in jiffies for RCU task stall
+warning messages. Disable with a value less
+than or equal to zero. Defaults to ten minutes.
+A change in value does not take effect until
+the beginning of the next grace period.
 
 rcupdate.rcu_self_test= [KNL]
 Run the RCU early boot self tests
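The 10 / 40 / 160 / 640 second progression quoted for rcupdate.rcu_task_stall_info_mult can be reproduced with a short calculation. The sketch below assumes that each new interval equals the elapsed time multiplied by the multiplier, which is inferred from the documented example rather than taken from the kernel source. It works in seconds instead of jiffies for readability and requires a positive timeout; the function name is hypothetical.

```python
def stall_info_times(info_s=10, mult=3, timeout_s=600):
    """Return the times (seconds into the grace period) at which
    informational messages would be printed before the stall warning."""
    mult = max(1, min(10, mult))  # documented clamp: one through ten
    times = []
    if info_s <= 0:
        return times  # informational messages disabled
    t = info_s
    while t < timeout_s:
        times.append(t)
        t += t * mult  # assumed rule: next interval = elapsed * multiplier
    return times
```

With the defaults this yields messages at 10, 40, and 160 seconds; the next candidate, 640 seconds, falls after the 600-second stall warning and is therefore never printed.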
@@ -5385,6 +5421,17 @@
 smart2= [HW]
 Format: <io1>[,<io2>[,...,<io8>]]
 
+smp.csd_lock_timeout= [KNL]
+Specify the period of time in milliseconds
+that smp_call_function() and friends will wait
+for a CPU to release the CSD lock. This is
+useful when diagnosing bugs involving CPUs
+disabling interrupts for extended periods
+of time. Defaults to 5,000 milliseconds, and
+setting a value of zero disables this feature.
+This feature may be more efficiently disabled
+using the csdlock_debug- kernel parameter.
+
 smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices
 smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port
 smsc-ircc2.ircc_sir= [HW] SIR base I/O port
@@ -5616,6 +5663,30 @@
 off: Disable mitigation and remove
 performance impact to RDRAND and RDSEED
 
+srcutree.big_cpu_lim [KNL]
+Specifies the number of CPUs constituting a
+large system, such that srcu_struct structures
+should immediately allocate an srcu_node array.
+This kernel-boot parameter defaults to 128,
+but takes effect only when the low-order four
+bits of srcutree.convert_to_big is equal to 3
+(decide at boot).
+
+srcutree.convert_to_big [KNL]
+Specifies under what conditions an SRCU tree
+srcu_struct structure will be converted to big
+form, that is, with an rcu_node tree:
+
+0: Never.
+1: At init_srcu_struct() time.
+2: When rcutorture decides to.
+3: Decide at boot time (default).
+0x1X: Above plus if high contention.
+
+Either way, the srcu_node tree will be sized based
+on the actual runtime number of CPUs (nr_cpu_ids)
+instead of the compile-time CONFIG_NR_CPUS.
+
 srcutree.counter_wrap_check [KNL]
 Specifies how frequently to check for
 grace-period sequence counter wrap for the
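The srcutree.convert_to_big encoding above packs a mode into the low-order four bits and a contention flag into the 0x10 bit. A minimal decoding sketch follows; the function names and the returned strings are hypothetical and exist only to illustrate the bit layout described in the text.

```python
# Mode names follow the list in the parameter description above.
MODES = {
    0: "never",
    1: "at init_srcu_struct() time",
    2: "when rcutorture decides to",
    3: "decide at boot time",
}


def decode_convert_to_big(value):
    """Split a convert_to_big value into (mode description, contention flag)."""
    mode = value & 0xF                 # low-order four bits select the mode
    contention = bool(value & 0x10)    # 0x10 bit: also convert on high contention
    return MODES.get(mode, "unknown"), contention


def big_cpu_lim_applies(value):
    """srcutree.big_cpu_lim only takes effect when the mode is 3."""
    return (value & 0xF) == 3
```

For instance, a value of 0x13 decodes to "decide at boot time" with contention-based conversion enabled, and is also a setting under which srcutree.big_cpu_lim is consulted.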
@@ -5633,6 +5704,14 @@
 expediting. Set to zero to disable automatic
 expediting.
 
+srcutree.small_contention_lim [KNL]
+Specifies the number of update-side contention
+events per jiffy will be tolerated before
+initiating a conversion of an srcu_struct
+structure to big form. Note that the value of
+srcutree.convert_to_big must have the 0x10 bit
+set for contention-based conversions to occur.
+
 ssbd= [ARM64,HW]
 Speculative Store Bypass Disable control
 
@@ -17,3 +17,4 @@ Security Documentation
 tpm/index
 digsig
 landlock
+secrets/index
103
Documentation/security/secrets/coco.rst
Normal file
@@ -0,0 +1,103 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Confidential Computing secrets
+==============================
+
+This document describes how Confidential Computing secret injection is handled
+from the firmware to the operating system, in the EFI driver and the efi_secret
+kernel module.
+
+
+Introduction
+============
+
+Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
+Virtualization) allows guest owners to inject secrets into the VMs
+memory without the host/hypervisor being able to read them. In SEV,
+secret injection is performed early in the VM launch process, before the
+guest starts running.
+
+The efi_secret kernel module allows userspace applications to access these
+secrets via securityfs.
+
+
+Secret data flow
+================
+
+The guest firmware may reserve a designated memory area for secret injection,
+and publish its location (base GPA and length) in the EFI configuration table
+under a ``LINUX_EFI_COCO_SECRET_AREA_GUID`` entry
+(``adf956ad-e98c-484c-ae11-b51c7d336447``). This memory area should be marked
+by the firmware as ``EFI_RESERVED_TYPE``, and therefore the kernel should not
+be use it for its own purposes.
+
+During the VM's launch, the virtual machine manager may inject a secret to that
+area. In AMD SEV and SEV-ES this is performed using the
+``KVM_SEV_LAUNCH_SECRET`` command (see [sev]_). The strucutre of the injected
+Guest Owner secret data should be a GUIDed table of secret values; the binary
+format is described in ``drivers/virt/coco/efi_secret/efi_secret.c`` under
+"Structure of the EFI secret area".
+
+On kernel start, the kernel's EFI driver saves the location of the secret area
+(taken from the EFI configuration table) in the ``efi.coco_secret`` field.
+Later it checks if the secret area is populated: it maps the area and checks
+whether its content begins with ``EFI_SECRET_TABLE_HEADER_GUID``
+(``1e74f542-71dd-4d66-963e-ef4287ff173b``). If the secret area is populated,
+the EFI driver will autoload the efi_secret kernel module, which exposes the
+secrets to userspace applications via securityfs. The details of the
+efi_secret filesystem interface are in [secrets-coco-abi]_.
+
+
+Application usage example
+=========================
+
+Consider a guest performing computations on encrypted files. The Guest Owner
+provides the decryption key (= secret) using the secret injection mechanism.
+The guest application reads the secret from the efi_secret filesystem and
+proceeds to decrypt the files into memory and then performs the needed
+computations on the content.
+
+In this example, the host can't read the files from the disk image
+because they are encrypted. Host can't read the decryption key because
+it is passed using the secret injection mechanism (= secure channel).
+Host can't read the decrypted content from memory because it's a
+confidential (memory-encrypted) guest.
+
+Here is a simple example for usage of the efi_secret module in a guest
+to which an EFI secret area with 4 secrets was injected during launch::
+
+# ls -la /sys/kernel/security/secrets/coco
+total 0
+drwxr-xr-x 2 root root 0 Jun 28 11:54 .
+drwxr-xr-x 3 root root 0 Jun 28 11:54 ..
+-r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b
+-r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6
+-r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2
+-r--r----- 1 root root 0 Jun 28 11:54 e6f5a162-d67f-4750-a67c-5d065f2a9910
+
+# hd /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910
+00000000 74 68 65 73 65 2d 61 72 65 2d 74 68 65 2d 6b 61 |these-are-the-ka|
+00000010 74 61 2d 73 65 63 72 65 74 73 00 01 02 03 04 05 |ta-secrets......|
+00000020 06 07 |..|
+00000022
+
+# rm /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910
+
+# ls -la /sys/kernel/security/secrets/coco
+total 0
+drwxr-xr-x 2 root root 0 Jun 28 11:55 .
+drwxr-xr-x 3 root root 0 Jun 28 11:54 ..
+-r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b
+-r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6
+-r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2
+
+
+References
+==========
+
+See [sev-api-spec]_ for more info regarding SEV ``LAUNCH_SECRET`` operation.
+
+.. [sev] Documentation/virt/kvm/amd-memory-encryption.rst
+.. [secrets-coco-abi] Documentation/ABI/testing/securityfs-secrets-coco
+.. [sev-api-spec] https://www.amd.com/system/files/TechDocs/55766_SEV-KM_API_Specification.pdf
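The "is the secret area populated" check described in the data-flow section compares the first 16 bytes of the area with ``EFI_SECRET_TABLE_HEADER_GUID``. The sketch below mimics that comparison on a byte buffer. It assumes the usual EFI in-memory GUID layout (first three fields little-endian), which is what Python's `uuid.UUID.bytes_le` produces; the function name is hypothetical and the kernel's actual check is not reproduced here.

```python
import uuid

# GUID value quoted in the document above.
EFI_SECRET_TABLE_HEADER_GUID = uuid.UUID("1e74f542-71dd-4d66-963e-ef4287ff173b")


def secret_area_is_populated(area: bytes) -> bool:
    """Return True if the buffer starts with the secret table header GUID.

    Assumption: the GUID is stored in the EFI mixed-endian layout, so the
    comparison uses bytes_le rather than the big-endian bytes attribute.
    """
    return area[:16] == EFI_SECRET_TABLE_HEADER_GUID.bytes_le
```

A zero-filled area fails the check, while any buffer that begins with the 16 header bytes passes, regardless of what follows.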
9
Documentation/security/secrets/index.rst
Normal file
@@ -0,0 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+Secrets documentation
+=====================
+
+.. toctree::
+
+coco
@@ -35,6 +35,7 @@ config KPROBES
 	depends on MODULES
 	depends on HAVE_KPROBES
 	select KALLSYMS
+	select TASKS_RCU if PREEMPTION
 	help
 	  Kprobes allows you to trap at almost any kernel address and
 	  execute a callback function. register_kprobe() establishes
@@ -163,7 +163,11 @@ extra_header_fields:
 	.long 0x200 # SizeOfHeaders
 	.long 0 # CheckSum
 	.word IMAGE_SUBSYSTEM_EFI_APPLICATION # Subsystem (EFI application)
+#ifdef CONFIG_DXE_MEM_ATTRIBUTES
+	.word IMAGE_DLL_CHARACTERISTICS_NX_COMPAT # DllCharacteristics
+#else
 	.word 0 # DllCharacteristics
+#endif
 #ifdef CONFIG_X86_32
 	.long 0 # SizeOfStackReserve
 	.long 0 # SizeOfStackCommit
@@ -357,6 +357,11 @@ static inline u32 efi64_convert_status(efi_status_t status)
 		runtime), \
 		func, __VA_ARGS__))
 
+#define efi_dxe_call(func, ...) \
+	(efi_is_native() \
+		? efi_dxe_table->func(__VA_ARGS__) \
+		: __efi64_thunk_map(efi_dxe_table, func, __VA_ARGS__))
+
 #else /* CONFIG_EFI_MIXED */
 
 static inline bool efi_is_64bit(void)
@@ -93,6 +93,9 @@ static const unsigned long * const efi_tables[] = {
 #ifdef CONFIG_LOAD_UEFI_KEYS
 	&efi.mokvar_table,
 #endif
+#ifdef CONFIG_EFI_COCO_SECRET
+	&efi.coco_secret,
+#endif
 };
 
 u64 efi_setup; /* efi setup_data physical address */
@@ -91,6 +91,18 @@ config EFI_SOFT_RESERVE
 
 	  If unsure, say Y.
 
+config EFI_DXE_MEM_ATTRIBUTES
+	bool "Adjust memory attributes in EFISTUB"
+	depends on EFI && EFI_STUB && X86
+	default y
+	help
+	  UEFI specification does not guarantee all memory to be
+	  accessible for both write and execute as the kernel expects
+	  it to be.
+	  Use DXE services to check and alter memory protection
+	  attributes during boot via EFISTUB to ensure that memory
+	  ranges used by the kernel are writable and executable.
+
 config EFI_PARAMS_FROM_FDT
 	bool
 	help
@@ -284,3 +296,34 @@ config EFI_CUSTOM_SSDT_OVERLAYS
 
 	  See Documentation/admin-guide/acpi/ssdt-overlays.rst for more
 	  information.
+
+config EFI_DISABLE_RUNTIME
+	bool "Disable EFI runtime services support by default"
+	default y if PREEMPT_RT
+	help
+	  Allow to disable the EFI runtime services support by default. This can
+	  already be achieved by using the efi=noruntime option, but it could be
+	  useful to have this default without any kernel command line parameter.
+
+	  The EFI runtime services are disabled by default when PREEMPT_RT is
+	  enabled, because measurements have shown that some EFI functions calls
+	  might take too much time to complete, causing large latencies which is
+	  an issue for Real-Time kernels.
+
+	  This default can be overridden by using the efi=runtime option.
+
+config EFI_COCO_SECRET
+	bool "EFI Confidential Computing Secret Area Support"
+	depends on EFI
+	help
+	  Confidential Computing platforms (such as AMD SEV) allow the
+	  Guest Owner to securely inject secrets during guest VM launch.
+	  The secrets are placed in a designated EFI reserved memory area.
+
+	  In order to use the secrets in the kernel, the location of the secret
+	  area (as published in the EFI config table) must be kept.
+
+	  If you say Y here, the address of the EFI secret area will be kept
+	  for usage inside the kernel. This will allow the
+	  virt/coco/efi_secret module to access the secrets, which in turn
+	  allows userspace programs to access the injected secrets.
@@ -46,6 +46,9 @@ struct efi __read_mostly efi = {
 #ifdef CONFIG_LOAD_UEFI_KEYS
 	.mokvar_table = EFI_INVALID_TABLE_ADDR,
 #endif
+#ifdef CONFIG_EFI_COCO_SECRET
+	.coco_secret = EFI_INVALID_TABLE_ADDR,
+#endif
 };
 EXPORT_SYMBOL(efi);
 
@@ -66,7 +69,7 @@ struct mm_struct efi_mm = {
 
 struct workqueue_struct *efi_rts_wq;
 
-static bool disable_runtime = IS_ENABLED(CONFIG_PREEMPT_RT);
+static bool disable_runtime = IS_ENABLED(CONFIG_EFI_DISABLE_RUNTIME);
 static int __init setup_noefi(char *arg)
 {
 	disable_runtime = true;
@@ -422,6 +425,11 @@ static int __init efisubsys_init(void)
 	if (efi_enabled(EFI_DBG) && efi_enabled(EFI_PRESERVE_BS_REGIONS))
 		efi_debugfs_init();
 
+#ifdef CONFIG_EFI_COCO_SECRET
+	if (efi.coco_secret != EFI_INVALID_TABLE_ADDR)
+		platform_device_register_simple("efi_secret", 0, NULL, 0);
+#endif
+
 	return 0;
 
 err_remove_group:
@@ -528,6 +536,9 @@ static const efi_config_table_type_t common_tables[] __initconst = {
 #endif
 #ifdef CONFIG_LOAD_UEFI_KEYS
 	{LINUX_EFI_MOK_VARIABLE_TABLE_GUID, &efi.mokvar_table, "MOKvar" },
+#endif
+#ifdef CONFIG_EFI_COCO_SECRET
+	{LINUX_EFI_COCO_SECRET_AREA_GUID, &efi.coco_secret, "CocoSecret" },
 #endif
 	{},
 };
@@ -117,7 +117,8 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
|
||||
unsigned long *image_size,
|
||||
unsigned long *reserve_addr,
|
||||
unsigned long *reserve_size,
|
||||
efi_loaded_image_t *image)
|
||||
efi_loaded_image_t *image,
|
||||
efi_handle_t image_handle)
|
||||
{
|
||||
const int slack = TEXT_OFFSET - 5 * PAGE_SIZE;
|
||||
int alloc_size = MAX_UNCOMP_KERNEL_SIZE + EFI_PHYS_ALIGN;
|
||||
|
||||
@@ -83,7 +83,8 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
|
||||
unsigned long *image_size,
|
||||
unsigned long *reserve_addr,
|
||||
unsigned long *reserve_size,
|
||||
efi_loaded_image_t *image)
|
||||
efi_loaded_image_t *image,
|
||||
efi_handle_t image_handle)
|
||||
{
|
||||
efi_status_t status;
|
||||
unsigned long kernel_size, kernel_memsize = 0;
|
||||
@@ -100,7 +101,15 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
|
||||
u64 min_kimg_align = efi_nokaslr ? MIN_KIMG_ALIGN : EFI_KIMG_ALIGN;
|
||||
|
||||
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
|
||||
if (!efi_nokaslr) {
|
||||
efi_guid_t li_fixed_proto = LINUX_EFI_LOADED_IMAGE_FIXED_GUID;
|
||||
void *p;
|
||||
|
||||
if (efi_nokaslr) {
|
||||
efi_info("KASLR disabled on kernel command line\n");
|
||||
} else if (efi_bs_call(handle_protocol, image_handle,
|
||||
&li_fixed_proto, &p) == EFI_SUCCESS) {
|
||||
efi_info("Image placement fixed by loader\n");
|
||||
} else {
|
||||
status = efi_get_random_bytes(sizeof(phys_seed),
|
||||
(u8 *)&phys_seed);
|
||||
if (status == EFI_NOT_FOUND) {
|
||||
@@ -111,8 +120,6 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
|
||||
status);
|
||||
efi_nokaslr = true;
|
||||
}
|
||||
} else {
|
||||
efi_info("KASLR disabled on kernel command line\n");
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -198,7 +198,7 @@ efi_status_t __efiapi efi_pe_entry(efi_handle_t handle,
|
||||
status = handle_kernel_image(&image_addr, &image_size,
|
||||
&reserve_addr,
|
||||
&reserve_size,
|
||||
image);
|
||||
image, handle);
|
||||
if (status != EFI_SUCCESS) {
|
||||
efi_err("Failed to relocate kernel\n");
|
||||
goto fail_free_screeninfo;
|
||||
|
||||
@@ -36,6 +36,9 @@ extern bool efi_novamap;
|
||||
|
||||
extern const efi_system_table_t *efi_system_table;
|
||||
|
||||
typedef union efi_dxe_services_table efi_dxe_services_table_t;
|
||||
extern const efi_dxe_services_table_t *efi_dxe_table;
|
||||
|
||||
efi_status_t __efiapi efi_pe_entry(efi_handle_t handle,
|
||||
efi_system_table_t *sys_table_arg);
|
||||
|
||||
@@ -44,6 +47,7 @@ efi_status_t __efiapi efi_pe_entry(efi_handle_t handle,
|
||||
#define efi_is_native() (true)
|
||||
#define efi_bs_call(func, ...) efi_system_table->boottime->func(__VA_ARGS__)
|
||||
#define efi_rt_call(func, ...) efi_system_table->runtime->func(__VA_ARGS__)
|
||||
#define efi_dxe_call(func, ...) efi_dxe_table->func(__VA_ARGS__)
|
||||
#define efi_table_attr(inst, attr) (inst->attr)
|
||||
#define efi_call_proto(inst, func, ...) inst->func(inst, ##__VA_ARGS__)
|
||||
|
||||
@@ -329,6 +333,76 @@ union efi_boot_services {
|
||||
} mixed_mode;
|
||||
};
|
||||
|
||||
typedef enum {
|
||||
EfiGcdMemoryTypeNonExistent,
|
||||
EfiGcdMemoryTypeReserved,
|
||||
EfiGcdMemoryTypeSystemMemory,
|
||||
EfiGcdMemoryTypeMemoryMappedIo,
|
||||
EfiGcdMemoryTypePersistent,
|
||||
EfiGcdMemoryTypeMoreReliable,
|
||||
EfiGcdMemoryTypeMaximum
|
||||
} efi_gcd_memory_type_t;
|
||||
|
||||
typedef struct {
|
||||
efi_physical_addr_t base_address;
|
||||
u64 length;
|
||||
u64 capabilities;
|
||||
u64 attributes;
|
||||
efi_gcd_memory_type_t gcd_memory_type;
|
||||
void *image_handle;
|
||||
void *device_handle;
|
||||
} efi_gcd_memory_space_desc_t;
|
||||
|
||||
/*
|
||||
* EFI DXE Services table
|
||||
*/
|
||||
union efi_dxe_services_table {
|
||||
struct {
|
||||
efi_table_hdr_t hdr;
|
||||
void *add_memory_space;
|
||||
void *allocate_memory_space;
|
||||
void *free_memory_space;
|
||||
void *remove_memory_space;
|
||||
efi_status_t (__efiapi *get_memory_space_descriptor)(efi_physical_addr_t,
|
||||
efi_gcd_memory_space_desc_t *);
|
||||
efi_status_t (__efiapi *set_memory_space_attributes)(efi_physical_addr_t,
|
||||
u64, u64);
|
||||
void *get_memory_space_map;
|
||||
void *add_io_space;
|
||||
void *allocate_io_space;
|
||||
void *free_io_space;
|
||||
void *remove_io_space;
|
||||
void *get_io_space_descriptor;
|
||||
void *get_io_space_map;
|
||||
void *dispatch;
|
||||
void *schedule;
|
||||
void *trust;
|
||||
void *process_firmware_volume;
|
||||
void *set_memory_space_capabilities;
|
||||
};
|
||||
struct {
|
||||
efi_table_hdr_t hdr;
|
||||
u32 add_memory_space;
|
||||
u32 allocate_memory_space;
|
||||
u32 free_memory_space;
|
||||
u32 remove_memory_space;
|
||||
u32 get_memory_space_descriptor;
|
||||
u32 set_memory_space_attributes;
|
||||
u32 get_memory_space_map;
|
||||
u32 add_io_space;
|
||||
u32 allocate_io_space;
|
||||
u32 free_io_space;
|
||||
u32 remove_io_space;
|
||||
u32 get_io_space_descriptor;
|
||||
u32 get_io_space_map;
|
||||
u32 dispatch;
|
||||
u32 schedule;
|
||||
u32 trust;
|
||||
u32 process_firmware_volume;
|
||||
u32 set_memory_space_capabilities;
|
||||
} mixed_mode;
|
||||
};
|
||||
|
||||
typedef union efi_uga_draw_protocol efi_uga_draw_protocol_t;
|
||||
|
||||
union efi_uga_draw_protocol {
|
||||
@@ -720,6 +794,13 @@ union efi_tcg2_protocol {
|
||||
} mixed_mode;
|
||||
};
|
||||
|
||||
struct riscv_efi_boot_protocol {
|
||||
u64 revision;
|
||||
|
||||
efi_status_t (__efiapi *get_boot_hartid)(struct riscv_efi_boot_protocol *,
|
||||
unsigned long *boot_hartid);
|
||||
};
|
||||
|
||||
typedef union efi_load_file_protocol efi_load_file_protocol_t;
|
||||
typedef union efi_load_file_protocol efi_load_file2_protocol_t;
|
||||
|
||||
@@ -865,7 +946,8 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
|
||||
unsigned long *image_size,
|
||||
unsigned long *reserve_addr,
|
||||
unsigned long *reserve_size,
|
||||
efi_loaded_image_t *image);
|
||||
efi_loaded_image_t *image,
|
||||
efi_handle_t image_handle);
|
||||
|
||||
asmlinkage void __noreturn efi_enter_kernel(unsigned long entrypoint,
|
||||
unsigned long fdt_addr,
|
||||
|
||||
@@ -56,6 +56,7 @@ efi_status_t efi_random_alloc(unsigned long size,
|
||||
unsigned long random_seed)
|
||||
{
|
||||
unsigned long map_size, desc_size, total_slots = 0, target_slot;
|
||||
unsigned long total_mirrored_slots = 0;
|
||||
unsigned long buff_size;
|
||||
efi_status_t status;
|
||||
efi_memory_desc_t *memory_map;
|
||||
@@ -86,8 +87,14 @@ efi_status_t efi_random_alloc(unsigned long size,
|
||||
slots = get_entry_num_slots(md, size, ilog2(align));
|
||||
MD_NUM_SLOTS(md) = slots;
|
||||
total_slots += slots;
|
||||
if (md->attribute & EFI_MEMORY_MORE_RELIABLE)
|
||||
total_mirrored_slots += slots;
|
||||
}
|
||||
|
||||
/* consider only mirrored slots for randomization if any exist */
|
||||
if (total_mirrored_slots > 0)
|
||||
total_slots = total_mirrored_slots;
|
||||
|
||||
/* find a random number between 0 and total_slots */
|
||||
target_slot = (total_slots * (u64)(random_seed & U32_MAX)) >> 32;
|
||||
|
||||
@@ -107,6 +114,10 @@ efi_status_t efi_random_alloc(unsigned long size,
|
||||
efi_physical_addr_t target;
|
||||
unsigned long pages;
|
||||
|
||||
if (total_mirrored_slots > 0 &&
|
||||
!(md->attribute & EFI_MEMORY_MORE_RELIABLE))
|
||||
continue;
|
||||
|
||||
if (target_slot >= MD_NUM_SLOTS(md)) {
|
||||
target_slot -= MD_NUM_SLOTS(md);
|
||||
continue;
|
||||
|
||||
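The slot selection in the efi_random_alloc() hunks above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct region`, `pick_slot()` and `choose_region()` are hypothetical names that do not exist in the kernel, and the total slot count is assumed to stay below 2^32 so that the multiplication cannot overflow 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory map entry: slot count plus mirror flag. */
struct region {
	uint64_t slots;        /* aligned positions this region can host */
	int more_reliable;     /* nonzero if EFI_MEMORY_MORE_RELIABLE would be set */
};

/* Scale the low 32 bits of seed into [0, total_slots), as the hunk does. */
static uint64_t pick_slot(uint64_t total_slots, uint64_t seed)
{
	return (total_slots * (seed & UINT32_MAX)) >> 32;
}

/*
 * Choose a region and a slot inside it. If any region is mirrored, only
 * mirrored regions are counted and walked. Returns 0 on success, -1 if
 * there are no usable slots.
 */
static int choose_region(const struct region *r, size_t n, uint64_t seed,
			 size_t *region_idx, uint64_t *slot_in_region)
{
	uint64_t total = 0, mirrored = 0, target;
	size_t i;

	assert(r != NULL && region_idx != NULL && slot_in_region != NULL);

	for (i = 0; i < n; i++) {
		total += r[i].slots;
		if (r[i].more_reliable)
			mirrored += r[i].slots;
	}
	if (mirrored > 0)
		total = mirrored;
	if (total == 0)
		return -1;

	target = pick_slot(total, seed);
	for (i = 0; i < n; i++) {
		if (mirrored > 0 && !r[i].more_reliable)
			continue;
		if (target >= r[i].slots) {
			target -= r[i].slots;
			continue;
		}
		*region_idx = i;
		*slot_in_region = target;
		return 0;
	}
	return -1;
}
```

Replacing the total with the mirrored count keeps the selection uniform over the regions that are actually walked, which is why the second loop has to skip non-mirrored entries whenever a mirrored one exists.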
@@ -21,9 +21,9 @@
|
||||
#define MIN_KIMG_ALIGN SZ_4M
|
||||
#endif
|
||||
|
||||
typedef void __noreturn (*jump_kernel_func)(unsigned int, unsigned long);
|
||||
typedef void __noreturn (*jump_kernel_func)(unsigned long, unsigned long);
|
||||
|
||||
static u32 hartid;
|
||||
static unsigned long hartid;
|
||||
|
||||
static int get_boot_hartid_from_fdt(void)
|
||||
{
|
||||
@@ -47,14 +47,31 @@ static int get_boot_hartid_from_fdt(void)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static efi_status_t get_boot_hartid_from_efi(void)
|
||||
{
|
||||
efi_guid_t boot_protocol_guid = RISCV_EFI_BOOT_PROTOCOL_GUID;
|
||||
struct riscv_efi_boot_protocol *boot_protocol;
|
||||
efi_status_t status;
|
||||
|
||||
status = efi_bs_call(locate_protocol, &boot_protocol_guid, NULL,
|
||||
(void **)&boot_protocol);
|
||||
if (status != EFI_SUCCESS)
|
||||
return status;
|
||||
return efi_call_proto(boot_protocol, get_boot_hartid, &hartid);
|
||||
}
|
||||
|
||||
efi_status_t check_platform_features(void)
|
||||
{
|
||||
efi_status_t status;
|
||||
int ret;
|
||||
|
||||
ret = get_boot_hartid_from_fdt();
|
||||
if (ret) {
|
||||
efi_err("/chosen/boot-hartid missing or invalid!\n");
|
||||
return EFI_UNSUPPORTED;
|
||||
status = get_boot_hartid_from_efi();
|
||||
if (status != EFI_SUCCESS) {
|
||||
ret = get_boot_hartid_from_fdt();
|
||||
if (ret) {
|
||||
efi_err("Failed to get boot hartid!\n");
|
||||
return EFI_UNSUPPORTED;
|
||||
}
|
||||
}
|
||||
return EFI_SUCCESS;
|
||||
}
|
||||
@@ -80,7 +97,8 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
|
||||
unsigned long *image_size,
|
||||
unsigned long *reserve_addr,
|
||||
unsigned long *reserve_size,
|
||||
efi_loaded_image_t *image)
|
||||
efi_loaded_image_t *image,
|
||||
efi_handle_t image_handle)
|
||||
{
|
||||
unsigned long kernel_size = 0;
|
||||
unsigned long preferred_addr;
|
||||
|
||||
@@ -22,6 +22,7 @@
|
||||
#define MAXMEM_X86_64_4LEVEL (1ull << 46)
|
||||
|
||||
const efi_system_table_t *efi_system_table;
|
||||
const efi_dxe_services_table_t *efi_dxe_table;
|
||||
extern u32 image_offset;
|
||||
static efi_loaded_image_t *image = NULL;
|
||||
|
||||
@@ -211,9 +212,110 @@ static void retrieve_apple_device_properties(struct boot_params *boot_params)
|
||||
}
|
||||
}
|
||||
|
||||
static void
|
||||
adjust_memory_range_protection(unsigned long start, unsigned long size)
|
||||
{
|
||||
efi_status_t status;
|
||||
efi_gcd_memory_space_desc_t desc;
|
||||
unsigned long end, next;
|
||||
unsigned long rounded_start, rounded_end;
|
||||
unsigned long unprotect_start, unprotect_size;
|
||||
int has_system_memory = 0;
|
||||
|
||||
if (efi_dxe_table == NULL)
|
||||
return;
|
||||
|
||||
rounded_start = rounddown(start, EFI_PAGE_SIZE);
|
||||
rounded_end = roundup(start + size, EFI_PAGE_SIZE);
|
||||
|
||||
/*
|
||||
* Don't modify memory region attributes, they are
|
||||
* already suitable, to lower the possibility to
|
||||
* encounter firmware bugs.
|
||||
*/
|
||||
|
||||
for (end = start + size; start < end; start = next) {
|
||||
|
||||
status = efi_dxe_call(get_memory_space_descriptor, start, &desc);
|
||||
|
||||
if (status != EFI_SUCCESS)
|
||||
return;
|
||||
|
||||
next = desc.base_address + desc.length;
|
||||
|
||||
/*
|
||||
* Only system memory is suitable for trampoline/kernel image placement,
|
||||
* so only this type of memory needs its attributes to be modified.
|
||||
*/
|
||||
|
||||
if (desc.gcd_memory_type != EfiGcdMemoryTypeSystemMemory ||
|
||||
(desc.attributes & (EFI_MEMORY_RO | EFI_MEMORY_XP)) == 0)
|
||||
continue;
|
||||
|
||||
unprotect_start = max(rounded_start, (unsigned long)desc.base_address);
|
||||
unprotect_size = min(rounded_end, next) - unprotect_start;
|
||||
|
||||
status = efi_dxe_call(set_memory_space_attributes,
|
||||
unprotect_start, unprotect_size,
|
||||
EFI_MEMORY_WB);
|
||||
|
||||
if (status != EFI_SUCCESS) {
|
||||
efi_warn("Unable to unprotect memory range [%08lx,%08lx]: %d\n",
|
||||
unprotect_start,
|
||||
unprotect_start + unprotect_size,
|
||||
(int)status);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Trampoline takes 2 pages and can be loaded in first megabyte of memory
|
||||
* with its end placed between 128k and 640k where BIOS might start.
|
||||
* (see arch/x86/boot/compressed/pgtable_64.c)
|
||||
*
|
||||
* We cannot find exact trampoline placement since memory map
|
||||
* can be modified by UEFI, and it can alter the computed address.
|
||||
*/
|
||||
|
||||
#define TRAMPOLINE_PLACEMENT_BASE ((128 - 8)*1024)
|
||||
#define TRAMPOLINE_PLACEMENT_SIZE (640*1024 - (128 - 8)*1024)
|
||||
|
||||
void startup_32(struct boot_params *boot_params);
|
||||
|
||||
static void
|
||||
setup_memory_protection(unsigned long image_base, unsigned long image_size)
|
||||
{
|
||||
/*
|
||||
* Allow execution of possible trampoline used
|
||||
* for switching between 4- and 5-level page tables
|
||||
* and relocated kernel image.
|
||||
*/
|
||||
|
||||
adjust_memory_range_protection(TRAMPOLINE_PLACEMENT_BASE,
|
||||
TRAMPOLINE_PLACEMENT_SIZE);
|
||||
|
||||
#ifdef CONFIG_64BIT
|
||||
if (image_base != (unsigned long)startup_32)
|
||||
adjust_memory_range_protection(image_base, image_size);
|
||||
#else
|
||||
/*
|
||||
* Clear protection flags on a whole range of possible
|
||||
* addresses used for KASLR. We don't need to do that
|
||||
* on x86_64, since KASLR/extraction is performed after
|
||||
* dedicated identity page tables are built and we only
|
||||
* need to remove possible protection on relocated image
|
||||
* itself disregarding further relocations.
|
||||
*/
|
||||
adjust_memory_range_protection(LOAD_PHYSICAL_ADDR,
|
||||
KERNEL_IMAGE_SIZE - LOAD_PHYSICAL_ADDR);
|
||||
#endif
|
||||
}
|
||||
|
||||
static const efi_char16_t apple[] = L"Apple";
|
||||
|
||||
static void setup_quirks(struct boot_params *boot_params)
|
||||
static void setup_quirks(struct boot_params *boot_params,
|
||||
unsigned long image_base,
|
||||
unsigned long image_size)
|
||||
{
|
||||
efi_char16_t *fw_vendor = (efi_char16_t *)(unsigned long)
|
||||
efi_table_attr(efi_system_table, fw_vendor);
|
||||
@@ -222,6 +324,9 @@ static void setup_quirks(struct boot_params *boot_params)
|
||||
if (IS_ENABLED(CONFIG_APPLE_PROPERTIES))
|
||||
retrieve_apple_device_properties(boot_params);
|
||||
}
|
||||
|
||||
if (IS_ENABLED(CONFIG_EFI_DXE_MEM_ATTRIBUTES))
|
||||
setup_memory_protection(image_base, image_size);
|
||||
}
|
||||
|
||||
/*
|
||||
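The per-descriptor clamping done by adjust_memory_range_protection() in the hunk above is a range intersection. The sketch below reproduces only that arithmetic in userspace C; `struct span`, `clamp_to_descriptor()` and `PAGE_SZ` are hypothetical names introduced for illustration, and the 4 KiB page size is assumed to match EFI_PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ 4096ULL   /* assumed to match EFI_PAGE_SIZE (4 KiB) */

/* Hypothetical result type: the sub-range whose attributes would be changed. */
struct span {
	uint64_t start;
	uint64_t size;
};

/*
 * Intersect the page-rounded request [req_start, req_start + req_size)
 * with one descriptor [desc_base, desc_base + desc_len). The descriptor
 * must overlap the rounded request, which holds when it was looked up by
 * an address inside the request.
 */
static struct span clamp_to_descriptor(uint64_t req_start, uint64_t req_size,
				       uint64_t desc_base, uint64_t desc_len)
{
	uint64_t rounded_start = req_start - (req_start % PAGE_SZ);
	uint64_t end = req_start + req_size;
	uint64_t rounded_end = (end + PAGE_SZ - 1) / PAGE_SZ * PAGE_SZ;
	uint64_t next = desc_base + desc_len;
	struct span s;

	assert(desc_base < rounded_end && next > rounded_start);

	s.start = rounded_start > desc_base ? rounded_start : desc_base;
	s.size = (rounded_end < next ? rounded_end : next) - s.start;
	return s;
}
```

With the trampoline window from the hunk (base 120 KiB, size 520 KiB) and two hypothetical 512 KiB descriptors, the first call would cover 0x1E000 up to 0x80000 and the second 0x80000 up to 0xA0000, so no attribute change reaches outside the requested window.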
@@ -341,8 +446,6 @@ static void __noreturn efi_exit(efi_handle_t handle, efi_status_t status)
|
||||
asm("hlt");
|
||||
}
|
||||
|
||||
void startup_32(struct boot_params *boot_params);
|
||||
|
||||
void __noreturn efi_stub_entry(efi_handle_t handle,
|
||||
efi_system_table_t *sys_table_arg,
|
||||
struct boot_params *boot_params);
|
||||
@@ -677,11 +780,17 @@ unsigned long efi_main(efi_handle_t handle,
|
||||
efi_status_t status;
|
||||
|
||||
efi_system_table = sys_table_arg;
|
||||
|
||||
/* Check if we were booted by the EFI firmware */
|
||||
if (efi_system_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
|
||||
efi_exit(handle, EFI_INVALID_PARAMETER);
|
||||
|
||||
efi_dxe_table = get_efi_config_table(EFI_DXE_SERVICES_TABLE_GUID);
|
||||
if (efi_dxe_table &&
|
||||
efi_dxe_table->hdr.signature != EFI_DXE_SERVICES_TABLE_SIGNATURE) {
|
||||
efi_warn("Ignoring DXE services table: invalid signature\n");
|
||||
efi_dxe_table = NULL;
|
||||
}
|
||||
|
||||
/*
|
||||
* If the kernel isn't already loaded at a suitable address,
|
||||
* relocate it.
|
||||
@@ -791,7 +900,7 @@ unsigned long efi_main(efi_handle_t handle,
|
||||
|
||||
setup_efi_pci(boot_params);
|
||||
|
||||
setup_quirks(boot_params);
|
||||
setup_quirks(boot_params, bzimage_addr, buffer_end - buffer_start);
|
||||
|
||||
status = exit_boot(boot_params, handle);
|
||||
if (status != EFI_SUCCESS) {
|
||||
|
||||
@@ -47,4 +47,7 @@ source "drivers/virt/vboxguest/Kconfig"
|
||||
source "drivers/virt/nitro_enclaves/Kconfig"
|
||||
|
||||
source "drivers/virt/acrn/Kconfig"
|
||||
|
||||
source "drivers/virt/coco/efi_secret/Kconfig"
|
||||
|
||||
endif
|
||||
|
||||
@@ -9,3 +9,4 @@ obj-y += vboxguest/
|
||||
|
||||
obj-$(CONFIG_NITRO_ENCLAVES) += nitro_enclaves/
|
||||
obj-$(CONFIG_ACRN_HSM) += acrn/
|
||||
obj-$(CONFIG_EFI_SECRET) += coco/efi_secret/
|
||||
|
||||
16
drivers/virt/coco/efi_secret/Kconfig
Normal file
@@ -0,0 +1,16 @@
# SPDX-License-Identifier: GPL-2.0-only
config EFI_SECRET
	tristate "EFI secret area securityfs support"
	depends on EFI && X86_64
	select EFI_COCO_SECRET
	select SECURITYFS
	help
	  This is a driver for accessing the EFI secret area via securityfs.
	  The EFI secret area is a memory area designated by the firmware for
	  confidential computing secret injection (for example for AMD SEV
	  guests). The driver exposes the secrets as files in
	  <securityfs>/secrets/coco. Files can be read and deleted (deleting
	  a file wipes the secret from memory).

	  To compile this driver as a module, choose M here.
	  The module will be called efi_secret.
|
||||
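The help text above says that secret files can be read and deleted. From userspace that could look roughly like the following sketch, which uses only standard C I/O and unlink(2). `read_and_delete_secret()` is a hypothetical helper, not part of the driver or of any library. On a real guest the path would be a file under <securityfs>/secrets/coco; here it is just a parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read up to cap bytes of one secret file into buf, then unlink the file.
 * Returns the number of bytes read, or -1 if the file could not be opened,
 * read, or removed.
 */
static long read_and_delete_secret(const char *path, unsigned char *buf, size_t cap)
{
	FILE *f;
	size_t n;

	assert(path != NULL && buf != NULL);

	f = fopen(path, "rb");
	if (f == NULL)
		return -1;

	n = fread(buf, 1, cap, f);
	if (ferror(f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Per the help text, deleting the file wipes the secret from memory. */
	if (unlink(path) != 0)
		return -1;

	return (long)n;
}
```

A caller that needs the secret only once would typically consume it this way at startup, so that the plaintext does not stay readable for the lifetime of the guest. The caller remains responsible for clearing its own copy in `buf`.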
2
drivers/virt/coco/efi_secret/Makefile
Normal file
@@ -0,0 +1,2 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-only
|
||||
obj-$(CONFIG_EFI_SECRET) += efi_secret.o
|
||||
349
drivers/virt/coco/efi_secret/efi_secret.c
Normal file
@@ -0,0 +1,349 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
/*
|
||||
* efi_secret module
|
||||
*
|
||||
* Copyright (C) 2022 IBM Corporation
|
||||
* Author: Dov Murik <dovmurik@linux.ibm.com>
|
||||
*/
|
||||
|
||||
/**
|
||||
* DOC: efi_secret: Allow reading EFI confidential computing (coco) secret area
|
||||
* via securityfs interface.
|
||||
*
|
||||
* When the module is loaded (and securityfs is mounted, typically under
|
||||
* /sys/kernel/security), a "secrets/coco" directory is created in securityfs.
|
||||
* In it, a file is created for each secret entry. The name of each such file
|
||||
* is the GUID of the secret entry, and its content is the secret data.
|
||||
*/
|
||||
|
||||
#include <linux/platform_device.h>
|
||||
#include <linux/seq_file.h>
|
||||
#include <linux/fs.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/init.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/io.h>
|
||||
#include <linux/security.h>
|
||||
#include <linux/efi.h>
|
||||
#include <linux/cacheflush.h>
|
||||
|
||||
#define EFI_SECRET_NUM_FILES 64
|
||||
|
||||
struct efi_secret {
|
||||
struct dentry *secrets_dir;
|
||||
struct dentry *fs_dir;
|
||||
struct dentry *fs_files[EFI_SECRET_NUM_FILES];
|
||||
void __iomem *secret_data;
|
||||
u64 secret_data_len;
|
||||
};
|
||||
|
||||
/*
 * Structure of the EFI secret area
 *
 * Offset   Length
 * (bytes)  (bytes)  Usage
 * -------  -------  -----
 *     0       16    Secret table header GUID (must be 1e74f542-71dd-4d66-963e-ef4287ff173b)
 *    16        4    Length in bytes of the entire secret area
 *
 *    20       16    First secret entry's GUID
 *    36        4    First secret entry's length in bytes (= 16 + 4 + x)
 *    40        x    First secret entry's data
 *
 *    40+x     16    Second secret entry's GUID
 *    56+x      4    Second secret entry's length in bytes (= 16 + 4 + y)
 *    60+x      y    Second secret entry's data
 *
 * (... and so on for additional entries)
 *
 * The GUID of each secret entry designates the usage of the secret data.
 */
|
||||
/**
|
||||
* struct secret_header - Header of entire secret area; this should be followed
|
||||
* by instances of struct secret_entry.
|
||||
* @guid: Must be EFI_SECRET_TABLE_HEADER_GUID
|
||||
* @len: Length in bytes of entire secret area, including header
|
||||
*/
|
||||
struct secret_header {
|
||||
efi_guid_t guid;
|
||||
u32 len;
|
||||
} __attribute((packed));
|
||||
|
||||
/**
|
||||
* struct secret_entry - Holds one secret entry
|
||||
* @guid: Secret-specific GUID (or NULL_GUID if this secret entry was deleted)
|
||||
* @len: Length of secret entry, including its guid and len fields
|
||||
* @data: The secret data (full of zeros if this secret entry was deleted)
|
||||
*/
|
||||
struct secret_entry {
|
||||
efi_guid_t guid;
|
||||
u32 len;
|
||||
u8 data[];
|
||||
} __attribute((packed));
|
||||
|
||||
static size_t secret_entry_data_len(struct secret_entry *e)
|
||||
{
|
||||
return e->len - sizeof(*e);
|
||||
}
|
||||
|
||||
static struct efi_secret the_efi_secret;
|
||||
|
||||
static inline struct efi_secret *efi_secret_get(void)
|
||||
{
|
||||
return &the_efi_secret;
|
||||
}
|
||||
|
||||
static int efi_secret_bin_file_show(struct seq_file *file, void *data)
|
||||
{
|
||||
struct secret_entry *e = file->private;
|
||||
|
||||
if (e)
|
||||
seq_write(file, e->data, secret_entry_data_len(e));
|
||||
|
||||
return 0;
|
||||
}
|
||||
DEFINE_SHOW_ATTRIBUTE(efi_secret_bin_file);
|
||||
|
||||
/*
|
||||
* Overwrite memory content with zeroes, and ensure that dirty cache lines are
|
||||
* actually written back to memory, to clear out the secret.
|
||||
*/
|
||||
static void wipe_memory(void *addr, size_t size)
|
||||
{
|
||||
memzero_explicit(addr, size);
|
||||
#ifdef CONFIG_X86
|
||||
clflush_cache_range(addr, size);
|
||||
#endif
|
||||
}
|
||||
|
||||
static int efi_secret_unlink(struct inode *dir, struct dentry *dentry)
|
||||
{
|
||||
struct efi_secret *s = efi_secret_get();
|
||||
struct inode *inode = d_inode(dentry);
|
||||
struct secret_entry *e = (struct secret_entry *)inode->i_private;
|
||||
int i;
|
||||
|
||||
if (e) {
|
||||
/* Zero out the secret data */
|
||||
wipe_memory(e->data, secret_entry_data_len(e));
|
||||
e->guid = NULL_GUID;
|
||||
}
|
||||
|
||||
inode->i_private = NULL;
|
||||
|
||||
for (i = 0; i < EFI_SECRET_NUM_FILES; i++)
|
||||
if (s->fs_files[i] == dentry)
|
||||
s->fs_files[i] = NULL;
|
||||
|
||||
/*
|
||||
* securityfs_remove tries to lock the directory's inode, but we reach
|
||||
* the unlink callback when it's already locked
|
||||
*/
|
||||
inode_unlock(dir);
|
||||
securityfs_remove(dentry);
|
||||
inode_lock(dir);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const struct inode_operations efi_secret_dir_inode_operations = {
|
||||
.lookup = simple_lookup,
|
||||
.unlink = efi_secret_unlink,
|
||||
};
|
||||
|
||||
static int efi_secret_map_area(struct platform_device *dev)
|
||||
{
|
||||
int ret;
|
||||
struct efi_secret *s = efi_secret_get();
|
||||
struct linux_efi_coco_secret_area *secret_area;
|
||||
|
||||
if (efi.coco_secret == EFI_INVALID_TABLE_ADDR) {
|
||||
dev_err(&dev->dev, "Secret area address is not available\n");
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
secret_area = memremap(efi.coco_secret, sizeof(*secret_area), MEMREMAP_WB);
|
||||
if (secret_area == NULL) {
|
||||
dev_err(&dev->dev, "Could not map secret area EFI config entry\n");
|
||||
return -ENOMEM;
|
||||
}
|
||||
if (!secret_area->base_pa || secret_area->size < sizeof(struct secret_header)) {
|
||||
dev_err(&dev->dev,
|
||||
"Invalid secret area memory location (base_pa=0x%llx size=0x%llx)\n",
|
||||
secret_area->base_pa, secret_area->size);
|
||||
ret = -EINVAL;
|
||||
goto unmap;
|
||||
}
|
||||
|
||||
s->secret_data = ioremap_encrypted(secret_area->base_pa, secret_area->size);
|
||||
if (s->secret_data == NULL) {
|
||||
dev_err(&dev->dev, "Could not map secret area\n");
|
||||
ret = -ENOMEM;
|
||||
goto unmap;
|
||||
}
|
||||
|
||||
s->secret_data_len = secret_area->size;
|
||||
ret = 0;
|
||||
|
||||
unmap:
|
||||
memunmap(secret_area);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void efi_secret_securityfs_teardown(struct platform_device *dev)
|
||||
{
|
||||
struct efi_secret *s = efi_secret_get();
|
||||
int i;
|
||||
|
||||
for (i = (EFI_SECRET_NUM_FILES - 1); i >= 0; i--) {
|
||||
securityfs_remove(s->fs_files[i]);
|
||||
s->fs_files[i] = NULL;
|
||||
}
|
||||
|
||||
securityfs_remove(s->fs_dir);
|
||||
s->fs_dir = NULL;
|
||||
|
||||
securityfs_remove(s->secrets_dir);
|
||||
s->secrets_dir = NULL;
|
||||
|
||||
dev_dbg(&dev->dev, "Removed securityfs entries\n");
|
||||
}
|
||||
|
||||
static int efi_secret_securityfs_setup(struct platform_device *dev)
|
||||
{
|
||||
struct efi_secret *s = efi_secret_get();
|
||||
int ret = 0, i = 0, bytes_left;
|
||||
unsigned char *ptr;
|
||||
struct secret_header *h;
|
||||
struct secret_entry *e;
|
||||
struct dentry *dent;
|
||||
char guid_str[EFI_VARIABLE_GUID_LEN + 1];
|
||||
|
||||
ptr = (void __force *)s->secret_data;
|
||||
h = (struct secret_header *)ptr;
|
||||
if (efi_guidcmp(h->guid, EFI_SECRET_TABLE_HEADER_GUID)) {
|
||||
/*
|
||||
* This is not an error: it just means that EFI defines secret
|
||||
* area but it was not populated by the Guest Owner.
|
||||
*/
|
||||
dev_dbg(&dev->dev, "EFI secret area does not start with correct GUID\n");
|
||||
return -ENODEV;
|
||||
}
|
||||
if (h->len < sizeof(*h)) {
|
||||
dev_err(&dev->dev, "EFI secret area reported length is too small\n");
|
||||
return -EINVAL;
|
||||
}
|
||||
if (h->len > s->secret_data_len) {
|
||||
dev_err(&dev->dev, "EFI secret area reported length is too big\n");
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
s->secrets_dir = NULL;
|
||||
s->fs_dir = NULL;
|
||||
memset(s->fs_files, 0, sizeof(s->fs_files));
|
||||
|
||||
dent = securityfs_create_dir("secrets", NULL);
|
||||
if (IS_ERR(dent)) {
|
||||
dev_err(&dev->dev, "Error creating secrets securityfs directory entry err=%ld\n",
|
||||
PTR_ERR(dent));
|
||||
return PTR_ERR(dent);
|
||||
}
|
||||
s->secrets_dir = dent;
|
||||
|
||||
dent = securityfs_create_dir("coco", s->secrets_dir);
|
||||
if (IS_ERR(dent)) {
|
||||
dev_err(&dev->dev, "Error creating coco securityfs directory entry err=%ld\n",
|
||||
PTR_ERR(dent));
|
||||
return PTR_ERR(dent);
|
||||
}
|
||||
d_inode(dent)->i_op = &efi_secret_dir_inode_operations;
|
||||
s->fs_dir = dent;
|
||||
|
||||
bytes_left = h->len - sizeof(*h);
|
||||
ptr += sizeof(*h);
|
||||
while (bytes_left >= (int)sizeof(*e) && i < EFI_SECRET_NUM_FILES) {
|
||||
e = (struct secret_entry *)ptr;
|
||||
if (e->len < sizeof(*e) || e->len > (unsigned int)bytes_left) {
|
||||
dev_err(&dev->dev, "EFI secret area is corrupted\n");
|
||||
ret = -EINVAL;
|
||||
goto err_cleanup;
|
||||
}
|
||||
|
||||
/* Skip deleted entries (which will have NULL_GUID) */
|
||||
if (efi_guidcmp(e->guid, NULL_GUID)) {
|
||||
efi_guid_to_str(&e->guid, guid_str);
|
||||
|
||||
dent = securityfs_create_file(guid_str, 0440, s->fs_dir, (void *)e,
|
||||
&efi_secret_bin_file_fops);
|
||||
if (IS_ERR(dent)) {
|
||||
dev_err(&dev->dev, "Error creating efi_secret securityfs entry\n");
|
||||
ret = PTR_ERR(dent);
|
||||
goto err_cleanup;
|
||||
}
|
||||
|
||||
s->fs_files[i++] = dent;
|
||||
}
|
||||
ptr += e->len;
|
||||
bytes_left -= e->len;
|
||||
}
|
||||
|
||||
dev_info(&dev->dev, "Created %d entries in securityfs secrets/coco\n", i);
|
||||
return 0;
|
||||
|
||||
err_cleanup:
|
||||
efi_secret_securityfs_teardown(dev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void efi_secret_unmap_area(void)
|
||||
{
|
||||
struct efi_secret *s = efi_secret_get();
|
||||
|
||||
if (s->secret_data) {
|
||||
iounmap(s->secret_data);
|
||||
s->secret_data = NULL;
|
||||
s->secret_data_len = 0;
|
||||
}
|
||||
}
|
||||
|
||||
static int efi_secret_probe(struct platform_device *dev)
|
||||
{
|
||||
int ret;
|
||||
|
||||
ret = efi_secret_map_area(dev);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = efi_secret_securityfs_setup(dev);
|
||||
if (ret)
|
||||
goto err_unmap;
|
||||
|
||||
return ret;
|
||||
|
||||
err_unmap:
|
||||
efi_secret_unmap_area();
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int efi_secret_remove(struct platform_device *dev)
|
||||
{
|
||||
efi_secret_securityfs_teardown(dev);
|
||||
efi_secret_unmap_area();
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct platform_driver efi_secret_driver = {
|
||||
.probe = efi_secret_probe,
|
||||
.remove = efi_secret_remove,
|
||||
.driver = {
|
||||
.name = "efi_secret",
|
||||
},
|
||||
};
|
||||
|
||||
module_platform_driver(efi_secret_driver);
|
||||
|
||||
MODULE_DESCRIPTION("Confidential computing EFI secret area access");
|
||||
MODULE_AUTHOR("IBM");
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_ALIAS("platform:efi_secret");
|
||||
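The table walk in efi_secret_securityfs_setup() above can be exercised outside the kernel on a plain byte buffer. The following is a minimal sketch, not the driver's code: `count_secrets()` and `read_le32()` are hypothetical helpers, the 64-file cap of the driver is left out, and the header GUID bytes are written in the byte order that the EFI_GUID() macro would produce in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUID_LEN   16
#define HDR_LEN    (GUID_LEN + 4)   /* GUID + 32-bit length, packed */

/* EFI_SECRET_TABLE_HEADER_GUID in its in-memory (mixed-endian) byte order. */
static const uint8_t header_guid[GUID_LEN] = {
	0x42, 0xf5, 0x74, 0x1e, 0xdd, 0x71, 0x66, 0x4d,
	0x96, 0x3e, 0xef, 0x42, 0x87, 0xff, 0x17, 0x3b,
};

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Count live (non-deleted) entries in a secret area image.
 * Returns -1 if the header GUID is absent or a length field is inconsistent.
 */
static int count_secrets(const uint8_t *area, size_t area_len)
{
	static const uint8_t null_guid[GUID_LEN];
	size_t left;
	uint32_t total, elen;
	const uint8_t *p;
	int count = 0;

	assert(area != NULL);
	if (area_len < HDR_LEN || memcmp(area, header_guid, GUID_LEN) != 0)
		return -1;

	total = read_le32(area + GUID_LEN);
	if (total < HDR_LEN || total > area_len)
		return -1;

	p = area + HDR_LEN;
	left = total - HDR_LEN;
	while (left >= HDR_LEN) {
		elen = read_le32(p + GUID_LEN);
		if (elen < HDR_LEN || elen > left)
			return -1;      /* corrupted entry length */
		if (memcmp(p, null_guid, GUID_LEN) != 0)
			count++;        /* deleted entries carry an all-zero GUID */
		p += elen;
		left -= elen;
	}
	return count;
}
```

Checking each entry length against the bytes that remain, before advancing, is what keeps a corrupted table from moving the cursor outside the mapped area. The driver performs the same check before it creates a file for an entry.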
@@ -213,6 +213,8 @@ struct capsule_info {
|
||||
size_t page_bytes_remain;
|
||||
};
|
||||
|
||||
int efi_capsule_setup_info(struct capsule_info *cap_info, void *kbuff,
|
||||
size_t hdr_bytes);
|
||||
int __efi_capsule_setup_info(struct capsule_info *cap_info);
|
||||
|
||||
/*
|
||||
@@ -383,6 +385,7 @@ void efi_native_runtime_setup(void);
|
||||
#define EFI_LOAD_FILE_PROTOCOL_GUID EFI_GUID(0x56ec3091, 0x954c, 0x11d2, 0x8e, 0x3f, 0x00, 0xa0, 0xc9, 0x69, 0x72, 0x3b)
|
||||
#define EFI_LOAD_FILE2_PROTOCOL_GUID EFI_GUID(0x4006c0c1, 0xfcb3, 0x403e, 0x99, 0x6d, 0x4a, 0x6c, 0x87, 0x24, 0xe0, 0x6d)
|
||||
#define EFI_RT_PROPERTIES_TABLE_GUID EFI_GUID(0xeb66918a, 0x7eef, 0x402a, 0x84, 0x2e, 0x93, 0x1d, 0x21, 0xc3, 0x8a, 0xe9)
|
||||
#define EFI_DXE_SERVICES_TABLE_GUID EFI_GUID(0x05ad34ba, 0x6f02, 0x4214, 0x95, 0x2e, 0x4d, 0xa0, 0x39, 0x8e, 0x2b, 0xb9)
|
||||
|
||||
#define EFI_IMAGE_SECURITY_DATABASE_GUID EFI_GUID(0xd719b2cb, 0x3d3a, 0x4596, 0xa3, 0xbc, 0xda, 0xd0, 0x0e, 0x67, 0x65, 0x6f)
|
||||
#define EFI_SHIM_LOCK_GUID EFI_GUID(0x605dab50, 0xe046, 0x4300, 0xab, 0xb6, 0x3d, 0xd8, 0x10, 0xdd, 0x8b, 0x23)
|
||||
@@ -405,6 +408,20 @@ void efi_native_runtime_setup(void);
|
||||
#define LINUX_EFI_MEMRESERVE_TABLE_GUID EFI_GUID(0x888eb0c6, 0x8ede, 0x4ff5, 0xa8, 0xf0, 0x9a, 0xee, 0x5c, 0xb9, 0x77, 0xc2)
|
||||
#define LINUX_EFI_INITRD_MEDIA_GUID EFI_GUID(0x5568e427, 0x68fc, 0x4f3d, 0xac, 0x74, 0xca, 0x55, 0x52, 0x31, 0xcc, 0x68)
|
||||
#define LINUX_EFI_MOK_VARIABLE_TABLE_GUID EFI_GUID(0xc451ed2b, 0x9694, 0x45d3, 0xba, 0xba, 0xed, 0x9f, 0x89, 0x88, 0xa3, 0x89)
|
||||
#define LINUX_EFI_COCO_SECRET_AREA_GUID EFI_GUID(0xadf956ad, 0xe98c, 0x484c, 0xae, 0x11, 0xb5, 0x1c, 0x7d, 0x33, 0x64, 0x47)
|
||||
|
||||
#define RISCV_EFI_BOOT_PROTOCOL_GUID EFI_GUID(0xccd15fec, 0x6f73, 0x4eec, 0x83, 0x95, 0x3e, 0x69, 0xe4, 0xb9, 0x40, 0xbf)
|
||||
|
||||
/*
|
||||
* This GUID may be installed onto the kernel image's handle as a NULL protocol
|
||||
* to signal to the stub that the placement of the image should be respected,
|
||||
* and moving the image in physical memory is undesirable. To ensure
|
||||
* compatibility with 64k pages kernels with virtually mapped stacks, and to
|
||||
* avoid defeating physical randomization, this protocol should only be
|
||||
* installed if the image was placed at a randomized 128k aligned address in
|
||||
* memory.
|
||||
*/
|
||||
#define LINUX_EFI_LOADED_IMAGE_FIXED_GUID EFI_GUID(0xf5a37b6d, 0x3344, 0x42a5, 0xb6, 0xbb, 0x97, 0x86, 0x48, 0xc1, 0x89, 0x0a)
|
||||
|
||||
/* OEM GUIDs */
|
||||
#define DELLEMC_EFI_RCI2_TABLE_GUID EFI_GUID(0x2d9f28a2, 0xa886, 0x456a, 0x97, 0xa8, 0xf1, 0x1e, 0xf2, 0x4f, 0xf4, 0x55)
|
||||
@@ -435,6 +452,7 @@ typedef struct {
|
||||
} efi_config_table_type_t;
|
||||
|
||||
#define EFI_SYSTEM_TABLE_SIGNATURE ((u64)0x5453595320494249ULL)
|
||||
#define EFI_DXE_SERVICES_TABLE_SIGNATURE ((u64)0x565245535f455844ULL)
|
||||
|
||||
#define EFI_2_30_SYSTEM_TABLE_REVISION ((2 << 16) | (30))
|
||||
#define EFI_2_20_SYSTEM_TABLE_REVISION ((2 << 16) | (20))
|
||||
@@ -596,6 +614,7 @@ extern struct efi {
|
||||
unsigned long tpm_log; /* TPM2 Event Log table */
|
||||
unsigned long tpm_final_log; /* TPM2 Final Events Log table */
|
||||
unsigned long mokvar_table; /* MOK variable config table */
|
||||
unsigned long coco_secret; /* Confidential computing secret table */
|
||||
|
||||
efi_get_time_t *get_time;
|
||||
efi_set_time_t *set_time;
|
||||
@@ -1335,4 +1354,12 @@ extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
|
||||
static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt) { }
|
||||
#endif
|
||||
|
||||
struct linux_efi_coco_secret_area {
|
||||
u64 base_pa;
|
||||
u64 size;
|
||||
};
|
||||
|
||||
/* Header of a populated EFI secret area */
|
||||
#define EFI_SECRET_TABLE_HEADER_GUID EFI_GUID(0x1e74f542, 0x71dd, 0x4d66, 0x96, 0x3e, 0xef, 0x42, 0x87, 0xff, 0x17, 0x3b)
|
||||
|
||||
#endif /* _LINUX_EFI_H */
|
||||
|
||||
@@ -196,6 +196,7 @@ void synchronize_rcu_tasks_rude(void);
void exit_tasks_rcu_start(void);
void exit_tasks_rcu_finish(void);
#else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
#define rcu_tasks_classic_qs(t, preempt) do { } while (0)
#define rcu_tasks_qs(t, preempt) do { } while (0)
#define rcu_note_voluntary_context_switch(t) do { } while (0)
#define call_rcu_tasks call_rcu

@@ -2126,6 +2126,47 @@ static inline void cond_resched_rcu(void)
#endif
}

#ifdef CONFIG_PREEMPT_DYNAMIC

extern bool preempt_model_none(void);
extern bool preempt_model_voluntary(void);
extern bool preempt_model_full(void);

#else

static inline bool preempt_model_none(void)
{
	return IS_ENABLED(CONFIG_PREEMPT_NONE);
}
static inline bool preempt_model_voluntary(void)
{
	return IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY);
}
static inline bool preempt_model_full(void)
{
	return IS_ENABLED(CONFIG_PREEMPT);
}

#endif

static inline bool preempt_model_rt(void)
{
	return IS_ENABLED(CONFIG_PREEMPT_RT);
}

/*
 * Does the preemption model allow non-cooperative preemption?
 *
 * For !CONFIG_PREEMPT_DYNAMIC kernels this is an exact match with
 * CONFIG_PREEMPTION; for CONFIG_PREEMPT_DYNAMIC this doesn't work as the
 * kernel is *built* with CONFIG_PREEMPTION=y but may run with e.g. the
 * PREEMPT_NONE model.
 */
static inline bool preempt_model_preemptible(void)
{
	return preempt_model_full() || preempt_model_rt();
}
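
As a rough user-space illustration of the predicate above (not kernel code): the sketch below collapses the four models into one hypothetical enum, `preempt_mode_sketch`, and answers the same question as `preempt_model_preemptible()`. In the kernel the RT model is a build-time property and the other three may be selected at boot under CONFIG_PREEMPT_DYNAMIC, so the single enum is a simplifying assumption, and every `sketch_`/`PREEMPT_SKETCH_` name is invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the preemption model in effect at run time. */
enum preempt_mode_sketch {
	PREEMPT_SKETCH_NONE,
	PREEMPT_SKETCH_VOLUNTARY,
	PREEMPT_SKETCH_FULL,
	PREEMPT_SKETCH_RT,
};

/* Only the full and RT models allow non-cooperative preemption. */
static bool sketch_model_preemptible(enum preempt_mode_sketch mode)
{
	switch (mode) {
	case PREEMPT_SKETCH_FULL:
	case PREEMPT_SKETCH_RT:
		return true;
	case PREEMPT_SKETCH_NONE:
	case PREEMPT_SKETCH_VOLUNTARY:
		return false;
	}
	assert(!"unknown preemption mode");
	return false;
}
```

A caller that needs to know whether a task can be preempted involuntarily would test the run-time model this way rather than testing a build-time option.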

/*
 * Does a critical section need to be broken due to another
 * task waiting?: (technically does not depend on CONFIG_PREEMPTION,

@@ -47,11 +47,9 @@ struct srcu_data {
 */
struct srcu_node {
	spinlock_t __private lock;
	unsigned long srcu_have_cbs[4]; /* GP seq for children */
					/* having CBs, but only */
					/* is > ->srcu_gp_seq. */
	unsigned long srcu_data_have_cbs[4]; /* Which srcu_data structs */
					/* have CBs for given GP? */
	unsigned long srcu_have_cbs[4]; /* GP seq for children having CBs, but only */
					/* if greater than ->srcu_gp_seq. */
	unsigned long srcu_data_have_cbs[4]; /* Which srcu_data structs have CBs for given GP? */
	unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
	struct srcu_node *srcu_parent; /* Next up in tree. */
	int grplo; /* Least CPU for node. */
@@ -62,18 +60,24 @@ struct srcu_node {
 * Per-SRCU-domain structure, similar in function to rcu_state.
 */
struct srcu_struct {
	struct srcu_node node[NUM_RCU_NODES]; /* Combining tree. */
	struct srcu_node *node; /* Combining tree. */
	struct srcu_node *level[RCU_NUM_LVLS + 1];
					/* First node at each level. */
	int srcu_size_state; /* Small-to-big transition state. */
	struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
	spinlock_t __private lock; /* Protect counters */
	spinlock_t __private lock; /* Protect counters and size state. */
	struct mutex srcu_gp_mutex; /* Serialize GP work. */
	unsigned int srcu_idx; /* Current rdr array element. */
	unsigned long srcu_gp_seq; /* Grace-period seq #. */
	unsigned long srcu_gp_seq_needed; /* Latest gp_seq needed. */
	unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
	unsigned long srcu_gp_start; /* Last GP start timestamp (jiffies) */
	unsigned long srcu_last_gp_end; /* Last GP end timestamp (ns) */
	unsigned long srcu_size_jiffies; /* Current contention-measurement interval. */
	unsigned long srcu_n_lock_retries; /* Contention events in current interval. */
	unsigned long srcu_n_exp_nodelay; /* # expedited no-delays in current GP phase. */
	struct srcu_data __percpu *sda; /* Per-CPU srcu_data array. */
	bool sda_is_static; /* May ->sda be passed to free_percpu()? */
	unsigned long srcu_barrier_seq; /* srcu_barrier seq #. */
	struct mutex srcu_barrier_mutex; /* Serialize barrier ops. */
	struct completion srcu_barrier_completion;
@@ -81,10 +85,23 @@ struct srcu_struct {
	atomic_t srcu_barrier_cpu_cnt; /* # CPUs not yet posting a */
					/* callback for the barrier */
					/* operation. */
	unsigned long reschedule_jiffies;
	unsigned long reschedule_count;
	struct delayed_work work;
	struct lockdep_map dep_map;
};

/* Values for size state variable (->srcu_size_state). */
#define SRCU_SIZE_SMALL 0
#define SRCU_SIZE_ALLOC 1
#define SRCU_SIZE_WAIT_BARRIER 2
#define SRCU_SIZE_WAIT_CALL 3
#define SRCU_SIZE_WAIT_CBS1 4
#define SRCU_SIZE_WAIT_CBS2 5
#define SRCU_SIZE_WAIT_CBS3 6
#define SRCU_SIZE_WAIT_CBS4 7
#define SRCU_SIZE_BIG 8

/* Values for state variable (bottom bits of ->srcu_gp_seq). */
#define SRCU_STATE_IDLE 0
#define SRCU_STATE_SCAN1 1
@@ -121,6 +138,7 @@ struct srcu_struct {
#ifdef MODULE
# define __DEFINE_SRCU(name, is_static) \
	is_static struct srcu_struct name; \
	extern struct srcu_struct * const __srcu_struct_##name; \
	struct srcu_struct * const __srcu_struct_##name \
		__section("___srcu_struct_ptrs") = &name
#else

@@ -118,7 +118,7 @@ void _torture_stop_kthread(char *m, struct task_struct **tp);
	_torture_stop_kthread("Stopping " #n " task", &(tp))

#ifdef CONFIG_PREEMPTION
#define torture_preempt_schedule() preempt_schedule()
#define torture_preempt_schedule() __preempt_schedule()
#else
#define torture_preempt_schedule() do { } while (0)
#endif

@@ -27,6 +27,7 @@ config BPF_SYSCALL
	bool "Enable bpf() system call"
	select BPF
	select IRQ_WORK
	select TASKS_RCU if PREEMPTION
	select TASKS_TRACE_RCU
	select BINARY_PRINTF
	select NET_SOCK_MSG if NET

@@ -77,31 +77,56 @@ config TASKS_RCU_GENERIC
	  This option enables generic infrastructure code supporting
	  task-based RCU implementations. Not for manual selection.

config TASKS_RCU
	def_bool PREEMPTION
config FORCE_TASKS_RCU
	bool "Force selection of TASKS_RCU"
	depends on RCU_EXPERT
	select TASKS_RCU
	default n
	help
	  This option enables a task-based RCU implementation that uses
	  only voluntary context switch (not preemption!), idle, and
	  user-mode execution as quiescent states. Not for manual selection.
	  This option force-enables a task-based RCU implementation
	  that uses only voluntary context switch (not preemption!),
	  idle, and user-mode execution as quiescent states. Not for
	  manual selection in most cases.

config TASKS_RCU
	bool
	default n
	select IRQ_WORK

config FORCE_TASKS_RUDE_RCU
	bool "Force selection of Tasks Rude RCU"
	depends on RCU_EXPERT
	select TASKS_RUDE_RCU
	default n
	help
	  This option force-enables a task-based RCU implementation
	  that uses only context switch (including preemption) and
	  user-mode execution as quiescent states. It forces IPIs and
	  context switches on all online CPUs, including idle ones,
	  so use with caution. Not for manual selection in most cases.

config TASKS_RUDE_RCU
	def_bool 0
	help
	  This option enables a task-based RCU implementation that uses
	  only context switch (including preemption) and user-mode
	  execution as quiescent states. It forces IPIs and context
	  switches on all online CPUs, including idle ones, so use
	  with caution.

config TASKS_TRACE_RCU
	def_bool 0
	bool
	default n
	select IRQ_WORK

config FORCE_TASKS_TRACE_RCU
	bool "Force selection of Tasks Trace RCU"
	depends on RCU_EXPERT
	select TASKS_TRACE_RCU
	default n
	help
	  This option enables a task-based RCU implementation that uses
	  explicit rcu_read_lock_trace() read-side markers, and allows
	  these readers to appear in the idle loop as well as on the CPU
	  hotplug code paths. It can force IPIs on online CPUs, including
	  idle ones, so use with caution.
	  these readers to appear in the idle loop as well as on the
	  CPU hotplug code paths. It can force IPIs on online CPUs,
	  including idle ones, so use with caution. Not for manual
	  selection in most cases.

config TASKS_TRACE_RCU
	bool
	default n
	select IRQ_WORK

config RCU_STALL_COMMON
	def_bool TREE_RCU
@@ -195,6 +220,20 @@ config RCU_BOOST_DELAY

	  Accept the default if unsure.

config RCU_EXP_KTHREAD
	bool "Perform RCU expedited work in a real-time kthread"
	depends on RCU_BOOST && RCU_EXPERT
	default !PREEMPT_RT && NR_CPUS <= 32
	help
	  Use this option to further reduce the latencies of expedited
	  grace periods at the expense of being more disruptive.

	  This option is disabled by default on PREEMPT_RT=y kernels which
	  disable expedited grace periods after boot by unconditionally
	  setting rcupdate.rcu_normal_after_boot=1.

	  Accept the default if unsure.

config RCU_NOCB_CPU
	bool "Offload RCU callback processing from boot-selected CPUs"
	depends on TREE_RCU
@@ -225,7 +264,7 @@ config RCU_NOCB_CPU

config TASKS_TRACE_RCU_READ_MB
	bool "Tasks Trace RCU readers use memory barriers in user and idle"
	depends on RCU_EXPERT
	depends on RCU_EXPERT && TASKS_TRACE_RCU
	default PREEMPT_RT || NR_CPUS < 8
	help
	  Use this option to further reduce the number of IPIs sent

@@ -28,9 +28,6 @@ config RCU_SCALE_TEST
	depends on DEBUG_KERNEL
	select TORTURE_TEST
	select SRCU
	select TASKS_RCU
	select TASKS_RUDE_RCU
	select TASKS_TRACE_RCU
	default n
	help
	  This option provides a kernel module that runs performance
@@ -47,9 +44,6 @@ config RCU_TORTURE_TEST
	depends on DEBUG_KERNEL
	select TORTURE_TEST
	select SRCU
	select TASKS_RCU
	select TASKS_RUDE_RCU
	select TASKS_TRACE_RCU
	default n
	help
	  This option provides a kernel module that runs torture tests
@@ -66,9 +60,6 @@ config RCU_REF_SCALE_TEST
	depends on DEBUG_KERNEL
	select TORTURE_TEST
	select SRCU
	select TASKS_RCU
	select TASKS_RUDE_RCU
	select TASKS_TRACE_RCU
	default n
	help
	  This option provides a kernel module that runs performance tests
@@ -91,6 +82,20 @@ config RCU_CPU_STALL_TIMEOUT
	  RCU grace period persists, additional CPU stall warnings are
	  printed at more widely spaced intervals.

config RCU_EXP_CPU_STALL_TIMEOUT
	int "Expedited RCU CPU stall timeout in milliseconds"
	depends on RCU_STALL_COMMON
	range 0 21000
	default 20 if ANDROID
	default 0 if !ANDROID
	help
	  If a given expedited RCU grace period extends more than the
	  specified number of milliseconds, a CPU stall warning is printed.
	  If the RCU grace period persists, additional CPU stall warnings
	  are printed at more widely spaced intervals. A value of zero
	  says to use the RCU_CPU_STALL_TIMEOUT value converted from
	  seconds to milliseconds.
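
The zero-means-fallback rule in the help text above can be sketched in plain C. This is only an illustration of the documented behaviour, not the kernel's `rcu_exp_jiffies_till_stall_check()`; the function and parameter names are hypothetical, and any extra clamping the kernel may apply is left out.

```c
#include <assert.h>

/*
 * Hypothetical helper: pick the effective expedited stall timeout in
 * milliseconds. exp_timeout_ms models RCU_EXP_CPU_STALL_TIMEOUT and
 * stall_timeout_s models RCU_CPU_STALL_TIMEOUT (seconds).
 */
static int sketch_exp_stall_timeout_ms(int exp_timeout_ms, int stall_timeout_s)
{
	/* The Kconfig entry restricts the value to the range 0..21000. */
	assert(exp_timeout_ms >= 0 && exp_timeout_ms <= 21000);

	/* Zero means: reuse the normal timeout, converted to milliseconds. */
	if (exp_timeout_ms == 0)
		return stall_timeout_s * 1000;
	return exp_timeout_ms;
}
```

With the defaults shown above, an ANDROID build would use 20 ms directly, while other builds would inherit whatever RCU_CPU_STALL_TIMEOUT is set to.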

config RCU_TRACE
	bool "Enable tracing for RCU"
	depends on DEBUG_KERNEL

@@ -210,7 +210,9 @@ static inline bool rcu_stall_is_suppressed_at_boot(void)
extern int rcu_cpu_stall_ftrace_dump;
extern int rcu_cpu_stall_suppress;
extern int rcu_cpu_stall_timeout;
extern int rcu_exp_cpu_stall_timeout;
int rcu_jiffies_till_stall_check(void);
int rcu_exp_jiffies_till_stall_check(void);

static inline bool rcu_stall_is_suppressed(void)
{
@@ -523,6 +525,8 @@ static inline bool rcu_check_boost_fail(unsigned long gp_state, int *cpup) { ret
static inline void show_rcu_gp_kthreads(void) { }
static inline int rcu_get_gp_kthreads_prio(void) { return 0; }
static inline void rcu_fwd_progress_check(unsigned long j) { }
static inline void rcu_gp_slow_register(atomic_t *rgssp) { }
static inline void rcu_gp_slow_unregister(atomic_t *rgssp) { }
#else /* #ifdef CONFIG_TINY_RCU */
bool rcu_dynticks_zero_in_eqs(int cpu, int *vp);
unsigned long rcu_get_gp_seq(void);
@@ -534,14 +538,19 @@ int rcu_get_gp_kthreads_prio(void);
void rcu_fwd_progress_check(unsigned long j);
void rcu_force_quiescent_state(void);
extern struct workqueue_struct *rcu_gp_wq;
#ifdef CONFIG_RCU_EXP_KTHREAD
extern struct kthread_worker *rcu_exp_gp_kworker;
extern struct kthread_worker *rcu_exp_par_gp_kworker;
#else /* !CONFIG_RCU_EXP_KTHREAD */
extern struct workqueue_struct *rcu_par_gp_wq;
#endif /* CONFIG_RCU_EXP_KTHREAD */
void rcu_gp_slow_register(atomic_t *rgssp);
void rcu_gp_slow_unregister(atomic_t *rgssp);
#endif /* #else #ifdef CONFIG_TINY_RCU */

#ifdef CONFIG_RCU_NOCB_CPU
bool rcu_is_nocb_cpu(int cpu);
void rcu_bind_current_to_nocb(void);
#else
static inline bool rcu_is_nocb_cpu(int cpu) { return false; }
static inline void rcu_bind_current_to_nocb(void) { }
#endif

@@ -505,10 +505,10 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
		WRITE_ONCE(rsclp->tails[j], rsclp->tails[RCU_DONE_TAIL]);

	/*
	 * Callbacks moved, so clean up the misordered ->tails[] pointers
	 * that now point into the middle of the list of ready-to-invoke
	 * callbacks. The overall effect is to copy down the later pointers
	 * into the gap that was created by the now-ready segments.
	 * Callbacks moved, so there might be an empty RCU_WAIT_TAIL
	 * and a non-empty RCU_NEXT_READY_TAIL. If so, copy the
	 * RCU_NEXT_READY_TAIL segment to fill the RCU_WAIT_TAIL gap
	 * created by the now-ready-to-invoke segments.
	 */
	for (j = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++, j++) {
		if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])

@@ -268,6 +268,8 @@ static struct rcu_scale_ops srcud_ops = {
	.name = "srcud"
};

#ifdef CONFIG_TASKS_RCU

/*
 * Definitions for RCU-tasks scalability testing.
 */
@@ -295,6 +297,16 @@ static struct rcu_scale_ops tasks_ops = {
	.name = "tasks"
};

#define TASKS_OPS &tasks_ops,

#else // #ifdef CONFIG_TASKS_RCU

#define TASKS_OPS

#endif // #else // #ifdef CONFIG_TASKS_RCU
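
The `TASKS_OPS` pattern above makes an array element optional at compile time: the macro expands either to `&tasks_ops,` (note the trailing comma) or to nothing, so the initializer list stays well-formed in both configurations. A minimal stand-alone sketch of the same idea follows; `SKETCH_HAVE_TASKS` and all `sketch_` names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_ops {
	const char *name;
};

static struct sketch_ops sketch_rcu_ops = { .name = "rcu" };
static struct sketch_ops sketch_srcu_ops = { .name = "srcu" };

#ifdef SKETCH_HAVE_TASKS
static struct sketch_ops sketch_tasks_ops = { .name = "tasks" };
/* The trailing comma is part of the macro so it can vanish cleanly. */
#define SKETCH_TASKS_OPS &sketch_tasks_ops,
#else
#define SKETCH_TASKS_OPS
#endif

/* The optional entry sits between two mandatory ones. */
static struct sketch_ops *sketch_all_ops[] = {
	&sketch_rcu_ops, SKETCH_TASKS_OPS &sketch_srcu_ops,
};

/* The two mandatory entries are present in every configuration. */
static_assert(sizeof(sketch_all_ops) / sizeof(sketch_all_ops[0]) >= 2,
	      "mandatory ops entries missing");

static size_t sketch_num_ops(void)
{
	return sizeof(sketch_all_ops) / sizeof(sketch_all_ops[0]);
}
```

Building with `-DSKETCH_HAVE_TASKS` grows the table to three entries without touching the array definition, which is presumably why the diff uses this form instead of wrapping array elements in `#ifdef`.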

#ifdef CONFIG_TASKS_TRACE_RCU

/*
 * Definitions for RCU-tasks-trace scalability testing.
 */
@@ -324,6 +336,14 @@ static struct rcu_scale_ops tasks_tracing_ops = {
	.name = "tasks-tracing"
};

#define TASKS_TRACING_OPS &tasks_tracing_ops,

#else // #ifdef CONFIG_TASKS_TRACE_RCU

#define TASKS_TRACING_OPS

#endif // #else // #ifdef CONFIG_TASKS_TRACE_RCU

static unsigned long rcuscale_seq_diff(unsigned long new, unsigned long old)
{
	if (!cur_ops->gp_diff)
@@ -797,7 +817,7 @@ rcu_scale_init(void)
	long i;
	int firsterr = 0;
	static struct rcu_scale_ops *scale_ops[] = {
		&rcu_ops, &srcu_ops, &srcud_ops, &tasks_ops, &tasks_tracing_ops
		&rcu_ops, &srcu_ops, &srcud_ops, TASKS_OPS TASKS_TRACING_OPS
	};

	if (!torture_init_begin(scale_type, verbose))

@@ -737,6 +737,50 @@ static struct rcu_torture_ops busted_srcud_ops = {
	.name = "busted_srcud"
};

/*
 * Definitions for trivial CONFIG_PREEMPT=n-only torture testing.
 * This implementation does not necessarily work well with CPU hotplug.
 */

static void synchronize_rcu_trivial(void)
{
	int cpu;

	for_each_online_cpu(cpu) {
		rcutorture_sched_setaffinity(current->pid, cpumask_of(cpu));
		WARN_ON_ONCE(raw_smp_processor_id() != cpu);
	}
}

static int rcu_torture_read_lock_trivial(void) __acquires(RCU)
{
	preempt_disable();
	return 0;
}

static void rcu_torture_read_unlock_trivial(int idx) __releases(RCU)
{
	preempt_enable();
}

static struct rcu_torture_ops trivial_ops = {
	.ttype = RCU_TRIVIAL_FLAVOR,
	.init = rcu_sync_torture_init,
	.readlock = rcu_torture_read_lock_trivial,
	.read_delay = rcu_read_delay, /* just reuse rcu's version. */
	.readunlock = rcu_torture_read_unlock_trivial,
	.readlock_held = torture_readlock_not_held,
	.get_gp_seq = rcu_no_completed,
	.sync = synchronize_rcu_trivial,
	.exp_sync = synchronize_rcu_trivial,
	.fqs = NULL,
	.stats = NULL,
	.irq_capable = 1,
	.name = "trivial"
};

#ifdef CONFIG_TASKS_RCU

/*
 * Definitions for RCU-tasks torture testing.
 */
@@ -780,47 +824,16 @@ static struct rcu_torture_ops tasks_ops = {
	.name = "tasks"
};

/*
 * Definitions for trivial CONFIG_PREEMPT=n-only torture testing.
 * This implementation does not necessarily work well with CPU hotplug.
 */
#define TASKS_OPS &tasks_ops,

static void synchronize_rcu_trivial(void)
{
	int cpu;
#else // #ifdef CONFIG_TASKS_RCU

	for_each_online_cpu(cpu) {
		rcutorture_sched_setaffinity(current->pid, cpumask_of(cpu));
		WARN_ON_ONCE(raw_smp_processor_id() != cpu);
	}
}
#define TASKS_OPS

static int rcu_torture_read_lock_trivial(void) __acquires(RCU)
{
	preempt_disable();
	return 0;
}
#endif // #else #ifdef CONFIG_TASKS_RCU

static void rcu_torture_read_unlock_trivial(int idx) __releases(RCU)
{
	preempt_enable();
}

static struct rcu_torture_ops trivial_ops = {
	.ttype = RCU_TRIVIAL_FLAVOR,
	.init = rcu_sync_torture_init,
	.readlock = rcu_torture_read_lock_trivial,
	.read_delay = rcu_read_delay, /* just reuse rcu's version. */
	.readunlock = rcu_torture_read_unlock_trivial,
	.readlock_held = torture_readlock_not_held,
	.get_gp_seq = rcu_no_completed,
	.sync = synchronize_rcu_trivial,
	.exp_sync = synchronize_rcu_trivial,
	.fqs = NULL,
	.stats = NULL,
	.irq_capable = 1,
	.name = "trivial"
};
#ifdef CONFIG_TASKS_RUDE_RCU

/*
 * Definitions for rude RCU-tasks torture testing.
@@ -851,6 +864,17 @@ static struct rcu_torture_ops tasks_rude_ops = {
	.name = "tasks-rude"
};

#define TASKS_RUDE_OPS &tasks_rude_ops,

#else // #ifdef CONFIG_TASKS_RUDE_RCU

#define TASKS_RUDE_OPS

#endif // #else #ifdef CONFIG_TASKS_RUDE_RCU


#ifdef CONFIG_TASKS_TRACE_RCU

/*
 * Definitions for tracing RCU-tasks torture testing.
 */
@@ -893,6 +917,15 @@ static struct rcu_torture_ops tasks_tracing_ops = {
	.name = "tasks-tracing"
};

#define TASKS_TRACING_OPS &tasks_tracing_ops,

#else // #ifdef CONFIG_TASKS_TRACE_RCU

#define TASKS_TRACING_OPS

#endif // #else #ifdef CONFIG_TASKS_TRACE_RCU


static unsigned long rcutorture_seq_diff(unsigned long new, unsigned long old)
{
	if (!cur_ops->gp_diff)
@@ -1178,7 +1211,7 @@ rcu_torture_writer(void *arg)
			 " GP expediting controlled from boot/sysfs for %s.\n",
			 torture_type, cur_ops->name);
	if (WARN_ONCE(nsynctypes == 0,
		      "rcu_torture_writer: No update-side primitives.\n")) {
		      "%s: No update-side primitives.\n", __func__)) {
		/*
		 * No update-side primitives, so don't try updating.
		 * The resulting test won't be testing much, hence the
@@ -1186,6 +1219,7 @@ rcu_torture_writer(void *arg)
		 */
		rcu_torture_writer_state = RTWS_STOPPING;
		torture_kthread_stopping("rcu_torture_writer");
		return 0;
	}

	do {
@@ -1322,6 +1356,17 @@ rcu_torture_fakewriter(void *arg)
	VERBOSE_TOROUT_STRING("rcu_torture_fakewriter task started");
	set_user_nice(current, MAX_NICE);

	if (WARN_ONCE(nsynctypes == 0,
		      "%s: No update-side primitives.\n", __func__)) {
		/*
		 * No update-side primitives, so don't try updating.
		 * The resulting test won't be testing much, hence the
		 * above WARN_ONCE().
		 */
		torture_kthread_stopping("rcu_torture_fakewriter");
		return 0;
	}

	do {
		torture_hrtimeout_jiffies(torture_random(&rand) % 10, &rand);
		if (cur_ops->cb_barrier != NULL &&
@@ -2916,10 +2961,12 @@ rcu_torture_cleanup(void)
			pr_info("%s: Invoking %pS().\n", __func__, cur_ops->cb_barrier);
			cur_ops->cb_barrier();
		}
		rcu_gp_slow_unregister(NULL);
		return;
	}
	if (!cur_ops) {
		torture_cleanup_end();
		rcu_gp_slow_unregister(NULL);
		return;
	}

@@ -3016,6 +3063,7 @@ rcu_torture_cleanup(void)
	else
		rcu_torture_print_module_parms(cur_ops, "End of test: SUCCESS");
	torture_cleanup_end();
	rcu_gp_slow_unregister(&rcu_fwd_cb_nodelay);
}

#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
@@ -3096,9 +3144,9 @@ rcu_torture_init(void)
	int flags = 0;
	unsigned long gp_seq = 0;
	static struct rcu_torture_ops *torture_ops[] = {
		&rcu_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops,
		&busted_srcud_ops, &tasks_ops, &tasks_rude_ops,
		&tasks_tracing_ops, &trivial_ops,
		&rcu_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops, &busted_srcud_ops,
		TASKS_OPS TASKS_RUDE_OPS TASKS_TRACING_OPS
		&trivial_ops,
	};

	if (!torture_init_begin(torture_type, verbose))
@@ -3320,6 +3368,7 @@ rcu_torture_init(void)
	if (object_debug)
		rcu_test_debug_objects();
	torture_init_end();
	rcu_gp_slow_register(&rcu_fwd_cb_nodelay);
	return 0;

unwind:

@@ -207,6 +207,8 @@ static struct ref_scale_ops srcu_ops = {
	.name = "srcu"
};

#ifdef CONFIG_TASKS_RCU

// Definitions for RCU Tasks ref scale testing: Empty read markers.
// These definitions also work for RCU Rude readers.
static void rcu_tasks_ref_scale_read_section(const int nloops)
@@ -232,6 +234,16 @@ static struct ref_scale_ops rcu_tasks_ops = {
	.name = "rcu-tasks"
};

#define RCU_TASKS_OPS &rcu_tasks_ops,

#else // #ifdef CONFIG_TASKS_RCU

#define RCU_TASKS_OPS

#endif // #else // #ifdef CONFIG_TASKS_RCU

#ifdef CONFIG_TASKS_TRACE_RCU

// Definitions for RCU Tasks Trace ref scale testing.
static void rcu_trace_ref_scale_read_section(const int nloops)
{
@@ -261,6 +273,14 @@ static struct ref_scale_ops rcu_trace_ops = {
	.name = "rcu-trace"
};

#define RCU_TRACE_OPS &rcu_trace_ops,

#else // #ifdef CONFIG_TASKS_TRACE_RCU

#define RCU_TRACE_OPS

#endif // #else // #ifdef CONFIG_TASKS_TRACE_RCU

// Definitions for reference count
static atomic_t refcnt;

@@ -790,7 +810,7 @@ ref_scale_init(void)
	long i;
	int firsterr = 0;
	static struct ref_scale_ops *scale_ops[] = {
		&rcu_ops, &srcu_ops, &rcu_trace_ops, &rcu_tasks_ops, &refcnt_ops, &rwlock_ops,
		&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
		&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops,
	};

@@ -24,6 +24,7 @@
#include <linux/smp.h>
#include <linux/delay.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/srcu.h>

#include "rcu.h"
@@ -38,6 +39,35 @@ module_param(exp_holdoff, ulong, 0444);
static ulong counter_wrap_check = (ULONG_MAX >> 2);
module_param(counter_wrap_check, ulong, 0444);

/*
 * Control conversion to SRCU_SIZE_BIG:
 *    0: Don't convert at all.
 *    1: Convert at init_srcu_struct() time.
 *    2: Convert when rcutorture invokes srcu_torture_stats_print().
 *    3: Decide at boot time based on system shape (default).
 * 0x1x: Convert when excessive contention encountered.
 */
#define SRCU_SIZING_NONE 0
#define SRCU_SIZING_INIT 1
#define SRCU_SIZING_TORTURE 2
#define SRCU_SIZING_AUTO 3
#define SRCU_SIZING_CONTEND 0x10
#define SRCU_SIZING_IS(x) ((convert_to_big & ~SRCU_SIZING_CONTEND) == x)
#define SRCU_SIZING_IS_NONE() (SRCU_SIZING_IS(SRCU_SIZING_NONE))
#define SRCU_SIZING_IS_INIT() (SRCU_SIZING_IS(SRCU_SIZING_INIT))
#define SRCU_SIZING_IS_TORTURE() (SRCU_SIZING_IS(SRCU_SIZING_TORTURE))
#define SRCU_SIZING_IS_CONTEND() (convert_to_big & SRCU_SIZING_CONTEND)
static int convert_to_big = SRCU_SIZING_AUTO;
module_param(convert_to_big, int, 0444);
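
To make the encoding of `convert_to_big` concrete: the low bits select one of the four base policies and the 0x10 bit independently enables contention-triggered conversion, so `0x13` reads as "auto, plus convert on contention". The user-space sketch below mirrors the macros above but takes the value as a parameter instead of reading a module parameter; the `sketch_`/`SKETCH_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Same numeric values as the SRCU_SIZING_* definitions in the diff. */
#define SKETCH_SIZING_NONE	0
#define SKETCH_SIZING_INIT	1
#define SKETCH_SIZING_TORTURE	2
#define SKETCH_SIZING_AUTO	3
#define SKETCH_SIZING_CONTEND	0x10

/* Compare the base policy, ignoring the contention flag. */
static bool sketch_sizing_is(int convert_to_big, int policy)
{
	assert(policy >= SKETCH_SIZING_NONE && policy <= SKETCH_SIZING_AUTO);
	return (convert_to_big & ~SKETCH_SIZING_CONTEND) == policy;
}

/* Test only the contention flag, whatever the base policy is. */
static bool sketch_sizing_is_contend(int convert_to_big)
{
	return (convert_to_big & SKETCH_SIZING_CONTEND) != 0;
}
```

For instance, `sketch_sizing_is(0x11, SKETCH_SIZING_INIT)` and `sketch_sizing_is_contend(0x11)` both hold, whereas the default value 3 selects the auto policy with the contention flag clear.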

/* Number of CPUs to trigger init_srcu_struct()-time transition to big. */
static int big_cpu_lim __read_mostly = 128;
module_param(big_cpu_lim, int, 0444);

/* Contention events per jiffy to initiate transition to big. */
static int small_contention_lim __read_mostly = 100;
module_param(small_contention_lim, int, 0444);

/* Early-boot callback-management, so early that no lock is required! */
static LIST_HEAD(srcu_boot_list);
static bool __read_mostly srcu_init_done;
@@ -48,39 +78,90 @@ static void process_srcu(struct work_struct *work);
static void srcu_delay_timer(struct timer_list *t);

/* Wrappers for lock acquisition and release, see raw_spin_lock_rcu_node(). */
#define spin_lock_rcu_node(p) \
do { \
	spin_lock(&ACCESS_PRIVATE(p, lock)); \
	smp_mb__after_unlock_lock(); \
#define spin_lock_rcu_node(p) \
do { \
	spin_lock(&ACCESS_PRIVATE(p, lock)); \
	smp_mb__after_unlock_lock(); \
} while (0)

#define spin_unlock_rcu_node(p) spin_unlock(&ACCESS_PRIVATE(p, lock))

#define spin_lock_irq_rcu_node(p) \
do { \
	spin_lock_irq(&ACCESS_PRIVATE(p, lock)); \
	smp_mb__after_unlock_lock(); \
#define spin_lock_irq_rcu_node(p) \
do { \
	spin_lock_irq(&ACCESS_PRIVATE(p, lock)); \
	smp_mb__after_unlock_lock(); \
} while (0)

#define spin_unlock_irq_rcu_node(p) \
#define spin_unlock_irq_rcu_node(p) \
	spin_unlock_irq(&ACCESS_PRIVATE(p, lock))

#define spin_lock_irqsave_rcu_node(p, flags) \
do { \
	spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
	smp_mb__after_unlock_lock(); \
#define spin_lock_irqsave_rcu_node(p, flags) \
do { \
	spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
	smp_mb__after_unlock_lock(); \
} while (0)

#define spin_unlock_irqrestore_rcu_node(p, flags) \
	spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags) \
#define spin_trylock_irqsave_rcu_node(p, flags) \
({ \
	bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
	\
	if (___locked) \
		smp_mb__after_unlock_lock(); \
	___locked; \
})

#define spin_unlock_irqrestore_rcu_node(p, flags) \
	spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags) \

/*
 * Initialize SRCU combining tree. Note that statically allocated
 * Initialize SRCU per-CPU data. Note that statically allocated
 * srcu_struct structures might already have srcu_read_lock() and
 * srcu_read_unlock() running against them. So if the is_static parameter
 * is set, don't initialize ->srcu_lock_count[] and ->srcu_unlock_count[].
 */
static void init_srcu_struct_nodes(struct srcu_struct *ssp)
static void init_srcu_struct_data(struct srcu_struct *ssp)
{
	int cpu;
	struct srcu_data *sdp;

	/*
	 * Initialize the per-CPU srcu_data array, which feeds into the
	 * leaves of the srcu_node tree.
	 */
	WARN_ON_ONCE(ARRAY_SIZE(sdp->srcu_lock_count) !=
		     ARRAY_SIZE(sdp->srcu_unlock_count));
	for_each_possible_cpu(cpu) {
		sdp = per_cpu_ptr(ssp->sda, cpu);
		spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
		rcu_segcblist_init(&sdp->srcu_cblist);
		sdp->srcu_cblist_invoking = false;
		sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq;
		sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq;
		sdp->mynode = NULL;
		sdp->cpu = cpu;
		INIT_WORK(&sdp->work, srcu_invoke_callbacks);
		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
		sdp->ssp = ssp;
	}
}

/* Invalid seq state, used during snp node initialization */
#define SRCU_SNP_INIT_SEQ 0x2

/*
 * Check whether sequence number corresponding to snp node,
 * is invalid.
 */
static inline bool srcu_invl_snp_seq(unsigned long s)
{
	return rcu_seq_state(s) == SRCU_SNP_INIT_SEQ;
}
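
A small user-space model may help when reading `srcu_invl_snp_seq()`. It assumes that `rcu_seq_state()` returns the low two bits of a sequence number, with the grace-period counter in the bits above; that layout is an assumption of this sketch and is not shown in this diff. All `sketch_`/`SKETCH_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: low two bits hold the state, the rest is the counter. */
#define SKETCH_SEQ_CTR_SHIFT	2
#define SKETCH_SEQ_STATE_MASK	((1UL << SKETCH_SEQ_CTR_SHIFT) - 1)

/* Same value as SRCU_SNP_INIT_SEQ in the diff. */
#define SKETCH_SNP_INIT_SEQ	0x2UL

/* Model of rcu_seq_state(): extract the state bits. */
static unsigned long sketch_seq_state(unsigned long s)
{
	return s & SKETCH_SEQ_STATE_MASK;
}

/* Model of srcu_invl_snp_seq(): is this still the init-time marker? */
static bool sketch_invl_snp_seq(unsigned long s)
{
	return sketch_seq_state(s) == SKETCH_SNP_INIT_SEQ;
}
```

Under that assumption, a freshly initialized node field holding 0x2 is reported as invalid, while a value such as 0x4 (counter 1, state 0) is treated as a real sequence number.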

/*
 * Allocate and initialize SRCU combining tree. Returns @true if
 * allocation succeeded and @false otherwise.
 */
static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
{
	int cpu;
	int i;
@@ -92,6 +173,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp)

	/* Initialize geometry if it has not already been initialized. */
	rcu_init_geometry();
	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
	if (!ssp->node)
		return false;

	/* Work out the overall tree geometry. */
	ssp->level[0] = &ssp->node[0];
@@ -105,10 +189,10 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp)
		WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) !=
			     ARRAY_SIZE(snp->srcu_data_have_cbs));
		for (i = 0; i < ARRAY_SIZE(snp->srcu_have_cbs); i++) {
			snp->srcu_have_cbs[i] = 0;
			snp->srcu_have_cbs[i] = SRCU_SNP_INIT_SEQ;
			snp->srcu_data_have_cbs[i] = 0;
		}
		snp->srcu_gp_seq_needed_exp = 0;
		snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
		snp->grplo = -1;
		snp->grphi = -1;
		if (snp == &ssp->node[0]) {
@@ -129,39 +213,31 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp)
	 * Initialize the per-CPU srcu_data array, which feeds into the
	 * leaves of the srcu_node tree.
	 */
	WARN_ON_ONCE(ARRAY_SIZE(sdp->srcu_lock_count) !=
		     ARRAY_SIZE(sdp->srcu_unlock_count));
	level = rcu_num_lvls - 1;
	snp_first = ssp->level[level];
	for_each_possible_cpu(cpu) {
		sdp = per_cpu_ptr(ssp->sda, cpu);
		spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
		rcu_segcblist_init(&sdp->srcu_cblist);
		sdp->srcu_cblist_invoking = false;
		sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq;
		sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq;
		sdp->mynode = &snp_first[cpu / levelspread[level]];
		for (snp = sdp->mynode; snp != NULL; snp = snp->srcu_parent) {
			if (snp->grplo < 0)
				snp->grplo = cpu;
			snp->grphi = cpu;
		}
		sdp->cpu = cpu;
		INIT_WORK(&sdp->work, srcu_invoke_callbacks);
		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
		sdp->ssp = ssp;
		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
	}
	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
	return true;
}

/*
 * Initialize non-compile-time initialized fields, including the
 * associated srcu_node and srcu_data structures. The is_static
 * parameter is passed through to init_srcu_struct_nodes(), and
 * also tells us that ->sda has already been wired up to srcu_data.
 * associated srcu_node and srcu_data structures. The is_static parameter
 * tells us that ->sda has already been wired up to srcu_data.
 */
static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
{
	ssp->srcu_size_state = SRCU_SIZE_SMALL;
	ssp->node = NULL;
	mutex_init(&ssp->srcu_cb_mutex);
	mutex_init(&ssp->srcu_gp_mutex);
	ssp->srcu_idx = 0;
@@ -170,13 +246,25 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
|
||||
mutex_init(&ssp->srcu_barrier_mutex);
|
||||
atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
|
||||
INIT_DELAYED_WORK(&ssp->work, process_srcu);
|
||||
ssp->sda_is_static = is_static;
|
||||
if (!is_static)
|
||||
ssp->sda = alloc_percpu(struct srcu_data);
|
||||
if (!ssp->sda)
|
||||
return -ENOMEM;
|
||||
init_srcu_struct_nodes(ssp);
|
||||
init_srcu_struct_data(ssp);
|
||||
ssp->srcu_gp_seq_needed_exp = 0;
|
||||
ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
|
||||
if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
|
||||
if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
|
||||
if (!ssp->sda_is_static) {
|
||||
free_percpu(ssp->sda);
|
||||
ssp->sda = NULL;
|
||||
return -ENOMEM;
|
||||
}
|
||||
} else {
|
||||
WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_BIG);
|
||||
}
|
||||
}
|
||||
smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
|
||||
return 0;
|
||||
}
|
||||
@@ -213,6 +301,86 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
|
||||
|
||||
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
|
||||
|
||||
/*
 * Initiate a transition to SRCU_SIZE_BIG with lock held.
 */
static void __srcu_transition_to_big(struct srcu_struct *ssp)
{
	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_ALLOC);
}

/*
 * Initiate an idempotent transition to SRCU_SIZE_BIG.
 */
static void srcu_transition_to_big(struct srcu_struct *ssp)
{
	unsigned long flags;

	/* Double-checked locking on ->srcu_size_state. */
	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL)
		return;
	spin_lock_irqsave_rcu_node(ssp, flags);
	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL) {
		spin_unlock_irqrestore_rcu_node(ssp, flags);
		return;
	}
	__srcu_transition_to_big(ssp);
	spin_unlock_irqrestore_rcu_node(ssp, flags);
}
|
||||
|
||||
/*
|
||||
* Check to see if the just-encountered contention event justifies
|
||||
* a transition to SRCU_SIZE_BIG.
|
||||
*/
|
||||
static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
|
||||
{
|
||||
unsigned long j;
|
||||
|
||||
if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_size_state)
|
||||
return;
|
||||
j = jiffies;
|
||||
if (ssp->srcu_size_jiffies != j) {
|
||||
ssp->srcu_size_jiffies = j;
|
||||
ssp->srcu_n_lock_retries = 0;
|
||||
}
|
||||
if (++ssp->srcu_n_lock_retries <= small_contention_lim)
|
||||
return;
|
||||
__srcu_transition_to_big(ssp);
|
||||
}
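
The contention check above counts failed trylock attempts within a single jiffy and requests the transition once the count exceeds a limit. A minimal user-space sketch of that counting rule follows. The struct, the function name, and the explicit "now" and "limit" parameters are hypothetical stand-ins for ->srcu_size_jiffies, ->srcu_n_lock_retries, jiffies, and small_contention_lim; unlike the kernel function, the sketch returns a flag instead of calling __srcu_transition_to_big().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the srcu_struct fields used by the check. */
struct contention_sketch {
	unsigned long window;	/* jiffy in which retries are being counted */
	unsigned long retries;	/* contention events seen during that jiffy */
	bool want_big;		/* set once a transition has been requested */
};

/*
 * Record one contention event at time "now".  Returns true once more
 * than "limit" events have landed within a single jiffy.
 */
static bool sketch_check_contention(struct contention_sketch *c,
				    unsigned long now, unsigned long limit)
{
	assert(c != NULL);
	if (c->want_big)
		return true;	/* Transition already requested. */
	if (c->window != now) {	/* New jiffy: restart the count. */
		c->window = now;
		c->retries = 0;
	}
	if (++c->retries <= limit)
		return false;
	c->want_big = true;
	return true;
}
```

Because the count restarts whenever the jiffy changes, only a burst of contention inside one jiffy triggers the request; the same number of events spread over several jiffies does not.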
|
||||
|
||||
/*
 * Acquire the specified srcu_data structure's ->lock, but check for
 * excessive contention, which results in initiation of a transition
 * to SRCU_SIZE_BIG. But only if the srcutree.convert_to_big module
 * parameter permits this.
 */
static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned long *flags)
{
	struct srcu_struct *ssp = sdp->ssp;

	if (spin_trylock_irqsave_rcu_node(sdp, *flags))
		return;
	spin_lock_irqsave_rcu_node(ssp, *flags);
	spin_lock_irqsave_check_contention(ssp);
	spin_unlock_irqrestore_rcu_node(ssp, *flags);
	spin_lock_irqsave_rcu_node(sdp, *flags);
}

/*
 * Acquire the specified srcu_struct structure's ->lock, but check for
 * excessive contention, which results in initiation of a transition
 * to SRCU_SIZE_BIG. But only if the srcutree.convert_to_big module
 * parameter permits this.
 */
static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
{
	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
		return;
	spin_lock_irqsave_rcu_node(ssp, *flags);
	spin_lock_irqsave_check_contention(ssp);
}
|
||||
|
||||
/*
|
||||
* First-use initialization of statically allocated srcu_struct
|
||||
* structure. Wiring up the combining tree is more than can be
|
||||
@@ -343,7 +511,10 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
|
||||
return sum;
|
||||
}
|
||||
|
||||
#define SRCU_INTERVAL 1
|
||||
#define SRCU_INTERVAL 1 // Base delay if no expedited GPs pending.
|
||||
#define SRCU_MAX_INTERVAL 10 // Maximum incremental delay from slow readers.
|
||||
#define SRCU_MAX_NODELAY_PHASE 1 // Maximum per-GP-phase consecutive no-delay instances.
|
||||
#define SRCU_MAX_NODELAY 100 // Maximum consecutive no-delay instances.
|
||||
|
||||
/*
|
||||
* Return grace-period delay, zero if there are expedited grace
|
||||
@@ -351,10 +522,18 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
|
||||
*/
|
||||
static unsigned long srcu_get_delay(struct srcu_struct *ssp)
|
||||
{
|
||||
if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq),
|
||||
READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
|
||||
return 0;
|
||||
return SRCU_INTERVAL;
|
||||
unsigned long jbase = SRCU_INTERVAL;
|
||||
|
||||
if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
|
||||
jbase = 0;
|
||||
if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)))
|
||||
jbase += jiffies - READ_ONCE(ssp->srcu_gp_start);
|
||||
if (!jbase) {
|
||||
WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
|
||||
if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
|
||||
jbase = 1;
|
||||
}
|
||||
return jbase > SRCU_MAX_INTERVAL ? SRCU_MAX_INTERVAL : jbase;
|
||||
}
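
The new srcu_get_delay() logic can be easier to follow with the kernel accessors stripped away. The sketch below is a user-space approximation, not the kernel code: the struct and its boolean fields are hypothetical summaries of the ->srcu_gp_seq comparisons, "now" stands in for jiffies, and the three constants copy the values of the #defines added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values copied from the #defines introduced by this patch. */
#define SKETCH_INTERVAL			1UL
#define SKETCH_MAX_INTERVAL		10UL
#define SKETCH_MAX_NODELAY_PHASE	1UL

/* Hypothetical summary of the srcu_struct state the function reads. */
struct delay_sketch {
	bool exp_pending;		/* An expedited GP is still needed. */
	bool gp_in_progress;		/* rcu_seq_state() is nonzero. */
	unsigned long gp_start;		/* Time at which the GP started. */
	unsigned long n_exp_nodelay;	/* Consecutive zero-delay results. */
};

static unsigned long sketch_get_delay(struct delay_sketch *s, unsigned long now)
{
	unsigned long jbase = SKETCH_INTERVAL;

	assert(s != NULL);
	if (s->exp_pending)
		jbase = 0;			/* Expedited: no base delay. */
	if (s->gp_in_progress)
		jbase += now - s->gp_start;	/* Slow readers add delay. */
	if (!jbase) {
		/* Limit how many zero-delay results are handed out in a row. */
		if (++s->n_exp_nodelay > SKETCH_MAX_NODELAY_PHASE)
			jbase = 1;
	}
	return jbase > SKETCH_MAX_INTERVAL ? SKETCH_MAX_INTERVAL : jbase;
}
```

In this sketch a pending expedited grace period gets one zero-delay result, after which the delay rises to one tick, and a grace period that has been running for a long time is capped at SKETCH_MAX_INTERVAL.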
|
||||
|
||||
/**
|
||||
@@ -382,13 +561,20 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
|
||||
return; /* Forgot srcu_barrier(), so just leak it! */
|
||||
}
|
||||
if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
|
||||
WARN_ON(rcu_seq_current(&ssp->srcu_gp_seq) != ssp->srcu_gp_seq_needed) ||
|
||||
WARN_ON(srcu_readers_active(ssp))) {
|
||||
pr_info("%s: Active srcu_struct %p state: %d\n",
|
||||
__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)));
|
||||
pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
|
||||
__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)),
|
||||
rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
|
||||
return; /* Caller forgot to stop doing call_srcu()? */
|
||||
}
|
||||
free_percpu(ssp->sda);
|
||||
ssp->sda = NULL;
|
||||
if (!ssp->sda_is_static) {
|
||||
free_percpu(ssp->sda);
|
||||
ssp->sda = NULL;
|
||||
}
|
||||
kfree(ssp->node);
|
||||
ssp->node = NULL;
|
||||
ssp->srcu_size_state = SRCU_SIZE_SMALL;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
|
||||
|
||||
@@ -434,9 +620,13 @@ EXPORT_SYMBOL_GPL(__srcu_read_unlock);
|
||||
*/
|
||||
static void srcu_gp_start(struct srcu_struct *ssp)
|
||||
{
|
||||
struct srcu_data *sdp = this_cpu_ptr(ssp->sda);
|
||||
struct srcu_data *sdp;
|
||||
int state;
|
||||
|
||||
if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
|
||||
sdp = per_cpu_ptr(ssp->sda, 0);
|
||||
else
|
||||
sdp = this_cpu_ptr(ssp->sda);
|
||||
lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
|
||||
WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
|
||||
spin_lock_rcu_node(sdp); /* Interrupts already disabled. */
|
||||
@@ -445,6 +635,8 @@ static void srcu_gp_start(struct srcu_struct *ssp)
|
||||
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
|
||||
rcu_seq_snap(&ssp->srcu_gp_seq));
|
||||
spin_unlock_rcu_node(sdp); /* Interrupts remain disabled. */
|
||||
WRITE_ONCE(ssp->srcu_gp_start, jiffies);
|
||||
WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
|
||||
smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
|
||||
rcu_seq_start(&ssp->srcu_gp_seq);
|
||||
state = rcu_seq_state(ssp->srcu_gp_seq);
|
||||
@@ -517,7 +709,9 @@ static void srcu_gp_end(struct srcu_struct *ssp)
|
||||
int idx;
|
||||
unsigned long mask;
|
||||
struct srcu_data *sdp;
|
||||
unsigned long sgsne;
|
||||
struct srcu_node *snp;
|
||||
int ss_state;
|
||||
|
||||
/* Prevent more than one additional grace period. */
|
||||
mutex_lock(&ssp->srcu_cb_mutex);
|
||||
@@ -526,7 +720,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
|
||||
spin_lock_irq_rcu_node(ssp);
|
||||
idx = rcu_seq_state(ssp->srcu_gp_seq);
|
||||
WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
|
||||
cbdelay = srcu_get_delay(ssp);
|
||||
cbdelay = !!srcu_get_delay(ssp);
|
||||
WRITE_ONCE(ssp->srcu_last_gp_end, ktime_get_mono_fast_ns());
|
||||
rcu_seq_end(&ssp->srcu_gp_seq);
|
||||
gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
|
||||
@@ -537,38 +731,45 @@ static void srcu_gp_end(struct srcu_struct *ssp)
|
||||
/* A new grace period can start at this point. But only one. */
|
||||
|
||||
/* Initiate callback invocation as needed. */
|
||||
idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
|
||||
srcu_for_each_node_breadth_first(ssp, snp) {
|
||||
spin_lock_irq_rcu_node(snp);
|
||||
cbs = false;
|
||||
last_lvl = snp >= ssp->level[rcu_num_lvls - 1];
|
||||
if (last_lvl)
|
||||
cbs = snp->srcu_have_cbs[idx] == gpseq;
|
||||
snp->srcu_have_cbs[idx] = gpseq;
|
||||
rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1);
|
||||
if (ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, gpseq))
|
||||
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq);
|
||||
mask = snp->srcu_data_have_cbs[idx];
|
||||
snp->srcu_data_have_cbs[idx] = 0;
|
||||
spin_unlock_irq_rcu_node(snp);
|
||||
if (cbs)
|
||||
srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay);
|
||||
|
||||
/* Occasionally prevent srcu_data counter wrap. */
|
||||
if (!(gpseq & counter_wrap_check) && last_lvl)
|
||||
for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
|
||||
sdp = per_cpu_ptr(ssp->sda, cpu);
|
||||
spin_lock_irqsave_rcu_node(sdp, flags);
|
||||
if (ULONG_CMP_GE(gpseq,
|
||||
sdp->srcu_gp_seq_needed + 100))
|
||||
sdp->srcu_gp_seq_needed = gpseq;
|
||||
if (ULONG_CMP_GE(gpseq,
|
||||
sdp->srcu_gp_seq_needed_exp + 100))
|
||||
sdp->srcu_gp_seq_needed_exp = gpseq;
|
||||
spin_unlock_irqrestore_rcu_node(sdp, flags);
|
||||
}
|
||||
ss_state = smp_load_acquire(&ssp->srcu_size_state);
|
||||
if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
|
||||
srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, 0), cbdelay);
|
||||
} else {
|
||||
idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
|
||||
srcu_for_each_node_breadth_first(ssp, snp) {
|
||||
spin_lock_irq_rcu_node(snp);
|
||||
cbs = false;
|
||||
last_lvl = snp >= ssp->level[rcu_num_lvls - 1];
|
||||
if (last_lvl)
|
||||
cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq;
|
||||
snp->srcu_have_cbs[idx] = gpseq;
|
||||
rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1);
|
||||
sgsne = snp->srcu_gp_seq_needed_exp;
|
||||
if (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, gpseq))
|
||||
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq);
|
||||
if (ss_state < SRCU_SIZE_BIG)
|
||||
mask = ~0;
|
||||
else
|
||||
mask = snp->srcu_data_have_cbs[idx];
|
||||
snp->srcu_data_have_cbs[idx] = 0;
|
||||
spin_unlock_irq_rcu_node(snp);
|
||||
if (cbs)
|
||||
srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay);
|
||||
}
|
||||
}
|
||||
|
||||
/* Occasionally prevent srcu_data counter wrap. */
|
||||
if (!(gpseq & counter_wrap_check))
|
||||
for_each_possible_cpu(cpu) {
|
||||
sdp = per_cpu_ptr(ssp->sda, cpu);
|
||||
spin_lock_irqsave_rcu_node(sdp, flags);
|
||||
if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed + 100))
|
||||
sdp->srcu_gp_seq_needed = gpseq;
|
||||
if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed_exp + 100))
|
||||
sdp->srcu_gp_seq_needed_exp = gpseq;
|
||||
spin_unlock_irqrestore_rcu_node(sdp, flags);
|
||||
}
|
||||
|
||||
/* Callback initiation done, allow grace periods after next. */
|
||||
mutex_unlock(&ssp->srcu_cb_mutex);
|
||||
|
||||
@@ -583,6 +784,14 @@ static void srcu_gp_end(struct srcu_struct *ssp)
|
||||
} else {
|
||||
spin_unlock_irq_rcu_node(ssp);
|
||||
}
|
||||
|
||||
/* Transition to big if needed. */
|
||||
if (ss_state != SRCU_SIZE_SMALL && ss_state != SRCU_SIZE_BIG) {
|
||||
if (ss_state == SRCU_SIZE_ALLOC)
|
||||
init_srcu_struct_nodes(ssp, GFP_KERNEL);
|
||||
else
|
||||
smp_store_release(&ssp->srcu_size_state, ss_state + 1);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
@@ -596,20 +805,24 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
|
||||
unsigned long s)
|
||||
{
|
||||
unsigned long flags;
|
||||
unsigned long sgsne;
|
||||
|
||||
for (; snp != NULL; snp = snp->srcu_parent) {
|
||||
if (rcu_seq_done(&ssp->srcu_gp_seq, s) ||
|
||||
ULONG_CMP_GE(READ_ONCE(snp->srcu_gp_seq_needed_exp), s))
|
||||
return;
|
||||
spin_lock_irqsave_rcu_node(snp, flags);
|
||||
if (ULONG_CMP_GE(snp->srcu_gp_seq_needed_exp, s)) {
|
||||
if (snp)
|
||||
for (; snp != NULL; snp = snp->srcu_parent) {
|
||||
sgsne = READ_ONCE(snp->srcu_gp_seq_needed_exp);
|
||||
if (rcu_seq_done(&ssp->srcu_gp_seq, s) ||
|
||||
(!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s)))
|
||||
return;
|
||||
spin_lock_irqsave_rcu_node(snp, flags);
|
||||
sgsne = snp->srcu_gp_seq_needed_exp;
|
||||
if (!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s)) {
|
||||
spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
return;
|
||||
}
|
||||
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
|
||||
spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
return;
|
||||
}
|
||||
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
|
||||
spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
}
|
||||
spin_lock_irqsave_rcu_node(ssp, flags);
|
||||
spin_lock_irqsave_ssp_contention(ssp, &flags);
|
||||
if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
|
||||
WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
|
||||
spin_unlock_irqrestore_rcu_node(ssp, flags);
|
||||
@@ -630,39 +843,47 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
|
||||
{
|
||||
unsigned long flags;
|
||||
int idx = rcu_seq_ctr(s) % ARRAY_SIZE(sdp->mynode->srcu_have_cbs);
|
||||
struct srcu_node *snp = sdp->mynode;
|
||||
unsigned long sgsne;
|
||||
struct srcu_node *snp;
|
||||
struct srcu_node *snp_leaf;
|
||||
unsigned long snp_seq;
|
||||
|
||||
/* Each pass through the loop does one level of the srcu_node tree. */
|
||||
for (; snp != NULL; snp = snp->srcu_parent) {
|
||||
if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != sdp->mynode)
|
||||
return; /* GP already done and CBs recorded. */
|
||||
spin_lock_irqsave_rcu_node(snp, flags);
|
||||
if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
|
||||
/* Ensure that snp node tree is fully initialized before traversing it */
|
||||
if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
|
||||
snp_leaf = NULL;
|
||||
else
|
||||
snp_leaf = sdp->mynode;
|
||||
|
||||
if (snp_leaf)
|
||||
/* Each pass through the loop does one level of the srcu_node tree. */
|
||||
for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
|
||||
if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != snp_leaf)
|
||||
return; /* GP already done and CBs recorded. */
|
||||
spin_lock_irqsave_rcu_node(snp, flags);
|
||||
snp_seq = snp->srcu_have_cbs[idx];
|
||||
if (snp == sdp->mynode && snp_seq == s)
|
||||
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
|
||||
spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
if (snp == sdp->mynode && snp_seq != s) {
|
||||
srcu_schedule_cbs_sdp(sdp, do_norm
|
||||
? SRCU_INTERVAL
|
||||
: 0);
|
||||
if (!srcu_invl_snp_seq(snp_seq) && ULONG_CMP_GE(snp_seq, s)) {
|
||||
if (snp == snp_leaf && snp_seq == s)
|
||||
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
|
||||
spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
if (snp == snp_leaf && snp_seq != s) {
|
||||
srcu_schedule_cbs_sdp(sdp, do_norm ? SRCU_INTERVAL : 0);
|
||||
return;
|
||||
}
|
||||
if (!do_norm)
|
||||
srcu_funnel_exp_start(ssp, snp, s);
|
||||
return;
|
||||
}
|
||||
if (!do_norm)
|
||||
srcu_funnel_exp_start(ssp, snp, s);
|
||||
return;
|
||||
snp->srcu_have_cbs[idx] = s;
|
||||
if (snp == snp_leaf)
|
||||
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
|
||||
sgsne = snp->srcu_gp_seq_needed_exp;
|
||||
if (!do_norm && (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, s)))
|
||||
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
|
||||
spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
}
|
||||
snp->srcu_have_cbs[idx] = s;
|
||||
if (snp == sdp->mynode)
|
||||
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
|
||||
if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
|
||||
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
|
||||
spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
}
|
||||
|
||||
/* Top of tree, must ensure the grace period will be started. */
|
||||
spin_lock_irqsave_rcu_node(ssp, flags);
|
||||
spin_lock_irqsave_ssp_contention(ssp, &flags);
|
||||
if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed, s)) {
|
||||
/*
|
||||
* Record need for grace period s. Pair with load
|
||||
@@ -678,9 +899,15 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
|
||||
rcu_seq_state(ssp->srcu_gp_seq) == SRCU_STATE_IDLE) {
|
||||
WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
|
||||
srcu_gp_start(ssp);
|
||||
|
||||
// And how can that list_add() in the "else" clause
|
||||
// possibly be safe for concurrent execution? Well,
|
||||
// it isn't. And it does not have to be. After all, it
|
||||
// can only be executed during early boot when there is only
|
||||
// the one boot CPU running with interrupts still disabled.
|
||||
if (likely(srcu_init_done))
|
||||
queue_delayed_work(rcu_gp_wq, &ssp->work,
|
||||
srcu_get_delay(ssp));
|
||||
!!srcu_get_delay(ssp));
|
||||
else if (list_empty(&ssp->work.work.entry))
|
||||
list_add(&ssp->work.work.entry, &srcu_boot_list);
|
||||
}
|
||||
@@ -814,11 +1041,17 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
|
||||
bool needgp = false;
|
||||
unsigned long s;
|
||||
struct srcu_data *sdp;
|
||||
struct srcu_node *sdp_mynode;
|
||||
int ss_state;
|
||||
|
||||
check_init_srcu_struct(ssp);
|
||||
idx = srcu_read_lock(ssp);
|
||||
sdp = raw_cpu_ptr(ssp->sda);
|
||||
spin_lock_irqsave_rcu_node(sdp, flags);
|
||||
ss_state = smp_load_acquire(&ssp->srcu_size_state);
|
||||
if (ss_state < SRCU_SIZE_WAIT_CALL)
|
||||
sdp = per_cpu_ptr(ssp->sda, 0);
|
||||
else
|
||||
sdp = raw_cpu_ptr(ssp->sda);
|
||||
spin_lock_irqsave_sdp_contention(sdp, &flags);
|
||||
if (rhp)
|
||||
rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
|
||||
rcu_segcblist_advance(&sdp->srcu_cblist,
|
||||
@@ -834,10 +1067,17 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
|
||||
needexp = true;
|
||||
}
|
||||
spin_unlock_irqrestore_rcu_node(sdp, flags);
|
||||
|
||||
/* Ensure that snp node tree is fully initialized before traversing it */
|
||||
if (ss_state < SRCU_SIZE_WAIT_BARRIER)
|
||||
sdp_mynode = NULL;
|
||||
else
|
||||
sdp_mynode = sdp->mynode;
|
||||
|
||||
if (needgp)
|
||||
srcu_funnel_gp_start(ssp, sdp, s, do_norm);
|
||||
else if (needexp)
|
||||
srcu_funnel_exp_start(ssp, sdp->mynode, s);
|
||||
srcu_funnel_exp_start(ssp, sdp_mynode, s);
|
||||
srcu_read_unlock(ssp, idx);
|
||||
return s;
|
||||
}
|
||||
@@ -1097,6 +1337,28 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
|
||||
complete(&ssp->srcu_barrier_completion);
|
||||
}
|
||||
|
||||
/*
 * Enqueue an srcu_barrier() callback on the specified srcu_data
 * structure's ->cblist, but only if that ->cblist already has at least one
 * callback enqueued. Note that if a CPU already has callbacks enqueued,
 * it must have already registered the need for a future grace period,
 * so all we need do is enqueue a callback that will use the same grace
 * period as the last callback already in the queue.
 */
static void srcu_barrier_one_cpu(struct srcu_struct *ssp, struct srcu_data *sdp)
{
	spin_lock_irq_rcu_node(sdp);
	atomic_inc(&ssp->srcu_barrier_cpu_cnt);
	sdp->srcu_barrier_head.func = srcu_barrier_cb;
	debug_rcu_head_queue(&sdp->srcu_barrier_head);
	if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
				   &sdp->srcu_barrier_head)) {
		debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
		atomic_dec(&ssp->srcu_barrier_cpu_cnt);
	}
	spin_unlock_irq_rcu_node(sdp);
}
|
||||
|
||||
/**
|
||||
* srcu_barrier - Wait until all in-flight call_srcu() callbacks complete.
|
||||
* @ssp: srcu_struct on which to wait for in-flight callbacks.
|
||||
@@ -1104,7 +1366,7 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
|
||||
void srcu_barrier(struct srcu_struct *ssp)
|
||||
{
|
||||
int cpu;
|
||||
struct srcu_data *sdp;
|
||||
int idx;
|
||||
unsigned long s = rcu_seq_snap(&ssp->srcu_barrier_seq);
|
||||
|
||||
check_init_srcu_struct(ssp);
|
||||
@@ -1120,27 +1382,13 @@ void srcu_barrier(struct srcu_struct *ssp)
|
||||
/* Initial count prevents reaching zero until all CBs are posted. */
|
||||
atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
|
||||
|
||||
/*
|
||||
* Each pass through this loop enqueues a callback, but only
|
||||
* on CPUs already having callbacks enqueued. Note that if
|
||||
* a CPU already has callbacks enqueue, it must have already
|
||||
* registered the need for a future grace period, so all we
|
||||
* need do is enqueue a callback that will use the same
|
||||
* grace period as the last callback already in the queue.
|
||||
*/
|
||||
for_each_possible_cpu(cpu) {
|
||||
sdp = per_cpu_ptr(ssp->sda, cpu);
|
||||
spin_lock_irq_rcu_node(sdp);
|
||||
atomic_inc(&ssp->srcu_barrier_cpu_cnt);
|
||||
sdp->srcu_barrier_head.func = srcu_barrier_cb;
|
||||
debug_rcu_head_queue(&sdp->srcu_barrier_head);
|
||||
if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
|
||||
&sdp->srcu_barrier_head)) {
|
||||
debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
|
||||
atomic_dec(&ssp->srcu_barrier_cpu_cnt);
|
||||
}
|
||||
spin_unlock_irq_rcu_node(sdp);
|
||||
}
|
||||
idx = srcu_read_lock(ssp);
|
||||
if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
|
||||
srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, 0));
|
||||
else
|
||||
for_each_possible_cpu(cpu)
|
||||
srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, cpu));
|
||||
srcu_read_unlock(ssp, idx);
|
||||
|
||||
/* Remove the initial count, at which point reaching zero can happen. */
|
||||
if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
|
||||
@@ -1214,6 +1462,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
|
||||
srcu_flip(ssp);
|
||||
spin_lock_irq_rcu_node(ssp);
|
||||
rcu_seq_set_state(&ssp->srcu_gp_seq, SRCU_STATE_SCAN2);
|
||||
ssp->srcu_n_exp_nodelay = 0;
|
||||
spin_unlock_irq_rcu_node(ssp);
|
||||
}
|
||||
|
||||
@@ -1228,6 +1477,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
|
||||
mutex_unlock(&ssp->srcu_gp_mutex);
|
||||
return; /* readers present, retry later. */
|
||||
}
|
||||
ssp->srcu_n_exp_nodelay = 0;
|
||||
srcu_gp_end(ssp); /* Releases ->srcu_gp_mutex. */
|
||||
}
|
||||
}
|
||||
@@ -1318,12 +1568,28 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
|
||||
*/
|
||||
static void process_srcu(struct work_struct *work)
|
||||
{
|
||||
unsigned long curdelay;
|
||||
unsigned long j;
|
||||
struct srcu_struct *ssp;
|
||||
|
||||
ssp = container_of(work, struct srcu_struct, work.work);
|
||||
|
||||
srcu_advance_state(ssp);
|
||||
srcu_reschedule(ssp, srcu_get_delay(ssp));
|
||||
curdelay = srcu_get_delay(ssp);
|
||||
if (curdelay) {
|
||||
WRITE_ONCE(ssp->reschedule_count, 0);
|
||||
} else {
|
||||
j = jiffies;
|
||||
if (READ_ONCE(ssp->reschedule_jiffies) == j) {
|
||||
WRITE_ONCE(ssp->reschedule_count, READ_ONCE(ssp->reschedule_count) + 1);
|
||||
if (READ_ONCE(ssp->reschedule_count) > SRCU_MAX_NODELAY)
|
||||
curdelay = 1;
|
||||
} else {
|
||||
WRITE_ONCE(ssp->reschedule_count, 1);
|
||||
WRITE_ONCE(ssp->reschedule_jiffies, j);
|
||||
}
|
||||
}
|
||||
srcu_reschedule(ssp, curdelay);
|
||||
}
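
The rescheduling throttle added to process_srcu() caps how many zero-delay reschedules can happen within one jiffy. A rough user-space sketch of that rule follows. The struct and function names are hypothetical, "now" stands in for jiffies, and the limit copies the value of SRCU_MAX_NODELAY from this patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODELAY 100UL	/* Value of SRCU_MAX_NODELAY in this patch. */

/* Hypothetical stand-in for ->reschedule_jiffies and ->reschedule_count. */
struct resched_sketch {
	unsigned long jiffies_seen;	/* Jiffy of the current zero-delay run. */
	unsigned long count;		/* Zero-delay reschedules in that jiffy. */
};

/* Return the delay to reschedule with, given the delay just computed. */
static unsigned long sketch_throttle(struct resched_sketch *r,
				     unsigned long curdelay, unsigned long now)
{
	assert(r != NULL);
	if (curdelay) {
		r->count = 0;		/* A real delay ends the run. */
		return curdelay;
	}
	if (r->jiffies_seen == now) {
		if (++r->count > SKETCH_MAX_NODELAY)
			curdelay = 1;	/* Too many in one jiffy: back off. */
	} else {
		r->count = 1;		/* First zero-delay pass in this jiffy. */
		r->jiffies_seen = now;
	}
	return curdelay;
}
```

The intent, as far as the patch shows, is to keep a stream of expedited grace periods from monopolizing the workqueue: once the per-jiffy budget is spent, the next pass waits one tick.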
|
||||
|
||||
void srcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
@@ -1337,43 +1603,69 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
|
||||
|
||||
static const char * const srcu_size_state_name[] = {
|
||||
"SRCU_SIZE_SMALL",
|
||||
"SRCU_SIZE_ALLOC",
|
||||
"SRCU_SIZE_WAIT_BARRIER",
|
||||
"SRCU_SIZE_WAIT_CALL",
|
||||
"SRCU_SIZE_WAIT_CBS1",
|
||||
"SRCU_SIZE_WAIT_CBS2",
|
||||
"SRCU_SIZE_WAIT_CBS3",
|
||||
"SRCU_SIZE_WAIT_CBS4",
|
||||
"SRCU_SIZE_BIG",
|
||||
"SRCU_SIZE_???",
|
||||
};
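
The name table ends with a catch-all "SRCU_SIZE_???" entry, and srcu_torture_stats_print() clamps any out-of-range state to that last slot before indexing. The sketch below shows the same clamped lookup in isolation; the sketch_ names and the SKETCH_ARRAY_SIZE macro are hypothetical user-space stand-ins for the kernel's table and ARRAY_SIZE().

```c
#include <stddef.h>

static const char * const sketch_state_name[] = {
	"SRCU_SIZE_SMALL",
	"SRCU_SIZE_ALLOC",
	"SRCU_SIZE_WAIT_BARRIER",
	"SRCU_SIZE_WAIT_CALL",
	"SRCU_SIZE_WAIT_CBS1",
	"SRCU_SIZE_WAIT_CBS2",
	"SRCU_SIZE_WAIT_CBS3",
	"SRCU_SIZE_WAIT_CBS4",
	"SRCU_SIZE_BIG",
	"SRCU_SIZE_???",	/* Catch-all for unknown states. */
};

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Map any state value to a valid index, using the last slot as fallback. */
static int sketch_state_index(int state)
{
	int n = (int)SKETCH_ARRAY_SIZE(sketch_state_name);

	if (state < 0 || state >= n)
		return n - 1;
	return state;
}

static const char *sketch_state_to_name(int state)
{
	return sketch_state_name[sketch_state_index(state)];
}
```

Keeping the fallback string inside the table means the lookup never needs a separate error path, at the cost of one reserved slot at the end.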
|
||||
|
||||
void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
|
||||
{
|
||||
int cpu;
|
||||
int idx;
|
||||
unsigned long s0 = 0, s1 = 0;
|
||||
int ss_state = READ_ONCE(ssp->srcu_size_state);
|
||||
int ss_state_idx = ss_state;
|
||||
|
||||
idx = ssp->srcu_idx & 0x1;
|
||||
pr_alert("%s%s Tree SRCU g%ld per-CPU(idx=%d):",
|
||||
tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), idx);
|
||||
for_each_possible_cpu(cpu) {
|
||||
unsigned long l0, l1;
|
||||
unsigned long u0, u1;
|
||||
long c0, c1;
|
||||
struct srcu_data *sdp;
|
||||
if (ss_state < 0 || ss_state >= ARRAY_SIZE(srcu_size_state_name))
|
||||
ss_state_idx = ARRAY_SIZE(srcu_size_state_name) - 1;
|
||||
pr_alert("%s%s Tree SRCU g%ld state %d (%s)",
|
||||
tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), ss_state,
|
||||
srcu_size_state_name[ss_state_idx]);
|
||||
if (!ssp->sda) {
|
||||
// Called after cleanup_srcu_struct(), perhaps.
|
||||
pr_cont(" No per-CPU srcu_data structures (->sda == NULL).\n");
|
||||
} else {
|
||||
pr_cont(" per-CPU(idx=%d):", idx);
|
||||
for_each_possible_cpu(cpu) {
|
||||
unsigned long l0, l1;
|
||||
unsigned long u0, u1;
|
||||
long c0, c1;
|
||||
struct srcu_data *sdp;
|
||||
|
||||
sdp = per_cpu_ptr(ssp->sda, cpu);
|
||||
u0 = data_race(sdp->srcu_unlock_count[!idx]);
|
||||
u1 = data_race(sdp->srcu_unlock_count[idx]);
|
||||
sdp = per_cpu_ptr(ssp->sda, cpu);
|
||||
u0 = data_race(sdp->srcu_unlock_count[!idx]);
|
||||
u1 = data_race(sdp->srcu_unlock_count[idx]);
|
||||
|
||||
/*
|
||||
* Make sure that a lock is always counted if the corresponding
|
||||
* unlock is counted.
|
||||
*/
|
||||
smp_rmb();
|
||||
/*
|
||||
* Make sure that a lock is always counted if the corresponding
|
||||
* unlock is counted.
|
||||
*/
|
||||
smp_rmb();
|
||||
|
||||
l0 = data_race(sdp->srcu_lock_count[!idx]);
|
||||
l1 = data_race(sdp->srcu_lock_count[idx]);
|
||||
l0 = data_race(sdp->srcu_lock_count[!idx]);
|
||||
l1 = data_race(sdp->srcu_lock_count[idx]);
|
||||
|
||||
c0 = l0 - u0;
|
||||
c1 = l1 - u1;
|
||||
pr_cont(" %d(%ld,%ld %c)",
|
||||
cpu, c0, c1,
|
||||
"C."[rcu_segcblist_empty(&sdp->srcu_cblist)]);
|
||||
s0 += c0;
|
||||
s1 += c1;
|
||||
c0 = l0 - u0;
|
||||
c1 = l1 - u1;
|
||||
pr_cont(" %d(%ld,%ld %c)",
|
||||
cpu, c0, c1,
|
||||
"C."[rcu_segcblist_empty(&sdp->srcu_cblist)]);
|
||||
s0 += c0;
|
||||
s1 += c1;
|
||||
}
|
||||
pr_cont(" T(%ld,%ld)\n", s0, s1);
|
||||
}
|
||||
pr_cont(" T(%ld,%ld)\n", s0, s1);
|
||||
if (SRCU_SIZING_IS_TORTURE())
|
||||
srcu_transition_to_big(ssp);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(srcu_torture_stats_print);
|
||||
|
||||
@@ -1390,6 +1682,17 @@ void __init srcu_init(void)
|
||||
{
|
||||
struct srcu_struct *ssp;
|
||||
|
||||
/* Decide on srcu_struct-size strategy. */
|
||||
if (SRCU_SIZING_IS(SRCU_SIZING_AUTO)) {
|
||||
if (nr_cpu_ids >= big_cpu_lim) {
|
||||
convert_to_big = SRCU_SIZING_INIT; // Don't bother waiting for contention.
|
||||
pr_info("%s: Setting srcu_struct sizes to big.\n", __func__);
|
||||
} else {
|
||||
convert_to_big = SRCU_SIZING_NONE | SRCU_SIZING_CONTEND;
|
||||
pr_info("%s: Setting srcu_struct sizes based on contention.\n", __func__);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Once that is set, call_srcu() can follow the normal path and
|
||||
* queue delayed work. This must follow RCU workqueues creation
|
||||
@@ -1400,6 +1703,8 @@ void __init srcu_init(void)
|
||||
ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
|
||||
work.work.entry);
|
||||
list_del_init(&ssp->work.work.entry);
|
||||
if (SRCU_SIZING_IS(SRCU_SIZING_INIT) && ssp->srcu_size_state == SRCU_SIZE_SMALL)
|
||||
ssp->srcu_size_state = SRCU_SIZE_ALLOC;
|
||||
queue_work(rcu_gp_wq, &ssp->work.work);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -111,7 +111,7 @@ static void rcu_sync_func(struct rcu_head *rhp)
|
||||
* a slowpath during the update. After this function returns, all
|
||||
* subsequent calls to rcu_sync_is_idle() will return false, which
|
||||
* tells readers to stay off their fastpaths. A later call to
|
||||
* rcu_sync_exit() re-enables reader slowpaths.
|
||||
* rcu_sync_exit() re-enables reader fastpaths.
|
||||
*
|
||||
* When called in isolation, rcu_sync_enter() must wait for a grace
|
||||
* period, however, closely spaced calls to rcu_sync_enter() can
|
||||
|
||||
@@ -46,7 +46,7 @@ struct rcu_tasks_percpu {
|
||||
|
||||
/**
|
||||
* struct rcu_tasks - Definition for a Tasks-RCU-like mechanism.
|
||||
* @cbs_wq: Wait queue allowing new callback to get kthread's attention.
|
||||
* @cbs_wait: RCU wait allowing a new callback to get kthread's attention.
|
||||
* @cbs_gbl_lock: Lock protecting callback list.
|
||||
* @kthread_ptr: This flavor's grace-period/callback-invocation kthread.
|
||||
* @gp_func: This flavor's grace-period-wait function.
|
||||
@@ -77,7 +77,7 @@ struct rcu_tasks_percpu {
|
||||
* @kname: This flavor's kthread name.
|
||||
*/
|
||||
struct rcu_tasks {
|
||||
struct wait_queue_head cbs_wq;
|
||||
struct rcuwait cbs_wait;
|
||||
raw_spinlock_t cbs_gbl_lock;
|
||||
int gp_state;
|
||||
int gp_sleep;
|
||||
@@ -113,11 +113,11 @@ static void call_rcu_tasks_iw_wakeup(struct irq_work *iwp);
|
||||
#define DEFINE_RCU_TASKS(rt_name, gp, call, n) \
|
||||
static DEFINE_PER_CPU(struct rcu_tasks_percpu, rt_name ## __percpu) = { \
|
||||
.lock = __RAW_SPIN_LOCK_UNLOCKED(rt_name ## __percpu.cbs_pcpu_lock), \
|
||||
.rtp_irq_work = IRQ_WORK_INIT(call_rcu_tasks_iw_wakeup), \
|
||||
.rtp_irq_work = IRQ_WORK_INIT_HARD(call_rcu_tasks_iw_wakeup), \
|
||||
}; \
|
||||
static struct rcu_tasks rt_name = \
|
||||
{ \
|
||||
.cbs_wq = __WAIT_QUEUE_HEAD_INITIALIZER(rt_name.cbs_wq), \
|
||||
.cbs_wait = __RCUWAIT_INITIALIZER(rt_name.wait), \
|
||||
.cbs_gbl_lock = __RAW_SPIN_LOCK_UNLOCKED(rt_name.cbs_gbl_lock), \
|
||||
.gp_func = gp, \
|
||||
.call_func = call, \
|
||||
@@ -143,6 +143,11 @@ module_param(rcu_task_ipi_delay, int, 0644);
#define RCU_TASK_STALL_TIMEOUT (HZ * 60 * 10)
static int rcu_task_stall_timeout __read_mostly = RCU_TASK_STALL_TIMEOUT;
module_param(rcu_task_stall_timeout, int, 0644);
#define RCU_TASK_STALL_INFO (HZ * 10)
static int rcu_task_stall_info __read_mostly = RCU_TASK_STALL_INFO;
module_param(rcu_task_stall_info, int, 0644);
static int rcu_task_stall_info_mult __read_mostly = 3;
module_param(rcu_task_stall_info_mult, int, 0444);

static int rcu_task_enqueue_lim __read_mostly = -1;
module_param(rcu_task_enqueue_lim, int, 0444);
@@ -261,14 +266,16 @@ static void call_rcu_tasks_iw_wakeup(struct irq_work *iwp)
struct rcu_tasks_percpu *rtpcp = container_of(iwp, struct rcu_tasks_percpu, rtp_irq_work);

rtp = rtpcp->rtpp;
wake_up(&rtp->cbs_wq);
rcuwait_wake_up(&rtp->cbs_wait);
}

// Enqueue a callback for the specified flavor of Tasks RCU.
static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
struct rcu_tasks *rtp)
{
int chosen_cpu;
unsigned long flags;
int ideal_cpu;
unsigned long j;
bool needadjust = false;
bool needwake;
@@ -278,8 +285,9 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
rhp->func = func;
local_irq_save(flags);
rcu_read_lock();
rtpcp = per_cpu_ptr(rtp->rtpcpu,
smp_processor_id() >> READ_ONCE(rtp->percpu_enqueue_shift));
ideal_cpu = smp_processor_id() >> READ_ONCE(rtp->percpu_enqueue_shift);
chosen_cpu = cpumask_next(ideal_cpu - 1, cpu_possible_mask);
rtpcp = per_cpu_ptr(rtp->rtpcpu, chosen_cpu);
if (!raw_spin_trylock_rcu_node(rtpcp)) { // irqs already disabled.
raw_spin_lock_rcu_node(rtpcp); // irqs already disabled.
j = jiffies;
@@ -460,7 +468,7 @@ static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu
}
}

if (rcu_segcblist_empty(&rtpcp->cblist))
if (rcu_segcblist_empty(&rtpcp->cblist) || !cpu_possible(cpu))
return;
raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
rcu_segcblist_advance(&rtpcp->cblist, rcu_seq_current(&rtp->tasks_gp_seq));
@@ -509,7 +517,9 @@ static int __noreturn rcu_tasks_kthread(void *arg)
set_tasks_gp_state(rtp, RTGS_WAIT_CBS);

/* If there were none, wait a bit and start over. */
wait_event_idle(rtp->cbs_wq, (needgpcb = rcu_tasks_need_gpcb(rtp)));
rcuwait_wait_event(&rtp->cbs_wait,
(needgpcb = rcu_tasks_need_gpcb(rtp)),
TASK_IDLE);

if (needgpcb & 0x2) {
// Wait for one grace period.
@@ -548,8 +558,15 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp)
static void __init rcu_tasks_bootup_oddness(void)
{
#if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
int rtsimc;

if (rcu_task_stall_timeout != RCU_TASK_STALL_TIMEOUT)
pr_info("\tTasks-RCU CPU stall warnings timeout set to %d (rcu_task_stall_timeout).\n", rcu_task_stall_timeout);
rtsimc = clamp(rcu_task_stall_info_mult, 1, 10);
if (rtsimc != rcu_task_stall_info_mult) {
pr_info("\tTasks-RCU CPU stall info multiplier clamped to %d (rcu_task_stall_info_mult).\n", rtsimc);
rcu_task_stall_info_mult = rtsimc;
}
#endif /* #ifdef CONFIG_TASKS_RCU */
#ifdef CONFIG_TASKS_RCU
pr_info("\tTrampoline variant of Tasks RCU enabled.\n");
@@ -568,7 +585,17 @@ static void __init rcu_tasks_bootup_oddness(void)
/* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */
static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
{
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, 0); // for_each...
int cpu;
bool havecbs = false;

for_each_possible_cpu(cpu) {
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);

if (!data_race(rcu_segcblist_empty(&rtpcp->cblist))) {
havecbs = true;
break;
}
}
pr_info("%s: %s(%d) since %lu g:%lu i:%lu/%lu %c%c %s\n",
rtp->kname,
tasks_gp_state_getname(rtp), data_race(rtp->gp_state),
@@ -576,7 +603,7 @@ static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
data_race(rcu_seq_current(&rtp->tasks_gp_seq)),
data_race(rtp->n_ipis_fails), data_race(rtp->n_ipis),
".k"[!!data_race(rtp->kthread_ptr)],
".C"[!data_race(rcu_segcblist_empty(&rtpcp->cblist))],
".C"[havecbs],
s);
}
#endif // #ifndef CONFIG_TINY_RCU
@@ -592,10 +619,15 @@ static void exit_tasks_rcu_finish_trace(struct task_struct *t);
/* Wait for one RCU-tasks grace period. */
static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
{
struct task_struct *g, *t;
unsigned long lastreport;
LIST_HEAD(holdouts);
struct task_struct *g;
int fract;
LIST_HEAD(holdouts);
unsigned long j;
unsigned long lastinfo;
unsigned long lastreport;
bool reported = false;
int rtsi;
struct task_struct *t;

set_tasks_gp_state(rtp, RTGS_PRE_WAIT_GP);
rtp->pregp_func();
@@ -621,30 +653,50 @@ static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
* is empty, we are done.
*/
lastreport = jiffies;
lastinfo = lastreport;
rtsi = READ_ONCE(rcu_task_stall_info);

// Start off with initial wait and slowly back off to 1 HZ wait.
fract = rtp->init_fract;

while (!list_empty(&holdouts)) {
ktime_t exp;
bool firstreport;
bool needreport;
int rtst;

/* Slowly back off waiting for holdouts */
// Slowly back off waiting for holdouts
set_tasks_gp_state(rtp, RTGS_WAIT_SCAN_HOLDOUTS);
schedule_timeout_idle(fract);
if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
schedule_timeout_idle(fract);
} else {
exp = jiffies_to_nsecs(fract);
__set_current_state(TASK_IDLE);
schedule_hrtimeout_range(&exp, jiffies_to_nsecs(HZ / 2), HRTIMER_MODE_REL_HARD);
}

if (fract < HZ)
fract++;

rtst = READ_ONCE(rcu_task_stall_timeout);
needreport = rtst > 0 && time_after(jiffies, lastreport + rtst);
if (needreport)
if (needreport) {
lastreport = jiffies;
reported = true;
}
firstreport = true;
WARN_ON(signal_pending(current));
set_tasks_gp_state(rtp, RTGS_SCAN_HOLDOUTS);
rtp->holdouts_func(&holdouts, needreport, &firstreport);

// Print pre-stall informational messages if needed.
j = jiffies;
if (rtsi > 0 && !reported && time_after(j, lastinfo + rtsi)) {
lastinfo = j;
rtsi = rtsi * rcu_task_stall_info_mult;
pr_info("%s: %s grace period %lu is %lu jiffies old.\n",
__func__, rtp->kname, rtp->tasks_gp_seq, j - rtp->gp_start);
}
}

set_tasks_gp_state(rtp, RTGS_POST_GP);
@@ -950,6 +1002,9 @@ static void rcu_tasks_be_rude(struct work_struct *work)
// Wait for one rude RCU-tasks grace period.
static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
{
if (num_online_cpus() <= 1)
return; // Fastpath for only one CPU.

rtp->n_ipis += cpumask_weight(cpu_online_mask);
schedule_on_each_cpu(rcu_tasks_be_rude);
}

@@ -1679,6 +1679,8 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
rdp->gp_seq = rnp->gp_seq; /* Remember new grace-period state. */
if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || rdp->gpwrap)
WRITE_ONCE(rdp->gp_seq_needed, rnp->gp_seq_needed);
if (IS_ENABLED(CONFIG_PROVE_RCU) && READ_ONCE(rdp->gpwrap))
WRITE_ONCE(rdp->last_sched_clock, jiffies);
WRITE_ONCE(rdp->gpwrap, false);
rcu_gpnum_ovf(rnp, rdp);
return ret;
@@ -1705,11 +1707,37 @@ static void note_gp_changes(struct rcu_data *rdp)
rcu_gp_kthread_wake();
}

static atomic_t *rcu_gp_slow_suppress;

/* Register a counter to suppress debugging grace-period delays. */
void rcu_gp_slow_register(atomic_t *rgssp)
{
WARN_ON_ONCE(rcu_gp_slow_suppress);

WRITE_ONCE(rcu_gp_slow_suppress, rgssp);
}
EXPORT_SYMBOL_GPL(rcu_gp_slow_register);

/* Unregister a counter, with NULL for not caring which. */
void rcu_gp_slow_unregister(atomic_t *rgssp)
{
WARN_ON_ONCE(rgssp && rgssp != rcu_gp_slow_suppress);

WRITE_ONCE(rcu_gp_slow_suppress, NULL);
}
EXPORT_SYMBOL_GPL(rcu_gp_slow_unregister);

static bool rcu_gp_slow_is_suppressed(void)
{
atomic_t *rgssp = READ_ONCE(rcu_gp_slow_suppress);

return rgssp && atomic_read(rgssp);
}

static void rcu_gp_slow(int delay)
{
if (delay > 0 &&
!(rcu_seq_ctr(rcu_state.gp_seq) %
(rcu_num_nodes * PER_RCU_NODE_PERIOD * delay)))
if (!rcu_gp_slow_is_suppressed() && delay > 0 &&
!(rcu_seq_ctr(rcu_state.gp_seq) % (rcu_num_nodes * PER_RCU_NODE_PERIOD * delay)))
schedule_timeout_idle(delay);
}

@@ -2096,14 +2124,29 @@ static noinline void rcu_gp_cleanup(void)
/* Advance CBs to reduce false positives below. */
offloaded = rcu_rdp_is_offloaded(rdp);
if ((offloaded || !rcu_accelerate_cbs(rnp, rdp)) && needgp) {

// We get here if a grace period was needed (“needgp”)
// and the above call to rcu_accelerate_cbs() did not set
// the RCU_GP_FLAG_INIT bit in ->gp_state (which records
// the need for another grace period). The purpose
// of the “offloaded” check is to avoid invoking
// rcu_accelerate_cbs() on an offloaded CPU because we do not
// hold the ->nocb_lock needed to safely access an offloaded
// ->cblist. We do not want to acquire that lock because
// it can be heavily contended during callback floods.

WRITE_ONCE(rcu_state.gp_flags, RCU_GP_FLAG_INIT);
WRITE_ONCE(rcu_state.gp_req_activity, jiffies);
trace_rcu_grace_period(rcu_state.name,
rcu_state.gp_seq,
TPS("newreq"));
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("newreq"));
} else {
WRITE_ONCE(rcu_state.gp_flags,
rcu_state.gp_flags & RCU_GP_FLAG_INIT);

// We get here either if there is no need for an
// additional grace period or if rcu_accelerate_cbs() has
// already set the RCU_GP_FLAG_INIT bit in ->gp_flags.
// So all we need to do is to clear all of the other
// ->gp_flags bits.

WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags & RCU_GP_FLAG_INIT);
}
raw_spin_unlock_irq_rcu_node(rnp);

@@ -2609,6 +2652,13 @@ static void rcu_do_batch(struct rcu_data *rdp)
*/
void rcu_sched_clock_irq(int user)
{
unsigned long j;

if (IS_ENABLED(CONFIG_PROVE_RCU)) {
j = jiffies;
WARN_ON_ONCE(time_before(j, __this_cpu_read(rcu_data.last_sched_clock)));
__this_cpu_write(rcu_data.last_sched_clock, j);
}
trace_rcu_utilization(TPS("Start scheduler-tick"));
lockdep_assert_irqs_disabled();
raw_cpu_inc(rcu_data.ticks_this_gp);
@@ -2624,6 +2674,8 @@ void rcu_sched_clock_irq(int user)
rcu_flavor_sched_clock_irq(user);
if (rcu_pending(user))
invoke_rcu_core();
if (user)
rcu_tasks_classic_qs(current, false);
lockdep_assert_irqs_disabled();

trace_rcu_utilization(TPS("End scheduler-tick"));
@@ -3717,7 +3769,9 @@ static int rcu_blocking_is_gp(void)
{
int ret;

if (IS_ENABLED(CONFIG_PREEMPTION))
// Invoking preempt_model_*() too early gets a splat.
if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE ||
preempt_model_full() || preempt_model_rt())
return rcu_scheduler_active == RCU_SCHEDULER_INACTIVE;
might_sleep(); /* Check for RCU read-side critical section. */
preempt_disable();
@@ -4179,6 +4233,7 @@ rcu_boot_init_percpu_data(int cpu)
rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED;
rdp->rcu_onl_gp_seq = rcu_state.gp_seq;
rdp->rcu_onl_gp_flags = RCU_GP_CLEANED;
rdp->last_sched_clock = jiffies;
rdp->cpu = cpu;
rcu_boot_init_nocb_percpu_data(rdp);
}
@@ -4471,6 +4526,51 @@ static int rcu_pm_notify(struct notifier_block *self,
return NOTIFY_OK;
}

#ifdef CONFIG_RCU_EXP_KTHREAD
struct kthread_worker *rcu_exp_gp_kworker;
struct kthread_worker *rcu_exp_par_gp_kworker;

static void __init rcu_start_exp_gp_kworkers(void)
{
const char *par_gp_kworker_name = "rcu_exp_par_gp_kthread_worker";
const char *gp_kworker_name = "rcu_exp_gp_kthread_worker";
struct sched_param param = { .sched_priority = kthread_prio };

rcu_exp_gp_kworker = kthread_create_worker(0, gp_kworker_name);
if (IS_ERR_OR_NULL(rcu_exp_gp_kworker)) {
pr_err("Failed to create %s!\n", gp_kworker_name);
return;
}

rcu_exp_par_gp_kworker = kthread_create_worker(0, par_gp_kworker_name);
if (IS_ERR_OR_NULL(rcu_exp_par_gp_kworker)) {
pr_err("Failed to create %s!\n", par_gp_kworker_name);
kthread_destroy_worker(rcu_exp_gp_kworker);
return;
}

sched_setscheduler_nocheck(rcu_exp_gp_kworker->task, SCHED_FIFO, &param);
sched_setscheduler_nocheck(rcu_exp_par_gp_kworker->task, SCHED_FIFO,
&param);
}

static inline void rcu_alloc_par_gp_wq(void)
{
}
#else /* !CONFIG_RCU_EXP_KTHREAD */
struct workqueue_struct *rcu_par_gp_wq;

static void __init rcu_start_exp_gp_kworkers(void)
{
}

static inline void rcu_alloc_par_gp_wq(void)
{
rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0);
WARN_ON(!rcu_par_gp_wq);
}
#endif /* CONFIG_RCU_EXP_KTHREAD */

/*
* Spawn the kthreads that handle RCU's grace periods.
*/
@@ -4480,6 +4580,7 @@ static int __init rcu_spawn_gp_kthread(void)
struct rcu_node *rnp;
struct sched_param sp;
struct task_struct *t;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);

rcu_scheduler_fully_active = 1;
t = kthread_create(rcu_gp_kthread, NULL, "%s", rcu_state.name);
@@ -4497,9 +4598,17 @@ static int __init rcu_spawn_gp_kthread(void)
smp_store_release(&rcu_state.gp_kthread, t); /* ^^^ */
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
wake_up_process(t);
rcu_spawn_nocb_kthreads();
rcu_spawn_boost_kthreads();
/* This is a pre-SMP initcall, we expect a single CPU */
WARN_ON(num_online_cpus() > 1);
/*
* Those kthreads couldn't be created on rcu_init() -> rcutree_prepare_cpu()
* due to rcu_scheduler_fully_active.
*/
rcu_spawn_cpu_nocb_kthread(smp_processor_id());
rcu_spawn_one_boost_kthread(rdp->mynode);
rcu_spawn_core_kthreads();
/* Create kthread worker for expedited GPs */
rcu_start_exp_gp_kworkers();
return 0;
}
early_initcall(rcu_spawn_gp_kthread);
@@ -4745,7 +4854,6 @@ static void __init rcu_dump_rcu_node_tree(void)
}

struct workqueue_struct *rcu_gp_wq;
struct workqueue_struct *rcu_par_gp_wq;

static void __init kfree_rcu_batch_init(void)
{
@@ -4782,7 +4890,7 @@ static void __init kfree_rcu_batch_init(void)

void __init rcu_init(void)
{
int cpu;
int cpu = smp_processor_id();

rcu_early_boot_tests();

@@ -4802,17 +4910,15 @@ void __init rcu_init(void)
* or the scheduler are operational.
*/
pm_notifier(rcu_pm_notify, 0);
for_each_online_cpu(cpu) {
rcutree_prepare_cpu(cpu);
rcu_cpu_starting(cpu);
rcutree_online_cpu(cpu);
}
WARN_ON(num_online_cpus() > 1); // Only one CPU this early in boot.
rcutree_prepare_cpu(cpu);
rcu_cpu_starting(cpu);
rcutree_online_cpu(cpu);

/* Create workqueue for Tree SRCU and for expedited GPs. */
rcu_gp_wq = alloc_workqueue("rcu_gp", WQ_MEM_RECLAIM, 0);
WARN_ON(!rcu_gp_wq);
rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0);
WARN_ON(!rcu_par_gp_wq);
rcu_alloc_par_gp_wq();

/* Fill in default value for rcutree.qovld boot parameter. */
/* -After- the rcu_node ->lock fields are initialized! */

@@ -10,6 +10,7 @@
*/

#include <linux/cache.h>
#include <linux/kthread.h>
#include <linux/spinlock.h>
#include <linux/rtmutex.h>
#include <linux/threads.h>
@@ -23,7 +24,11 @@
/* Communicate arguments to a workqueue handler. */
struct rcu_exp_work {
unsigned long rew_s;
#ifdef CONFIG_RCU_EXP_KTHREAD
struct kthread_work rew_work;
#else
struct work_struct rew_work;
#endif /* CONFIG_RCU_EXP_KTHREAD */
};

/* RCU's kthread states for tracing. */
@@ -254,6 +259,7 @@ struct rcu_data {
unsigned long rcu_onl_gp_seq; /* ->gp_seq at last online. */
short rcu_onl_gp_flags; /* ->gp_flags at last online. */
unsigned long last_fqs_resched; /* Time of last rcu_resched(). */
unsigned long last_sched_clock; /* Jiffies of last rcu_sched_clock_irq(). */

int cpu;
};
@@ -364,6 +370,7 @@ struct rcu_state {
arch_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp;
/* Synchronize offline with */
/* GP pre-initialization. */
int nocb_is_setup; /* nocb is setup from boot */
};

/* Values for rcu_state structure's gp_flags field. */
@@ -421,7 +428,6 @@ static void rcu_preempt_boost_start_gp(struct rcu_node *rnp);
static bool rcu_is_callbacks_kthread(void);
static void rcu_cpu_kthread_setup(unsigned int cpu);
static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp);
static void __init rcu_spawn_boost_kthreads(void);
static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
static void rcu_preempt_deferred_qs(struct task_struct *t);
@@ -439,7 +445,6 @@ static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level);
static bool do_nocb_deferred_wakeup(struct rcu_data *rdp);
static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp);
static void rcu_spawn_cpu_nocb_kthread(int cpu);
static void __init rcu_spawn_nocb_kthreads(void);
static void show_rcu_nocb_state(struct rcu_data *rdp);
static void rcu_nocb_lock(struct rcu_data *rdp);
static void rcu_nocb_unlock(struct rcu_data *rdp);

@@ -334,15 +334,13 @@ fastpath:
* Select the CPUs within the specified rcu_node that the upcoming
* expedited grace period needs to wait for.
*/
static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
{
int cpu;
unsigned long flags;
unsigned long mask_ofl_test;
unsigned long mask_ofl_ipi;
int ret;
struct rcu_exp_work *rewp =
container_of(wp, struct rcu_exp_work, rew_work);
struct rcu_node *rnp = container_of(rewp, struct rcu_node, rew);

raw_spin_lock_irqsave_rcu_node(rnp, flags);
@@ -417,13 +415,119 @@ retry_ipi:
rcu_report_exp_cpu_mult(rnp, mask_ofl_test, false);
}

static void rcu_exp_sel_wait_wake(unsigned long s);

#ifdef CONFIG_RCU_EXP_KTHREAD
static void sync_rcu_exp_select_node_cpus(struct kthread_work *wp)
{
struct rcu_exp_work *rewp =
container_of(wp, struct rcu_exp_work, rew_work);

__sync_rcu_exp_select_node_cpus(rewp);
}

static inline bool rcu_gp_par_worker_started(void)
{
return !!READ_ONCE(rcu_exp_par_gp_kworker);
}

static inline void sync_rcu_exp_select_cpus_queue_work(struct rcu_node *rnp)
{
kthread_init_work(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
/*
* Use rcu_exp_par_gp_kworker, because flushing a work item from
* another work item on the same kthread worker can result in
* deadlock.
*/
kthread_queue_work(rcu_exp_par_gp_kworker, &rnp->rew.rew_work);
}

static inline void sync_rcu_exp_select_cpus_flush_work(struct rcu_node *rnp)
{
kthread_flush_work(&rnp->rew.rew_work);
}

/*
* Work-queue handler to drive an expedited grace period forward.
*/
static void wait_rcu_exp_gp(struct kthread_work *wp)
{
struct rcu_exp_work *rewp;

rewp = container_of(wp, struct rcu_exp_work, rew_work);
rcu_exp_sel_wait_wake(rewp->rew_s);
}

static inline void synchronize_rcu_expedited_queue_work(struct rcu_exp_work *rew)
{
kthread_init_work(&rew->rew_work, wait_rcu_exp_gp);
kthread_queue_work(rcu_exp_gp_kworker, &rew->rew_work);
}

static inline void synchronize_rcu_expedited_destroy_work(struct rcu_exp_work *rew)
{
}
#else /* !CONFIG_RCU_EXP_KTHREAD */
static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
{
struct rcu_exp_work *rewp =
container_of(wp, struct rcu_exp_work, rew_work);

__sync_rcu_exp_select_node_cpus(rewp);
}

static inline bool rcu_gp_par_worker_started(void)
{
return !!READ_ONCE(rcu_par_gp_wq);
}

static inline void sync_rcu_exp_select_cpus_queue_work(struct rcu_node *rnp)
{
int cpu = find_next_bit(&rnp->ffmask, BITS_PER_LONG, -1);

INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
/* If all offline, queue the work on an unbound CPU. */
if (unlikely(cpu > rnp->grphi - rnp->grplo))
cpu = WORK_CPU_UNBOUND;
else
cpu += rnp->grplo;
queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
}

static inline void sync_rcu_exp_select_cpus_flush_work(struct rcu_node *rnp)
{
flush_work(&rnp->rew.rew_work);
}

/*
* Work-queue handler to drive an expedited grace period forward.
*/
static void wait_rcu_exp_gp(struct work_struct *wp)
{
struct rcu_exp_work *rewp;

rewp = container_of(wp, struct rcu_exp_work, rew_work);
rcu_exp_sel_wait_wake(rewp->rew_s);
}

static inline void synchronize_rcu_expedited_queue_work(struct rcu_exp_work *rew)
{
INIT_WORK_ONSTACK(&rew->rew_work, wait_rcu_exp_gp);
queue_work(rcu_gp_wq, &rew->rew_work);
}

static inline void synchronize_rcu_expedited_destroy_work(struct rcu_exp_work *rew)
{
destroy_work_on_stack(&rew->rew_work);
}
#endif /* CONFIG_RCU_EXP_KTHREAD */

/*
* Select the nodes that the upcoming expedited grace period needs
* to wait for.
*/
static void sync_rcu_exp_select_cpus(void)
{
int cpu;
struct rcu_node *rnp;

trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("reset"));
@@ -435,28 +539,21 @@ static void sync_rcu_exp_select_cpus(void)
rnp->exp_need_flush = false;
if (!READ_ONCE(rnp->expmask))
continue; /* Avoid early boot non-existent wq. */
if (!READ_ONCE(rcu_par_gp_wq) ||
if (!rcu_gp_par_worker_started() ||
rcu_scheduler_active != RCU_SCHEDULER_RUNNING ||
rcu_is_last_leaf_node(rnp)) {
/* No workqueues yet or last leaf, do direct call. */
/* No worker started yet or last leaf, do direct call. */
sync_rcu_exp_select_node_cpus(&rnp->rew.rew_work);
continue;
}
INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
cpu = find_next_bit(&rnp->ffmask, BITS_PER_LONG, -1);
/* If all offline, queue the work on an unbound CPU. */
if (unlikely(cpu > rnp->grphi - rnp->grplo))
cpu = WORK_CPU_UNBOUND;
else
cpu += rnp->grplo;
queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
sync_rcu_exp_select_cpus_queue_work(rnp);
rnp->exp_need_flush = true;
}

/* Wait for workqueue jobs (if any) to complete. */
/* Wait for jobs (if any) to complete. */
rcu_for_each_leaf_node(rnp)
if (rnp->exp_need_flush)
flush_work(&rnp->rew.rew_work);
sync_rcu_exp_select_cpus_flush_work(rnp);
}

/*
@@ -496,7 +593,7 @@ static void synchronize_rcu_expedited_wait(void)
struct rcu_node *rnp_root = rcu_get_root();

trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait"));
jiffies_stall = rcu_jiffies_till_stall_check();
jiffies_stall = rcu_exp_jiffies_till_stall_check();
jiffies_start = jiffies;
if (tick_nohz_full_enabled() && rcu_inkernel_boot_has_ended()) {
if (synchronize_rcu_expedited_wait_once(1))
@@ -571,7 +668,7 @@ static void synchronize_rcu_expedited_wait(void)
dump_cpu_task(cpu);
}
}
jiffies_stall = 3 * rcu_jiffies_till_stall_check() + 3;
jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
}
}

@@ -622,17 +719,6 @@ static void rcu_exp_sel_wait_wake(unsigned long s)
rcu_exp_wait_wake(s);
}

/*
* Work-queue handler to drive an expedited grace period forward.
*/
static void wait_rcu_exp_gp(struct work_struct *wp)
{
struct rcu_exp_work *rewp;

rewp = container_of(wp, struct rcu_exp_work, rew_work);
rcu_exp_sel_wait_wake(rewp->rew_s);
}

#ifdef CONFIG_PREEMPT_RCU

/*
@@ -848,20 +934,19 @@ void synchronize_rcu_expedited(void)
} else {
/* Marshall arguments & schedule the expedited grace period. */
rew.rew_s = s;
INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp);
queue_work(rcu_gp_wq, &rew.rew_work);
synchronize_rcu_expedited_queue_work(&rew);
}

/* Wait for expedited grace period to complete. */
rnp = rcu_get_root();
wait_event(rnp->exp_wq[rcu_seq_ctr(s) & 0x3],
sync_exp_work_done(s));
smp_mb(); /* Workqueue actions happen before return. */
smp_mb(); /* Work actions happen before return. */

/* Let the next expedited grace period start. */
mutex_unlock(&rcu_state.exp_mutex);

if (likely(!boottime))
destroy_work_on_stack(&rew.rew_work);
synchronize_rcu_expedited_destroy_work(&rew);
}
EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);

@@ -60,9 +60,6 @@ static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
* Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters.
* If the list is invalid, a warning is emitted and all CPUs are offloaded.
*/

static bool rcu_nocb_is_setup;

static int __init rcu_nocb_setup(char *str)
{
alloc_bootmem_cpumask_var(&rcu_nocb_mask);
@@ -72,7 +69,7 @@ static int __init rcu_nocb_setup(char *str)
cpumask_setall(rcu_nocb_mask);
}
}
rcu_nocb_is_setup = true;
rcu_state.nocb_is_setup = true;
return 1;
}
__setup("rcu_nocbs", rcu_nocb_setup);
@@ -215,14 +212,6 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
init_swait_queue_head(&rnp->nocb_gp_wq[1]);
}

/* Is the specified CPU a no-CBs CPU? */
bool rcu_is_nocb_cpu(int cpu)
{
if (cpumask_available(rcu_nocb_mask))
return cpumask_test_cpu(cpu, rcu_nocb_mask);
return false;
}

static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
struct rcu_data *rdp,
bool force, unsigned long flags)
@@ -1180,10 +1169,10 @@ void __init rcu_init_nohz(void)
return;
}
}
rcu_nocb_is_setup = true;
rcu_state.nocb_is_setup = true;
}

if (!rcu_nocb_is_setup)
if (!rcu_state.nocb_is_setup)
return;

#if defined(CONFIG_NO_HZ_FULL)
@@ -1241,7 +1230,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
struct task_struct *t;
struct sched_param sp;

if (!rcu_scheduler_fully_active || !rcu_nocb_is_setup)
if (!rcu_scheduler_fully_active || !rcu_state.nocb_is_setup)
return;

/* If there already is an rcuo kthread, then nothing to do. */
@@ -1277,22 +1266,6 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
}

/*
* Once the scheduler is running, spawn rcuo kthreads for all online
* no-CBs CPUs. This assumes that the early_initcall()s happen before
* non-boot CPUs come online -- if this changes, we will need to add
* some mutual exclusion.
*/
static void __init rcu_spawn_nocb_kthreads(void)
{
int cpu;

if (rcu_nocb_is_setup) {
for_each_online_cpu(cpu)
rcu_spawn_cpu_nocb_kthread(cpu);
}
}

/* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */
static int rcu_nocb_gp_stride = -1;
module_param(rcu_nocb_gp_stride, int, 0444);
@@ -1549,10 +1522,6 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
{
}

static void __init rcu_spawn_nocb_kthreads(void)
{
}

static void show_rcu_nocb_state(struct rcu_data *rdp)
{
}

@@ -486,6 +486,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
t->rcu_read_unlock_special.s = 0;
if (special.b.need_qs) {
if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)) {
rdp->cpu_no_qs.b.norm = false;
rcu_report_qs_rdp(rdp);
udelay(rcu_unlock_delay);
} else {
@@ -660,7 +661,13 @@ static void rcu_read_unlock_special(struct task_struct *t)
expboost && !rdp->defer_qs_iw_pending && cpu_online(rdp->cpu)) {
// Get scheduler to re-evaluate and call hooks.
// If !IRQ_WORK, FQS scan will eventually IPI.
init_irq_work(&rdp->defer_qs_iw, rcu_preempt_deferred_qs_handler);
if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) &&
IS_ENABLED(CONFIG_PREEMPT_RT))
rdp->defer_qs_iw = IRQ_WORK_INIT_HARD(
rcu_preempt_deferred_qs_handler);
else
init_irq_work(&rdp->defer_qs_iw,
rcu_preempt_deferred_qs_handler);
rdp->defer_qs_iw_pending = true;
irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
}
@@ -1124,7 +1131,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
__releases(rnp->lock)
{
raw_lockdep_assert_held_rcu_node(rnp);
if (!rcu_preempt_blocked_readers_cgp(rnp) && rnp->exp_tasks == NULL) {
if (!rnp->boost_kthread_task ||
(!rcu_preempt_blocked_readers_cgp(rnp) && !rnp->exp_tasks)) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
@@ -1226,18 +1234,6 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
free_cpumask_var(cm);
}

/*
* Spawn boost kthreads -- called as soon as the scheduler is running.
*/
static void __init rcu_spawn_boost_kthreads(void)
{
struct rcu_node *rnp;

rcu_for_each_leaf_node(rnp)
if (rcu_rnp_online_cpus(rnp))
rcu_spawn_one_boost_kthread(rnp);
}

#else /* #ifdef CONFIG_RCU_BOOST */

static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
@@ -1263,10 +1259,6 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
|
||||
{
|
||||
}
|
||||
|
||||
static void __init rcu_spawn_boost_kthreads(void)
|
||||
{
|
||||
}
|
||||
|
||||
#endif /* #else #ifdef CONFIG_RCU_BOOST */
|
||||
|
||||
/*
|
||||
|
||||
@@ -25,6 +25,34 @@ int sysctl_max_rcu_stall_to_panic __read_mostly;
|
||||
#define RCU_STALL_MIGHT_DIV 8
|
||||
#define RCU_STALL_MIGHT_MIN (2 * HZ)
|
||||
|
||||
int rcu_exp_jiffies_till_stall_check(void)
|
||||
{
|
||||
int cpu_stall_timeout = READ_ONCE(rcu_exp_cpu_stall_timeout);
|
||||
int exp_stall_delay_delta = 0;
|
||||
int till_stall_check;
|
||||
|
||||
// Zero says to use rcu_cpu_stall_timeout, but in milliseconds.
|
||||
if (!cpu_stall_timeout)
|
||||
cpu_stall_timeout = jiffies_to_msecs(rcu_jiffies_till_stall_check());
|
||||
|
||||
// Limit check must be consistent with the Kconfig limits for
|
||||
// CONFIG_RCU_EXP_CPU_STALL_TIMEOUT, so check the allowed range.
|
||||
// The minimum clamped value is "2UL", because at least one full
|
||||
// tick has to be guaranteed.
|
||||
till_stall_check = clamp(msecs_to_jiffies(cpu_stall_timeout), 2UL, 21UL * HZ);
|
||||
|
||||
if (cpu_stall_timeout && jiffies_to_msecs(till_stall_check) != cpu_stall_timeout)
|
||||
WRITE_ONCE(rcu_exp_cpu_stall_timeout, jiffies_to_msecs(till_stall_check));
|
||||
|
||||
#ifdef CONFIG_PROVE_RCU
|
||||
/* Add extra ~25% out of till_stall_check. */
|
||||
exp_stall_delay_delta = ((till_stall_check * 25) / 100) + 1;
|
||||
#endif
|
||||
|
||||
return till_stall_check + exp_stall_delay_delta;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(rcu_exp_jiffies_till_stall_check);
|
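The clamp arithmetic in rcu_exp_jiffies_till_stall_check() can be checked in userspace with a minimal sketch. Everything prefixed sketch_ is hypothetical and not part of the patch: HZ is passed as a parameter, the millisecond conversion is a simple round-up stand-in for msecs_to_jiffies(), and the zero-timeout fallback to rcu_jiffies_till_stall_check() is left out.

```c
#include <assert.h>

/* Hypothetical stand-in for msecs_to_jiffies(): rounds up to whole jiffies. */
unsigned long sketch_msecs_to_jiffies(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999) / 1000;
}

/* Hypothetical stand-in for the kernel's clamp() macro. */
unsigned long sketch_clamp(unsigned long v, unsigned long lo, unsigned long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Same arithmetic as the patch: clamp to [2 jiffies, 21 seconds], then add
 * roughly 25% when prove_rcu is nonzero (models CONFIG_PROVE_RCU).
 */
unsigned long sketch_exp_till_stall_check(unsigned long timeout_ms,
					  unsigned long hz, int prove_rcu)
{
	unsigned long till = sketch_clamp(sketch_msecs_to_jiffies(timeout_ms, hz),
					  2UL, 21UL * hz);
	unsigned long delta = prove_rcu ? (till * 25) / 100 + 1 : 0;

	return till + delta;
}

/* Usage: a few spot checks with an assumed HZ of 1000. */
void sketch_exp_stall_demo(void)
{
	assert(sketch_exp_till_stall_check(20, 1000, 0) == 20);       /* in range */
	assert(sketch_exp_till_stall_check(1, 1000, 0) == 2);         /* lower clamp */
	assert(sketch_exp_till_stall_check(60000, 1000, 0) == 21000); /* upper clamp */
	assert(sketch_exp_till_stall_check(20, 1000, 1) == 26);       /* 20 + 5 + 1 */
}
```

The lower bound of two jiffies matches the in-code comment that at least one full tick has to be guaranteed.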
||||
|
||||
/* Limit-check stall timeouts specified at boottime and runtime. */
|
||||
int rcu_jiffies_till_stall_check(void)
|
||||
{
|
||||
@@ -565,9 +593,9 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 
 	for_each_possible_cpu(cpu)
 		totqlen += rcu_get_n_cbs_cpu(cpu);
-	pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
+	pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu ncpus=%d)\n",
 	       smp_processor_id(), (long)(jiffies - gps),
-	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
+	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
 	if (ndetected) {
 		rcu_dump_cpu_stacks();
 
@@ -626,9 +654,9 @@ static void print_cpu_stall(unsigned long gps)
 	raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags);
 	for_each_possible_cpu(cpu)
 		totqlen += rcu_get_n_cbs_cpu(cpu);
-	pr_cont("\t(t=%lu jiffies g=%ld q=%lu)\n",
+	pr_cont("\t(t=%lu jiffies g=%ld q=%lu ncpus=%d)\n",
 		jiffies - gps,
-		(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
+		(long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
 
 	rcu_check_gp_kthread_expired_fqs_timer();
 	rcu_check_gp_kthread_starvation();
||||
@@ -506,6 +506,8 @@ EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress);
|
||||
module_param(rcu_cpu_stall_suppress, int, 0644);
|
||||
int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
|
||||
module_param(rcu_cpu_stall_timeout, int, 0644);
|
||||
int rcu_exp_cpu_stall_timeout __read_mostly = CONFIG_RCU_EXP_CPU_STALL_TIMEOUT;
|
||||
module_param(rcu_exp_cpu_stall_timeout, int, 0644);
|
||||
#endif /* #ifdef CONFIG_RCU_STALL_COMMON */
|
||||
|
||||
// Suppress boot-time RCU CPU stall warnings and rcutorture writer stall
|
||||
|
||||
@@ -267,9 +267,10 @@ static void scf_handler(void *scfc_in)
 	}
 	this_cpu_inc(scf_invoked_count);
 	if (longwait <= 0) {
-		if (!(r & 0xffc0))
+		if (!(r & 0xffc0)) {
 			udelay(r & 0x3f);
-		goto out;
+			goto out;
+		}
 	}
 	if (r & 0xfff)
 		goto out;
||||
|
||||
@@ -8488,6 +8488,18 @@ static void __init preempt_dynamic_init(void)
|
||||
}
|
||||
}
|
||||
|
||||
#define PREEMPT_MODEL_ACCESSOR(mode) \
|
||||
bool preempt_model_##mode(void) \
|
||||
{ \
|
||||
WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
|
||||
return preempt_dynamic_mode == preempt_dynamic_##mode; \
|
||||
} \
|
||||
EXPORT_SYMBOL_GPL(preempt_model_##mode)
|
||||
|
||||
PREEMPT_MODEL_ACCESSOR(none);
|
||||
PREEMPT_MODEL_ACCESSOR(voluntary);
|
||||
PREEMPT_MODEL_ACCESSOR(full);
|
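The accessor macro above relies on preprocessor token pasting to stamp out one function per preemption mode. A minimal userspace sketch of the same pattern follows; the sketch_ names and the fixed starting mode are assumptions made for illustration, and the WARN_ON_ONCE() check and symbol export from the patch are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode values, named so that ##mode can be pasted onto them. */
enum {
	sketch_mode_none,
	sketch_mode_voluntary,
	sketch_mode_full,
};

/* Hypothetical current mode; the kernel tracks this in preempt_dynamic_mode. */
int sketch_mode = sketch_mode_voluntary;

/* One invocation generates one accessor: sketch_model_<mode>(). */
#define SKETCH_MODEL_ACCESSOR(mode)				\
	bool sketch_model_##mode(void)				\
	{							\
		return sketch_mode == sketch_mode_##mode;	\
	}

SKETCH_MODEL_ACCESSOR(none)
SKETCH_MODEL_ACCESSOR(voluntary)
SKETCH_MODEL_ACCESSOR(full)

/* Usage: exactly one accessor reports true for the current mode. */
void sketch_model_demo(void)
{
	assert(!sketch_model_none());
	assert(sketch_model_voluntary());
	assert(!sketch_model_full());
}
```

Generating the three functions from one macro keeps them identical apart from the mode name, which is presumably why the patch takes this route instead of writing each accessor by hand.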
||||
|
||||
#else /* !CONFIG_PREEMPT_DYNAMIC */
|
||||
|
||||
static inline void preempt_dynamic_init(void) { }
|
||||
|
||||
@@ -183,7 +183,9 @@ static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
 static DEFINE_PER_CPU(void *, cur_csd_info);
 static DEFINE_PER_CPU(struct cfd_seq_local, cfd_seq_local);
 
-#define CSD_LOCK_TIMEOUT (5ULL * NSEC_PER_SEC)
+static ulong csd_lock_timeout = 5000; /* CSD lock timeout in milliseconds. */
+module_param(csd_lock_timeout, ulong, 0444);
+
 static atomic_t csd_bug_count = ATOMIC_INIT(0);
 static u64 cfd_seq;
 
@@ -329,6 +331,7 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 	u64 ts2, ts_delta;
 	call_single_data_t *cpu_cur_csd;
 	unsigned int flags = READ_ONCE(csd->node.u_flags);
+	unsigned long long csd_lock_timeout_ns = csd_lock_timeout * NSEC_PER_MSEC;
 
 	if (!(flags & CSD_FLAG_LOCK)) {
 		if (!unlikely(*bug_id))
@@ -341,7 +344,7 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 
 	ts2 = sched_clock();
 	ts_delta = ts2 - *ts1;
-	if (likely(ts_delta <= CSD_LOCK_TIMEOUT))
+	if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0))
 		return false;
 
 	firsttime = !*bug_id;
||||
|
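The smp.c hunks replace the fixed five-second CSD_LOCK_TIMEOUT with a millisecond module parameter, where a value of zero turns the timeout check off. A minimal sketch of that comparison, using hypothetical sketch_ names and a locally defined nanoseconds-per-millisecond constant, might look like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the kernel's NSEC_PER_MSEC. */
#define SKETCH_NSEC_PER_MSEC 1000000ULL

/*
 * Returns true while the wait is still considered acceptable, mirroring the
 * "ts_delta <= timeout || timeout == 0" test in the patch.
 */
bool sketch_csd_wait_ok(unsigned long long waited_ns, unsigned long timeout_ms)
{
	unsigned long long timeout_ns = timeout_ms * SKETCH_NSEC_PER_MSEC;

	return waited_ns <= timeout_ns || timeout_ns == 0;
}

/* Usage: the default of 5000 ms behaves like the old 5-second constant. */
void sketch_csd_demo(void)
{
	assert(sketch_csd_wait_ok(4999999999ULL, 5000));  /* just under 5 s */
	assert(!sketch_csd_wait_ok(5000000001ULL, 5000)); /* just over 5 s */
	assert(sketch_csd_wait_ok(1ULL << 62, 0));        /* zero disables the check */
}
```

Because the parameter is registered with mode 0444, it is readable but not writable through sysfs, so the value would have to be chosen at boot time.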
||||
@@ -144,6 +144,7 @@ config TRACING
|
||||
select BINARY_PRINTF
|
||||
select EVENT_TRACING
|
||||
select TRACE_CLOCK
|
||||
select TASKS_RCU if PREEMPTION
|
||||
|
||||
config GENERIC_TRACER
|
||||
bool
|
||||
|
||||
@@ -24,6 +24,7 @@ help:
|
||||
@echo ' intel-speed-select - Intel Speed Select tool'
|
||||
@echo ' kvm_stat - top-like utility for displaying kvm statistics'
|
||||
@echo ' leds - LEDs tools'
|
||||
@echo ' nolibc - nolibc headers testing and installation'
|
||||
@echo ' objtool - an ELF object analysis tool'
|
||||
@echo ' pci - PCI tools'
|
||||
@echo ' perf - Linux performance measurement and analysis tool'
|
||||
@@ -74,6 +75,9 @@ bpf/%: FORCE
|
||||
libapi: FORCE
|
||||
$(call descend,lib/api)
|
||||
|
||||
nolibc_%: FORCE
|
||||
$(call descend,include/nolibc,$(patsubst nolibc_%,%,$@))
|
||||
|
||||
# The perf build does not follow the descend function setup,
|
||||
# invoking it via it's own make rule.
|
||||
PERF_O = $(if $(O),$(O)/tools/perf,)
|
||||
|
||||
42
tools/include/nolibc/Makefile
Normal file
@@ -0,0 +1,42 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
# Makefile for nolibc installation and tests
|
||||
include ../../scripts/Makefile.include
|
||||
|
||||
# we're in ".../tools/include/nolibc"
|
||||
ifeq ($(srctree),)
|
||||
srctree := $(patsubst %/tools/include/,%,$(dir $(CURDIR)))
|
||||
endif
|
||||
|
||||
nolibc_arch := $(patsubst arm64,aarch64,$(ARCH))
|
||||
arch_file := arch-$(nolibc_arch).h
|
||||
all_files := ctype.h errno.h nolibc.h signal.h std.h stdio.h stdlib.h string.h \
|
||||
sys.h time.h types.h unistd.h
|
||||
|
||||
# install all headers needed to support a bare-metal compiler
|
||||
all:
|
||||
|
||||
# Note: when ARCH is "x86" we concatenate both x86_64 and i386
|
||||
headers:
|
||||
$(Q)mkdir -p $(OUTPUT)sysroot
|
||||
$(Q)mkdir -p $(OUTPUT)sysroot/include
|
||||
$(Q)cp $(all_files) $(OUTPUT)sysroot/include/
|
||||
$(Q)if [ "$(ARCH)" = "x86" ]; then \
|
||||
sed -e \
|
||||
's,^#ifndef _NOLIBC_ARCH_X86_64_H,#if !defined(_NOLIBC_ARCH_X86_64_H) \&\& defined(__x86_64__),' \
|
||||
arch-x86_64.h; \
|
||||
sed -e \
|
||||
's,^#ifndef _NOLIBC_ARCH_I386_H,#if !defined(_NOLIBC_ARCH_I386_H) \&\& !defined(__x86_64__),' \
|
||||
arch-i386.h; \
|
||||
elif [ -e "$(arch_file)" ]; then \
|
||||
cat $(arch_file); \
|
||||
else \
|
||||
echo "Fatal: architecture $(ARCH) not yet supported by nolibc." >&2; \
|
||||
exit 1; \
|
||||
fi > $(OUTPUT)sysroot/include/arch.h
|
||||
|
||||
headers_standalone: headers
|
||||
$(Q)$(MAKE) -C $(srctree) headers
|
||||
$(Q)$(MAKE) -C $(srctree) headers_install INSTALL_HDR_PATH=$(OUTPUT)/sysroot
|
||||
|
||||
clean:
|
||||
$(call QUIET_CLEAN, nolibc) rm -rf "$(OUTPUT)sysroot"
|
||||
199
tools/include/nolibc/arch-aarch64.h
Normal file
@@ -0,0 +1,199 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* AARCH64 specific definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ARCH_AARCH64_H
|
||||
#define _NOLIBC_ARCH_AARCH64_H
|
||||
|
||||
/* O_* macros for fcntl/open are architecture-specific */
|
||||
#define O_RDONLY 0
|
||||
#define O_WRONLY 1
|
||||
#define O_RDWR 2
|
||||
#define O_CREAT 0x40
|
||||
#define O_EXCL 0x80
|
||||
#define O_NOCTTY 0x100
|
||||
#define O_TRUNC 0x200
|
||||
#define O_APPEND 0x400
|
||||
#define O_NONBLOCK 0x800
|
||||
#define O_DIRECTORY 0x4000
|
||||
|
||||
/* The struct returned by the newfstatat() syscall. Differs slightly from the
|
||||
* x86_64's stat one by field ordering, so be careful.
|
||||
*/
|
||||
struct sys_stat_struct {
|
||||
unsigned long st_dev;
|
||||
unsigned long st_ino;
|
||||
unsigned int st_mode;
|
||||
unsigned int st_nlink;
|
||||
unsigned int st_uid;
|
||||
unsigned int st_gid;
|
||||
|
||||
unsigned long st_rdev;
|
||||
unsigned long __pad1;
|
||||
long st_size;
|
||||
int st_blksize;
|
||||
int __pad2;
|
||||
|
||||
long st_blocks;
|
||||
long st_atime;
|
||||
unsigned long st_atime_nsec;
|
||||
long st_mtime;
|
||||
|
||||
unsigned long st_mtime_nsec;
|
||||
long st_ctime;
|
||||
unsigned long st_ctime_nsec;
|
||||
unsigned int __unused[2];
|
||||
};
|
||||
|
||||
/* Syscalls for AARCH64 :
|
||||
* - registers are 64-bit
|
||||
* - stack is 16-byte aligned
|
||||
* - syscall number is passed in x8
|
||||
* - arguments are in x0, x1, x2, x3, x4, x5
|
||||
* - the system call is performed by calling svc 0
|
||||
* - syscall return comes in x0.
|
||||
* - the arguments are cast to long and assigned into the target registers
|
||||
* which are then simply passed as registers to the asm code, so that we
|
||||
* don't have to experience issues with register constraints.
|
||||
*
|
||||
* On aarch64, select() is not implemented so we have to use pselect6().
|
||||
*/
|
||||
#define __ARCH_WANT_SYS_PSELECT6
|
||||
|
||||
#define my_syscall0(num) \
|
||||
({ \
|
||||
register long _num __asm__ ("x8") = (num); \
|
||||
register long _arg1 __asm__ ("x0"); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall1(num, arg1) \
|
||||
({ \
|
||||
register long _num __asm__ ("x8") = (num); \
|
||||
register long _arg1 __asm__ ("x0") = (long)(arg1); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall2(num, arg1, arg2) \
|
||||
({ \
|
||||
register long _num __asm__ ("x8") = (num); \
|
||||
register long _arg1 __asm__ ("x0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("x1") = (long)(arg2); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall3(num, arg1, arg2, arg3) \
|
||||
({ \
|
||||
register long _num __asm__ ("x8") = (num); \
|
||||
register long _arg1 __asm__ ("x0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("x1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("x2") = (long)(arg3); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall4(num, arg1, arg2, arg3, arg4) \
|
||||
({ \
|
||||
register long _num __asm__ ("x8") = (num); \
|
||||
register long _arg1 __asm__ ("x0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("x1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("x2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("x3") = (long)(arg4); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \
|
||||
({ \
|
||||
register long _num __asm__ ("x8") = (num); \
|
||||
register long _arg1 __asm__ ("x0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("x1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("x2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("x3") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("x4") = (long)(arg5); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r" (_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
|
||||
({ \
|
||||
register long _num __asm__ ("x8") = (num); \
|
||||
register long _arg1 __asm__ ("x0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("x1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("x2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("x3") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("x4") = (long)(arg5); \
|
||||
register long _arg6 __asm__ ("x5") = (long)(arg6); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r" (_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
|
||||
"r"(_arg6), "r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
/* startup code */
|
||||
__asm__ (".section .text\n"
|
||||
".weak _start\n"
|
||||
"_start:\n"
|
||||
"ldr x0, [sp]\n" // argc (x0) was in the stack
|
||||
"add x1, sp, 8\n" // argv (x1) = sp
|
||||
"lsl x2, x0, 3\n" // envp (x2) = 8*argc ...
|
||||
"add x2, x2, 8\n" // + 8 (skip null)
|
||||
"add x2, x2, x1\n" // + argv
|
||||
"and sp, x1, -16\n" // sp must be 16-byte aligned in the callee
|
||||
"bl main\n" // main() returns the status code, we'll exit with it.
|
||||
"mov x8, 93\n" // NR_exit == 93
|
||||
"svc #0\n"
|
||||
"");
|
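The aarch64 startup code derives argv, envp and the aligned stack pointer from the initial stack pointer alone. The address arithmetic can be sketched in plain C, assuming 8-byte pointers; the sketch_ function names are hypothetical and only model the instructions, they do not run them.

```c
#include <assert.h>
#include <stdint.h>

/* "add x1, sp, 8": argv starts one 8-byte slot above argc. */
uint64_t sketch_argv(uint64_t sp)
{
	return sp + 8;
}

/* "lsl x2, x0, 3; add x2, x2, 8; add x2, x2, x1": skip argc pointers plus the NULL. */
uint64_t sketch_envp(uint64_t sp, uint64_t argc)
{
	return sketch_argv(sp) + 8 * argc + 8;
}

/* "and sp, x1, -16": round the argv address down to a 16-byte boundary. */
uint64_t sketch_aligned_sp(uint64_t sp)
{
	return sketch_argv(sp) & ~(uint64_t)15;
}

/* Usage: with argc == 2 at 0x1000, argv[0..1] and the NULL occupy 0x1008..0x101f. */
void sketch_start_demo(void)
{
	assert(sketch_argv(0x1000) == 0x1008);
	assert(sketch_envp(0x1000, 2) == 0x1020);
	assert(sketch_aligned_sp(0x1000) == 0x1000);
}
```

Note that the rounding is applied to x1, which holds sp + 8 at that point, so an entry stack pointer that is already 16-byte aligned comes out unchanged.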
||||
|
||||
#endif // _NOLIBC_ARCH_AARCH64_H
|
||||
204
tools/include/nolibc/arch-arm.h
Normal file
@@ -0,0 +1,204 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* ARM specific definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ARCH_ARM_H
|
||||
#define _NOLIBC_ARCH_ARM_H
|
||||
|
||||
/* O_* macros for fcntl/open are architecture-specific */
|
||||
#define O_RDONLY 0
|
||||
#define O_WRONLY 1
|
||||
#define O_RDWR 2
|
||||
#define O_CREAT 0x40
|
||||
#define O_EXCL 0x80
|
||||
#define O_NOCTTY 0x100
|
||||
#define O_TRUNC 0x200
|
||||
#define O_APPEND 0x400
|
||||
#define O_NONBLOCK 0x800
|
||||
#define O_DIRECTORY 0x4000
|
||||
|
||||
/* The struct returned by the stat() syscall, 32-bit only, the syscall returns
|
||||
* exactly 56 bytes (stops before the unused array). In big endian, the format
|
||||
* differs as devices are returned as short only.
|
||||
*/
|
||||
struct sys_stat_struct {
|
||||
#if defined(__ARMEB__)
|
||||
unsigned short st_dev;
|
||||
unsigned short __pad1;
|
||||
#else
|
||||
unsigned long st_dev;
|
||||
#endif
|
||||
unsigned long st_ino;
|
||||
unsigned short st_mode;
|
||||
unsigned short st_nlink;
|
||||
unsigned short st_uid;
|
||||
unsigned short st_gid;
|
||||
|
||||
#if defined(__ARMEB__)
|
||||
unsigned short st_rdev;
|
||||
unsigned short __pad2;
|
||||
#else
|
||||
unsigned long st_rdev;
|
||||
#endif
|
||||
unsigned long st_size;
|
||||
unsigned long st_blksize;
|
||||
unsigned long st_blocks;
|
||||
|
||||
unsigned long st_atime;
|
||||
unsigned long st_atime_nsec;
|
||||
unsigned long st_mtime;
|
||||
unsigned long st_mtime_nsec;
|
||||
|
||||
unsigned long st_ctime;
|
||||
unsigned long st_ctime_nsec;
|
||||
unsigned long __unused[2];
|
||||
};
|
||||
|
||||
/* Syscalls for ARM in ARM or Thumb modes :
|
||||
* - registers are 32-bit
|
||||
* - stack is 8-byte aligned
|
||||
* ( http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka4127.html)
|
||||
* - syscall number is passed in r7
|
||||
* - arguments are in r0, r1, r2, r3, r4, r5
|
||||
* - the system call is performed by calling svc #0
|
||||
* - syscall return comes in r0.
|
||||
* - only lr is clobbered.
|
||||
* - the arguments are cast to long and assigned into the target registers
|
||||
* which are then simply passed as registers to the asm code, so that we
|
||||
* don't have to experience issues with register constraints.
|
||||
* - the syscall number is always specified last in order to allow to force
|
||||
* some registers before (gcc refuses a %-register at the last position).
|
||||
*
|
||||
* Also, ARM supports the old_select syscall if newselect is not available
|
||||
*/
|
||||
#define __ARCH_WANT_SYS_OLD_SELECT
|
||||
|
||||
#define my_syscall0(num) \
|
||||
({ \
|
||||
register long _num __asm__ ("r7") = (num); \
|
||||
register long _arg1 __asm__ ("r0"); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_num) \
|
||||
: "memory", "cc", "lr" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall1(num, arg1) \
|
||||
({ \
|
||||
register long _num __asm__ ("r7") = (num); \
|
||||
register long _arg1 __asm__ ("r0") = (long)(arg1); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc", "lr" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall2(num, arg1, arg2) \
|
||||
({ \
|
||||
register long _num __asm__ ("r7") = (num); \
|
||||
register long _arg1 __asm__ ("r0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("r1") = (long)(arg2); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc", "lr" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall3(num, arg1, arg2, arg3) \
|
||||
({ \
|
||||
register long _num __asm__ ("r7") = (num); \
|
||||
register long _arg1 __asm__ ("r0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("r1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("r2") = (long)(arg3); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc", "lr" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall4(num, arg1, arg2, arg3, arg4) \
|
||||
({ \
|
||||
register long _num __asm__ ("r7") = (num); \
|
||||
register long _arg1 __asm__ ("r0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("r1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("r2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("r3") = (long)(arg4); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc", "lr" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \
|
||||
({ \
|
||||
register long _num __asm__ ("r7") = (num); \
|
||||
register long _arg1 __asm__ ("r0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("r1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("r2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("r3") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("r4") = (long)(arg5); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"svc #0\n" \
|
||||
: "=r" (_arg1) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc", "lr" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
/* startup code */
|
||||
__asm__ (".section .text\n"
|
||||
".weak _start\n"
|
||||
"_start:\n"
|
||||
#if defined(__THUMBEB__) || defined(__THUMBEL__)
|
||||
/* We enter here in 32-bit mode but if some previous functions were in
|
||||
* 16-bit mode, the assembler cannot know, so we need to tell it we're in
|
||||
* 32-bit now, then switch to 16-bit (is there a better way to do it than
|
||||
* adding 1 by hand ?) and tell the asm we're now in 16-bit mode so that
|
||||
* it generates correct instructions. Note that we do not support thumb1.
|
||||
*/
|
||||
".code 32\n"
|
||||
"add r0, pc, #1\n"
|
||||
"bx r0\n"
|
||||
".code 16\n"
|
||||
#endif
|
||||
"pop {%r0}\n" // argc was in the stack
|
||||
"mov %r1, %sp\n" // argv = sp
|
||||
"add %r2, %r1, %r0, lsl #2\n" // envp = argv + 4*argc ...
|
||||
"add %r2, %r2, $4\n" // ... + 4
|
||||
"and %r3, %r1, $-8\n" // AAPCS : sp must be 8-byte aligned in the
|
||||
"mov %sp, %r3\n" // callee, an bl doesn't push (lr=pc)
|
||||
"bl main\n" // main() returns the status code, we'll exit with it.
|
||||
"movs r7, $1\n" // NR_exit == 1
|
||||
"svc $0x00\n"
|
||||
"");
|
||||
|
||||
#endif // _NOLIBC_ARCH_ARM_H
|
||||
219
tools/include/nolibc/arch-i386.h
Normal file
@@ -0,0 +1,219 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* i386 specific definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ARCH_I386_H
|
||||
#define _NOLIBC_ARCH_I386_H
|
||||
|
||||
/* O_* macros for fcntl/open are architecture-specific */
|
||||
#define O_RDONLY 0
|
||||
#define O_WRONLY 1
|
||||
#define O_RDWR 2
|
||||
#define O_CREAT 0x40
|
||||
#define O_EXCL 0x80
|
||||
#define O_NOCTTY 0x100
|
||||
#define O_TRUNC 0x200
|
||||
#define O_APPEND 0x400
|
||||
#define O_NONBLOCK 0x800
|
||||
#define O_DIRECTORY 0x10000
|
||||
|
||||
/* The struct returned by the stat() syscall, 32-bit only, the syscall returns
|
||||
* exactly 56 bytes (stops before the unused array).
|
||||
*/
|
||||
struct sys_stat_struct {
|
||||
unsigned long st_dev;
|
||||
unsigned long st_ino;
|
||||
unsigned short st_mode;
|
||||
unsigned short st_nlink;
|
||||
unsigned short st_uid;
|
||||
unsigned short st_gid;
|
||||
|
||||
unsigned long st_rdev;
|
||||
unsigned long st_size;
|
||||
unsigned long st_blksize;
|
||||
unsigned long st_blocks;
|
||||
|
||||
unsigned long st_atime;
|
||||
unsigned long st_atime_nsec;
|
||||
unsigned long st_mtime;
|
||||
unsigned long st_mtime_nsec;
|
||||
|
||||
unsigned long st_ctime;
|
||||
unsigned long st_ctime_nsec;
|
||||
unsigned long __unused[2];
|
||||
};
|
||||
|
||||
/* Syscalls for i386 :
|
||||
* - mostly similar to x86_64
|
||||
* - registers are 32-bit
|
||||
* - syscall number is passed in eax
|
||||
* - arguments are in ebx, ecx, edx, esi, edi, ebp respectively
|
||||
* - all registers are preserved (except eax of course)
|
||||
* - the system call is performed by calling int $0x80
|
||||
* - syscall return comes in eax
|
||||
* - the arguments are cast to long and assigned into the target registers
|
||||
* which are then simply passed as registers to the asm code, so that we
|
||||
* don't have to experience issues with register constraints.
|
||||
* - the syscall number is always specified last in order to allow to force
|
||||
* some registers before (gcc refuses a %-register at the last position).
|
||||
*
|
||||
* Also, i386 supports the old_select syscall if newselect is not available
|
||||
*/
|
||||
#define __ARCH_WANT_SYS_OLD_SELECT
|
||||
|
||||
#define my_syscall0(num) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("eax") = (num); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"int $0x80\n" \
|
||||
: "=a" (_ret) \
|
||||
: "0"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall1(num, arg1) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("eax") = (num); \
|
||||
register long _arg1 __asm__ ("ebx") = (long)(arg1); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"int $0x80\n" \
|
||||
: "=a" (_ret) \
|
||||
: "r"(_arg1), \
|
||||
"0"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall2(num, arg1, arg2) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("eax") = (num); \
|
||||
register long _arg1 __asm__ ("ebx") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("ecx") = (long)(arg2); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"int $0x80\n" \
|
||||
: "=a" (_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), \
|
||||
"0"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall3(num, arg1, arg2, arg3) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("eax") = (num); \
|
||||
register long _arg1 __asm__ ("ebx") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("ecx") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("edx") = (long)(arg3); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"int $0x80\n" \
|
||||
: "=a" (_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), \
|
||||
"0"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall4(num, arg1, arg2, arg3, arg4) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("eax") = (num); \
|
||||
register long _arg1 __asm__ ("ebx") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("ecx") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("edx") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("esi") = (long)(arg4); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"int $0x80\n" \
|
||||
: "=a" (_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), \
|
||||
"0"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("eax") = (num); \
|
||||
register long _arg1 __asm__ ("ebx") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("ecx") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("edx") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("esi") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("edi") = (long)(arg5); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"int $0x80\n" \
|
||||
: "=a" (_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
|
||||
"0"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
|
||||
({ \
|
||||
long _eax = (long)(num); \
|
||||
long _arg6 = (long)(arg6); /* Always in memory */ \
|
||||
__asm__ volatile ( \
|
||||
"pushl %[_arg6]\n\t" \
|
||||
"pushl %%ebp\n\t" \
|
||||
"movl 4(%%esp),%%ebp\n\t" \
|
||||
"int $0x80\n\t" \
|
||||
"popl %%ebp\n\t" \
|
||||
"addl $4,%%esp\n\t" \
|
||||
: "+a"(_eax) /* %eax */ \
|
||||
: "b"(arg1), /* %ebx */ \
|
||||
"c"(arg2), /* %ecx */ \
|
||||
"d"(arg3), /* %edx */ \
|
||||
"S"(arg4), /* %esi */ \
|
||||
"D"(arg5), /* %edi */ \
|
||||
[_arg6]"m"(_arg6) /* memory */ \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_eax; \
|
||||
})
|
||||
|
||||
/* startup code */
|
||||
/*
|
||||
* i386 System V ABI mandates:
|
||||
* 1) last pushed argument must be 16-byte aligned.
|
||||
* 2) The deepest stack frame should be set to zero
|
||||
*
|
||||
*/
|
||||
__asm__ (".section .text\n"
|
||||
".weak _start\n"
|
||||
"_start:\n"
|
||||
"pop %eax\n" // argc (first arg, %eax)
|
||||
"mov %esp, %ebx\n" // argv[] (second arg, %ebx)
|
||||
"lea 4(%ebx,%eax,4),%ecx\n" // then a NULL then envp (third arg, %ecx)
|
||||
"xor %ebp, %ebp\n" // zero the stack frame
|
||||
"and $-16, %esp\n" // x86 ABI : esp must be 16-byte aligned before
|
||||
"sub $4, %esp\n" // the call instruction (args are aligned)
|
||||
"push %ecx\n" // push all registers on the stack so that we
|
||||
"push %ebx\n" // support both regparm and plain stack modes
|
||||
"push %eax\n"
|
||||
"call main\n" // main() returns the status code in %eax
|
||||
"mov %eax, %ebx\n" // retrieve exit code (32-bit int)
|
||||
"movl $1, %eax\n" // NR_exit == 1
|
||||
"int $0x80\n" // exit now
|
||||
"hlt\n" // ensure it does not
|
||||
"");
|
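The i386 startup sequence aligns the stack, subtracts 4 bytes, then pushes three 4-byte arguments, so that the stack pointer is 16-byte aligned again when main is called. A minimal sketch of that arithmetic follows; sketch_esp_at_call is a hypothetical helper that models the instructions and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Models the stack pointer from just after "pop %eax" up to "call main". */
uint32_t sketch_esp_at_call(uint32_t esp_after_pop)
{
	uint32_t esp = esp_after_pop & ~(uint32_t)15; /* and $-16, %esp */

	esp -= 4;  /* sub $4, %esp */
	esp -= 12; /* push %ecx; push %ebx; push %eax */
	return esp;
}

/* Usage: any starting value ends up 16-byte aligned at the call. */
void sketch_i386_align_demo(void)
{
	assert(sketch_esp_at_call(0xbffff3c4u) == 0xbffff3b0u);
	assert(sketch_esp_at_call(0xbffff3c4u) % 16 == 0);
	assert(sketch_esp_at_call(0xbffff3cfu) % 16 == 0);
}
```

The 4-byte padding exists only to make the padding plus the three pushes total 16 bytes, which is what preserves the alignment established by the and instruction.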
||||
|
||||
#endif // _NOLIBC_ARCH_I386_H
|
||||
215
tools/include/nolibc/arch-mips.h
Normal file
@@ -0,0 +1,215 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* MIPS specific definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ARCH_MIPS_H
|
||||
#define _NOLIBC_ARCH_MIPS_H
|
||||
|
||||
/* O_* macros for fcntl/open are architecture-specific */
|
||||
#define O_RDONLY 0
|
||||
#define O_WRONLY 1
|
||||
#define O_RDWR 2
|
||||
#define O_APPEND 0x0008
|
||||
#define O_NONBLOCK 0x0080
|
||||
#define O_CREAT 0x0100
|
||||
#define O_TRUNC 0x0200
|
||||
#define O_EXCL 0x0400
|
||||
#define O_NOCTTY 0x0800
|
||||
#define O_DIRECTORY 0x10000
|
||||
|
||||
/* The struct returned by the stat() syscall. 88 bytes are returned by the
|
||||
* syscall.
|
||||
*/
|
||||
struct sys_stat_struct {
|
||||
unsigned int st_dev;
|
||||
long st_pad1[3];
|
||||
unsigned long st_ino;
|
||||
unsigned int st_mode;
|
||||
unsigned int st_nlink;
|
||||
unsigned int st_uid;
|
||||
unsigned int st_gid;
|
||||
unsigned int st_rdev;
|
||||
long st_pad2[2];
|
||||
long st_size;
|
||||
long st_pad3;
|
||||
|
||||
long st_atime;
|
||||
long st_atime_nsec;
|
||||
long st_mtime;
|
||||
long st_mtime_nsec;
|
||||
|
||||
long st_ctime;
|
||||
long st_ctime_nsec;
|
||||
long st_blksize;
|
||||
long st_blocks;
|
||||
long st_pad4[14];
|
||||
};
|
||||
|
||||
/* Syscalls for MIPS ABI O32 :
|
||||
* - WARNING! there's always a delayed slot!
|
||||
* - WARNING again, the syntax is different, registers take a '$' and numbers
|
||||
* do not.
|
||||
* - registers are 32-bit
|
||||
* - stack is 8-byte aligned
|
||||
* - syscall number is passed in v0 (starts at 0xfa0).
|
||||
* - arguments are in a0, a1, a2, a3, then the stack. The caller needs to
|
||||
* leave some room in the stack for the callee to save a0..a3 if needed.
|
||||
* - Many registers are clobbered, in fact only a0..a2 and s0..s8 are
|
||||
* preserved. See: https://www.linux-mips.org/wiki/Syscall as well as
|
||||
* scall32-o32.S in the kernel sources.
|
||||
* - the system call is performed by calling "syscall"
|
||||
* - syscall return comes in v0, and register a3 needs to be checked to know
|
||||
* if an error occurred, in which case errno is in v0.
|
||||
* - the arguments are cast to long and assigned into the target registers
|
||||
* which are then simply passed as registers to the asm code, so that we
|
||||
* don't have to experience issues with register constraints.
|
||||
*/
|
||||
|
||||
#define my_syscall0(num) \
|
||||
({ \
|
||||
register long _num __asm__ ("v0") = (num); \
|
||||
register long _arg4 __asm__ ("a3"); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"addiu $sp, $sp, -32\n" \
|
||||
"syscall\n" \
|
||||
"addiu $sp, $sp, 32\n" \
|
||||
: "=r"(_num), "=r"(_arg4) \
|
||||
: "r"(_num) \
|
||||
: "memory", "cc", "at", "v1", "hi", "lo", \
|
||||
"t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \
|
||||
); \
|
||||
_arg4 ? -_num : _num; \
|
||||
})
|
||||
|
||||
#define my_syscall1(num, arg1) \
|
||||
({ \
|
||||
register long _num __asm__ ("v0") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg4 __asm__ ("a3"); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"addiu $sp, $sp, -32\n" \
|
||||
"syscall\n" \
|
||||
"addiu $sp, $sp, 32\n" \
|
||||
: "=r"(_num), "=r"(_arg4) \
|
||||
: "0"(_num), \
|
||||
"r"(_arg1) \
|
||||
: "memory", "cc", "at", "v1", "hi", "lo", \
|
||||
"t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \
|
||||
); \
|
||||
_arg4 ? -_num : _num; \
|
||||
})
|
||||
|
||||
#define my_syscall2(num, arg1, arg2) \
|
||||
({ \
|
||||
register long _num __asm__ ("v0") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg4 __asm__ ("a3"); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"addiu $sp, $sp, -32\n" \
|
||||
"syscall\n" \
|
||||
"addiu $sp, $sp, 32\n" \
|
||||
: "=r"(_num), "=r"(_arg4) \
|
||||
: "0"(_num), \
|
||||
"r"(_arg1), "r"(_arg2) \
|
||||
: "memory", "cc", "at", "v1", "hi", "lo", \
|
||||
"t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \
|
||||
); \
|
||||
_arg4 ? -_num : _num; \
|
||||
})
|
||||
|
||||
#define my_syscall3(num, arg1, arg2, arg3) \
|
||||
({ \
|
||||
register long _num __asm__ ("v0") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("a2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("a3"); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"addiu $sp, $sp, -32\n" \
|
||||
"syscall\n" \
|
||||
"addiu $sp, $sp, 32\n" \
|
||||
: "=r"(_num), "=r"(_arg4) \
|
||||
: "0"(_num), \
|
||||
"r"(_arg1), "r"(_arg2), "r"(_arg3) \
|
||||
: "memory", "cc", "at", "v1", "hi", "lo", \
|
||||
"t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \
|
||||
); \
|
||||
_arg4 ? -_num : _num; \
|
||||
})
|
||||
|
||||
#define my_syscall4(num, arg1, arg2, arg3, arg4) \
|
||||
({ \
|
||||
register long _num __asm__ ("v0") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("a2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("a3") = (long)(arg4); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"addiu $sp, $sp, -32\n" \
|
||||
"syscall\n" \
|
||||
"addiu $sp, $sp, 32\n" \
|
||||
: "=r" (_num), "=r"(_arg4) \
|
||||
: "0"(_num), \
|
||||
"r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4) \
|
||||
: "memory", "cc", "at", "v1", "hi", "lo", \
|
||||
"t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \
|
||||
); \
|
||||
_arg4 ? -_num : _num; \
|
||||
})
|
||||
|
||||
#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \
|
||||
({ \
|
||||
register long _num __asm__ ("v0") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("a2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("a3") = (long)(arg4); \
|
||||
register long _arg5 = (long)(arg5); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"addiu $sp, $sp, -32\n" \
|
||||
"sw %7, 16($sp)\n" \
"syscall\n" \
|
||||
"addiu $sp, $sp, 32\n" \
|
||||
: "=r" (_num), "=r"(_arg4) \
|
||||
: "0"(_num), \
|
||||
"r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5) \
|
||||
: "memory", "cc", "at", "v1", "hi", "lo", \
|
||||
"t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \
|
||||
); \
|
||||
_arg4 ? -_num : _num; \
|
||||
})
|
||||
|
||||
/* startup code, note that it's called __start on MIPS */
|
||||
__asm__ (".section .text\n"
|
||||
".weak __start\n"
|
||||
".set nomips16\n"
|
||||
".set noreorder\n"
|
||||
".option pic0\n"
|
||||
".ent __start\n"
|
||||
"__start:\n"
|
||||
"lw $a0,($sp)\n" // argc was in the stack
|
||||
"addiu $a1, $sp, 4\n" // argv = sp + 4
|
||||
"sll $a2, $a0, 2\n" // a2 = argc * 4
|
||||
"add $a2, $a2, $a1\n" // envp = argv + 4*argc ...
|
||||
"addiu $a2, $a2, 4\n" // ... + 4
|
||||
"li $t0, -8\n"
|
||||
"and $sp, $sp, $t0\n" // sp must be 8-byte aligned
|
||||
"addiu $sp,$sp,-16\n" // the callee expects to save a0..a3 there!
|
||||
"jal main\n" // main() returns the status code, we'll exit with it.
"nop\n" // delay slot
|
||||
"move $a0, $v0\n" // retrieve 32-bit exit code from v0
|
||||
"li $v0, 4001\n" // NR_exit == 4001
|
||||
"syscall\n"
|
||||
".end __start\n"
|
||||
"");
|
||||
|
||||
#endif // _NOLIBC_ARCH_MIPS_H
|
||||
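The MIPS O32 macros above all end with `_arg4 ? -_num : _num`, which folds the two registers returned by the kernel into a single signed value. The following is a minimal sketch of that last step in portable C. The helper name `o32_fold_result` is hypothetical and not part of nolibc, and the assertion encodes the assumption that v0 holds a positive errno code whenever a3 is set.

```c
#include <assert.h>

/* Hypothetical helper, not part of nolibc: models the last line of the
 * my_syscallN() macros. v0 and a3 stand for the values the kernel left in
 * registers v0 and a3 when the syscall instruction returned.
 */
static long o32_fold_result(long v0, long a3)
{
	/* assumption: when a3 is set, v0 holds a positive errno code */
	assert(!a3 || v0 > 0);

	/* error: return -errno, success: return the result unchanged */
	return a3 ? -v0 : v0;
}
```

With this folding, callers on MIPS see the same "negative value means -errno" convention as the other architectures in this series.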
204
tools/include/nolibc/arch-riscv.h
Normal file
@@ -0,0 +1,204 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* RISCV (32 and 64) specific definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ARCH_RISCV_H
|
||||
#define _NOLIBC_ARCH_RISCV_H
|
||||
|
||||
/* O_* macros for fcntl/open are architecture-specific */
|
||||
#define O_RDONLY 0
|
||||
#define O_WRONLY 1
|
||||
#define O_RDWR 2
|
||||
#define O_CREAT 0x100
|
||||
#define O_EXCL 0x200
|
||||
#define O_NOCTTY 0x400
|
||||
#define O_TRUNC 0x1000
|
||||
#define O_APPEND 0x2000
|
||||
#define O_NONBLOCK 0x4000
|
||||
#define O_DIRECTORY 0x200000
|
||||
|
||||
struct sys_stat_struct {
|
||||
unsigned long st_dev; /* Device. */
|
||||
unsigned long st_ino; /* File serial number. */
|
||||
unsigned int st_mode; /* File mode. */
|
||||
unsigned int st_nlink; /* Link count. */
|
||||
unsigned int st_uid; /* User ID of the file's owner. */
|
||||
unsigned int st_gid; /* Group ID of the file's group. */
|
||||
unsigned long st_rdev; /* Device number, if device. */
|
||||
unsigned long __pad1;
|
||||
long st_size; /* Size of file, in bytes. */
|
||||
int st_blksize; /* Optimal block size for I/O. */
|
||||
int __pad2;
|
||||
long st_blocks; /* Number of 512-byte blocks allocated. */
|
||||
long st_atime; /* Time of last access. */
|
||||
unsigned long st_atime_nsec;
|
||||
long st_mtime; /* Time of last modification. */
|
||||
unsigned long st_mtime_nsec;
|
||||
long st_ctime; /* Time of last status change. */
|
||||
unsigned long st_ctime_nsec;
|
||||
unsigned int __unused4;
|
||||
unsigned int __unused5;
|
||||
};
|
||||
|
||||
#if __riscv_xlen == 64
|
||||
#define PTRLOG "3"
|
||||
#define SZREG "8"
|
||||
#elif __riscv_xlen == 32
|
||||
#define PTRLOG "2"
|
||||
#define SZREG "4"
|
||||
#endif
|
||||
|
||||
/* Syscalls for RISCV :
|
||||
* - stack is 16-byte aligned
|
||||
* - syscall number is passed in a7
|
||||
* - arguments are in a0, a1, a2, a3, a4, a5
|
||||
* - the system call is performed by calling ecall
|
||||
* - syscall return comes in a0
|
||||
* - the arguments are cast to long and assigned into the target
|
||||
* registers which are then simply passed as registers to the asm code,
|
||||
* so that we don't have to experience issues with register constraints.
|
||||
*
|
||||
* On riscv, select() is not implemented so we have to use pselect6().
|
||||
*/
|
||||
#define __ARCH_WANT_SYS_PSELECT6
|
||||
|
||||
#define my_syscall0(num) \
|
||||
({ \
|
||||
register long _num __asm__ ("a7") = (num); \
|
||||
register long _arg1 __asm__ ("a0"); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"ecall\n\t" \
|
||||
: "=r"(_arg1) \
|
||||
: "r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall1(num, arg1) \
|
||||
({ \
|
||||
register long _num __asm__ ("a7") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"ecall\n" \
|
||||
: "+r"(_arg1) \
|
||||
: "r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall2(num, arg1, arg2) \
|
||||
({ \
|
||||
register long _num __asm__ ("a7") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"ecall\n" \
|
||||
: "+r"(_arg1) \
|
||||
: "r"(_arg2), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall3(num, arg1, arg2, arg3) \
|
||||
({ \
|
||||
register long _num __asm__ ("a7") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("a2") = (long)(arg3); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"ecall\n\t" \
|
||||
: "+r"(_arg1) \
|
||||
: "r"(_arg2), "r"(_arg3), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall4(num, arg1, arg2, arg3, arg4) \
|
||||
({ \
|
||||
register long _num __asm__ ("a7") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("a2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("a3") = (long)(arg4); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"ecall\n" \
|
||||
: "+r"(_arg1) \
|
||||
: "r"(_arg2), "r"(_arg3), "r"(_arg4), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \
|
||||
({ \
|
||||
register long _num __asm__ ("a7") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("a2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("a3") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("a4") = (long)(arg5); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"ecall\n" \
|
||||
: "+r"(_arg1) \
|
||||
: "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
|
||||
({ \
|
||||
register long _num __asm__ ("a7") = (num); \
|
||||
register long _arg1 __asm__ ("a0") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("a1") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("a2") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("a3") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("a4") = (long)(arg5); \
|
||||
register long _arg6 __asm__ ("a5") = (long)(arg6); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"ecall\n" \
|
||||
: "+r"(_arg1) \
|
||||
: "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), "r"(_arg6), \
|
||||
"r"(_num) \
|
||||
: "memory", "cc" \
|
||||
); \
|
||||
_arg1; \
|
||||
})
|
||||
|
||||
/* startup code */
|
||||
__asm__ (".section .text\n"
|
||||
".weak _start\n"
|
||||
"_start:\n"
|
||||
".option push\n"
|
||||
".option norelax\n"
|
||||
"lla gp, __global_pointer$\n"
|
||||
".option pop\n"
|
||||
"ld a0, 0(sp)\n" // argc (a0) was in the stack
"add a1, sp, "SZREG"\n" // argv (a1) = sp + SZREG
|
||||
"slli a2, a0, "PTRLOG"\n" // envp (a2) = SZREG*argc ...
|
||||
"add a2, a2, "SZREG"\n" // + SZREG (skip null)
|
||||
"add a2,a2,a1\n" // + argv
|
||||
"andi sp,a1,-16\n" // sp must be 16-byte aligned
|
||||
"call main\n" // main() returns the status code, we'll exit with it.
|
||||
"li a7, 93\n" // NR_exit == 93
|
||||
"ecall\n"
|
||||
"");
|
||||
|
||||
#endif // _NOLIBC_ARCH_RISCV_H
|
||||
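The RISC-V startup code above derives argv and envp from the stack pointer with PTRLOG and SZREG, then aligns the stack with `andi sp,a1,-16`. The same arithmetic can be sketched in plain C as follows. The function names `align_down` and `envp_addr` are hypothetical, and addresses are modelled as plain integers rather than real stack contents.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align, like "andi sp,a1,-16" does for 16. */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
	assert(align && (align & (align - 1)) == 0); /* power of two only */
	return addr & ~(align - 1);
}

/* Address of envp given the initial stack pointer, argc and log2 of the
 * register size (3 on rv64, 2 on rv32, matching PTRLOG above).
 */
static uintptr_t envp_addr(uintptr_t sp, uintptr_t argc, unsigned int ptrlog)
{
	uintptr_t szreg = (uintptr_t)1 << ptrlog;  /* SZREG */
	uintptr_t argv = sp + szreg;               /* skip argc */

	/* skip argc pointers, then the NULL that terminates argv[] */
	return argv + (argc << ptrlog) + szreg;
}
```

For example, with sp at 0x1000 and argc == 2 on a 64-bit target, argv sits at 0x1008 and envp at 0x1020.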
215
tools/include/nolibc/arch-x86_64.h
Normal file
@@ -0,0 +1,215 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* x86_64 specific definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ARCH_X86_64_H
|
||||
#define _NOLIBC_ARCH_X86_64_H
|
||||
|
||||
/* O_* macros for fcntl/open are architecture-specific */
|
||||
#define O_RDONLY 0
|
||||
#define O_WRONLY 1
|
||||
#define O_RDWR 2
|
||||
#define O_CREAT 0x40
|
||||
#define O_EXCL 0x80
|
||||
#define O_NOCTTY 0x100
|
||||
#define O_TRUNC 0x200
|
||||
#define O_APPEND 0x400
|
||||
#define O_NONBLOCK 0x800
|
||||
#define O_DIRECTORY 0x10000
|
||||
|
||||
/* The struct returned by the stat() syscall, equivalent to stat64(). The
|
||||
* syscall returns 116 bytes and stops in the middle of __unused.
|
||||
*/
|
||||
struct sys_stat_struct {
|
||||
unsigned long st_dev;
|
||||
unsigned long st_ino;
|
||||
unsigned long st_nlink;
|
||||
unsigned int st_mode;
|
||||
unsigned int st_uid;
|
||||
|
||||
unsigned int st_gid;
|
||||
unsigned int __pad0;
|
||||
unsigned long st_rdev;
|
||||
long st_size;
|
||||
long st_blksize;
|
||||
|
||||
long st_blocks;
|
||||
unsigned long st_atime;
|
||||
unsigned long st_atime_nsec;
|
||||
unsigned long st_mtime;
|
||||
|
||||
unsigned long st_mtime_nsec;
|
||||
unsigned long st_ctime;
|
||||
unsigned long st_ctime_nsec;
|
||||
long __unused[3];
|
||||
};
|
||||
|
||||
/* Syscalls for x86_64 :
|
||||
* - registers are 64-bit
|
||||
* - syscall number is passed in rax
|
||||
* - arguments are in rdi, rsi, rdx, r10, r8, r9 respectively
|
||||
* - the system call is performed by calling the syscall instruction
|
||||
* - syscall return comes in rax
|
||||
* - rcx and r11 are clobbered, others are preserved.
|
||||
* - the arguments are cast to long and assigned into the target registers
|
||||
* which are then simply passed as registers to the asm code, so that we
|
||||
* don't have to experience issues with register constraints.
|
||||
* - the syscall number is always specified last so that some registers can be
* forced before it (gcc refuses a %-register at the last position).
|
||||
* - see also x86-64 ABI section A.2 AMD64 Linux Kernel Conventions, A.2.1
|
||||
* Calling Conventions.
|
||||
*
|
||||
* Link x86-64 ABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/home
|
||||
*
|
||||
*/
|
||||
|
||||
#define my_syscall0(num) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("rax") = (num); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"syscall\n" \
|
||||
: "=a"(_ret) \
|
||||
: "0"(_num) \
|
||||
: "rcx", "r11", "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall1(num, arg1) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("rax") = (num); \
|
||||
register long _arg1 __asm__ ("rdi") = (long)(arg1); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"syscall\n" \
|
||||
: "=a"(_ret) \
|
||||
: "r"(_arg1), \
|
||||
"0"(_num) \
|
||||
: "rcx", "r11", "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall2(num, arg1, arg2) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("rax") = (num); \
|
||||
register long _arg1 __asm__ ("rdi") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("rsi") = (long)(arg2); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"syscall\n" \
|
||||
: "=a"(_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), \
|
||||
"0"(_num) \
|
||||
: "rcx", "r11", "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall3(num, arg1, arg2, arg3) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("rax") = (num); \
|
||||
register long _arg1 __asm__ ("rdi") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("rsi") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("rdx") = (long)(arg3); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"syscall\n" \
|
||||
: "=a"(_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), \
|
||||
"0"(_num) \
|
||||
: "rcx", "r11", "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall4(num, arg1, arg2, arg3, arg4) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("rax") = (num); \
|
||||
register long _arg1 __asm__ ("rdi") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("rsi") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("rdx") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("r10") = (long)(arg4); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"syscall\n" \
|
||||
: "=a"(_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), \
|
||||
"0"(_num) \
|
||||
: "rcx", "r11", "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("rax") = (num); \
|
||||
register long _arg1 __asm__ ("rdi") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("rsi") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("rdx") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("r10") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("r8") = (long)(arg5); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"syscall\n" \
|
||||
: "=a"(_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
|
||||
"0"(_num) \
|
||||
: "rcx", "r11", "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
|
||||
({ \
|
||||
long _ret; \
|
||||
register long _num __asm__ ("rax") = (num); \
|
||||
register long _arg1 __asm__ ("rdi") = (long)(arg1); \
|
||||
register long _arg2 __asm__ ("rsi") = (long)(arg2); \
|
||||
register long _arg3 __asm__ ("rdx") = (long)(arg3); \
|
||||
register long _arg4 __asm__ ("r10") = (long)(arg4); \
|
||||
register long _arg5 __asm__ ("r8") = (long)(arg5); \
|
||||
register long _arg6 __asm__ ("r9") = (long)(arg6); \
|
||||
\
|
||||
__asm__ volatile ( \
|
||||
"syscall\n" \
|
||||
: "=a"(_ret) \
|
||||
: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
|
||||
"r"(_arg6), "0"(_num) \
|
||||
: "rcx", "r11", "memory", "cc" \
|
||||
); \
|
||||
_ret; \
|
||||
})
|
||||
|
||||
/* startup code */
|
||||
/*
|
||||
* x86-64 System V ABI mandates:
|
||||
* 1) %rsp must be 16-byte aligned right before the function call.
|
||||
* 2) The deepest stack frame should be zero (the %rbp).
|
||||
*
|
||||
*/
|
||||
__asm__ (".section .text\n"
|
||||
".weak _start\n"
|
||||
"_start:\n"
|
||||
"pop %rdi\n" // argc (first arg, %rdi)
|
||||
"mov %rsp, %rsi\n" // argv[] (second arg, %rsi)
|
||||
"lea 8(%rsi,%rdi,8),%rdx\n" // then a NULL then envp (third arg, %rdx)
|
||||
"xor %ebp, %ebp\n" // zero the stack frame
"and $-16, %rsp\n" // x86-64 ABI: %rsp must be 16-byte aligned before call
|
||||
"call main\n" // main() returns the status code, we'll exit with it.
|
||||
"mov %eax, %edi\n" // retrieve exit code (32 bit)
|
||||
"mov $60, %eax\n" // NR_exit == 60
|
||||
"syscall\n" // really exit
|
||||
"hlt\n" // ensure it does not return
|
||||
"");
|
||||
|
||||
#endif // _NOLIBC_ARCH_X86_64_H
|
||||
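The x86_64 convention described above (number in rax, result in rax, rcx and r11 clobbered) can be tried from an ordinary hosted program. The sketch below mirrors `my_syscall0` as a function and uses it for getpid. The names `raw_syscall0` and `raw_getpid` are hypothetical, the inline assembly is only built on Linux/x86_64, and other targets fall back to the libc `getpid()` so that the example still compiles.

```c
#include <assert.h>
#include <unistd.h>

#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>	/* SYS_getpid */

/* Same constraints as my_syscall0 above, written as a function. */
static long raw_syscall0(long num)
{
	long ret;

	assert(num >= 0);	/* syscall numbers are non-negative */
	__asm__ volatile (
		"syscall\n"
		: "=a"(ret)			/* result comes back in rax */
		: "0"(num)			/* syscall number goes in rax */
		: "rcx", "r11", "memory", "cc"	/* clobbered by the instruction */
	);
	return ret;
}

static long raw_getpid(void)
{
	return raw_syscall0(SYS_getpid);
}
#else
/* Fallback for other targets: no inline assembly, just the libc wrapper. */
static long raw_getpid(void)
{
	return (long)getpid();
}
#endif
```

On Linux/x86_64 the value returned by `raw_getpid()` should match what the libc `getpid()` reports for the same process.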
32
tools/include/nolibc/arch.h
Normal file
@@ -0,0 +1,32 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
/* Below comes the architecture-specific code. For each architecture, we have
|
||||
* the syscall declarations and the _start code definition. This is the only
|
||||
* global part. On all architectures the kernel puts everything in the stack
|
||||
* before jumping to _start just above us, without any return address (_start
|
||||
* is not a function but an entry point). So at the stack pointer we find argc.
* Then argv[] begins, and ends at the first NULL. Then envp follows, and ends
* with a NULL as well. So envp=argv+argc+1.
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ARCH_H
|
||||
#define _NOLIBC_ARCH_H
|
||||
|
||||
#if defined(__x86_64__)
|
||||
#include "arch-x86_64.h"
|
||||
#elif defined(__i386__) || defined(__i486__) || defined(__i586__) || defined(__i686__)
|
||||
#include "arch-i386.h"
|
||||
#elif defined(__ARM_EABI__)
|
||||
#include "arch-arm.h"
|
||||
#elif defined(__aarch64__)
|
||||
#include "arch-aarch64.h"
|
||||
#elif defined(__mips__) && defined(_ABIO32)
|
||||
#include "arch-mips.h"
|
||||
#elif defined(__riscv)
|
||||
#include "arch-riscv.h"
|
||||
#endif
|
||||
|
||||
#endif /* _NOLIBC_ARCH_H */
|
||||
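The comment at the top of arch.h states that envp = argv + argc + 1. A minimal sketch of that lookup is shown below, using an array of pointers as a stand-in for the initial stack. The function name `envp_from_stack` and the idea of storing argc as a pointer-sized slot are assumptions made for the illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sp models the initial stack as pointer-sized slots:
 *   sp[0] = argc, sp[1..argc] = argv[], NULL, envp[]..., NULL
 */
static char **envp_from_stack(char **sp)
{
	long argc = (long)(intptr_t)sp[0];
	char **argv = sp + 1;

	assert(argv[argc] == NULL);	/* argv[] ends at the first NULL */
	return argv + argc + 1;		/* skip argv[] and its NULL */
}
```

A fake stack such as `{ (char *)(intptr_t)2, "prog", "arg", NULL, "HOME=/", NULL }` yields a pointer to the slot holding "HOME=/".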
99
tools/include/nolibc/ctype.h
Normal file
@@ -0,0 +1,99 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* ctype function definitions for NOLIBC
|
||||
* Copyright (C) 2017-2021 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_CTYPE_H
|
||||
#define _NOLIBC_CTYPE_H
|
||||
|
||||
#include "std.h"
|
||||
|
||||
/*
|
||||
* As much as possible, please keep functions alphabetically sorted.
|
||||
*/
|
||||
|
||||
static __attribute__((unused))
|
||||
int isascii(int c)
|
||||
{
|
||||
/* 0x00..0x7f */
|
||||
return (unsigned int)c <= 0x7f;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isblank(int c)
|
||||
{
|
||||
return c == '\t' || c == ' ';
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int iscntrl(int c)
|
||||
{
|
||||
/* 0x00..0x1f, 0x7f */
|
||||
return (unsigned int)c < 0x20 || c == 0x7f;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isdigit(int c)
|
||||
{
|
||||
return (unsigned int)(c - '0') < 10;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isgraph(int c)
|
||||
{
|
||||
/* 0x21..0x7e */
|
||||
return (unsigned int)(c - 0x21) < 0x5e;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int islower(int c)
|
||||
{
|
||||
return (unsigned int)(c - 'a') < 26;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isprint(int c)
|
||||
{
|
||||
/* 0x20..0x7e */
|
||||
return (unsigned int)(c - 0x20) < 0x5f;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isspace(int c)
|
||||
{
|
||||
/* \t is 0x9, \n is 0xA, \v is 0xB, \f is 0xC, \r is 0xD */
|
||||
return ((unsigned int)c == ' ') || (unsigned int)(c - 0x09) < 5;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isupper(int c)
|
||||
{
|
||||
return (unsigned int)(c - 'A') < 26;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isxdigit(int c)
|
||||
{
|
||||
return isdigit(c) || (unsigned int)(c - 'A') < 6 || (unsigned int)(c - 'a') < 6;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isalpha(int c)
|
||||
{
|
||||
return islower(c) || isupper(c);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int isalnum(int c)
|
||||
{
|
||||
return isalpha(c) || isdigit(c);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int ispunct(int c)
|
||||
{
|
||||
return isgraph(c) && !isalnum(c);
|
||||
}
|
||||
|
||||
#endif /* _NOLIBC_CTYPE_H */
|
||||
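Most of the ctype functions above test a range with a single unsigned comparison, for example `(unsigned int)(c - '0') < 10`. The sketch below generalises that trick. The helper name `in_range` is hypothetical, and the subtraction is done on unsigned values so that it also stays well defined for extreme inputs.

```c
#include <assert.h>

/* Nonzero when lo <= c <= hi, using one unsigned comparison. Values below
 * lo wrap around to a huge unsigned number and therefore fail the test.
 */
static int in_range(int c, int lo, int hi)
{
	assert(lo <= hi);
	return (unsigned int)c - (unsigned int)lo <=
	       (unsigned int)hi - (unsigned int)lo;
}
```

`in_range(c, '0', '9')` gives the same answer as `isdigit(c)` above, including for negative values such as EOF.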
27
tools/include/nolibc/errno.h
Normal file
@@ -0,0 +1,27 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* Minimal errno definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_ERRNO_H
|
||||
#define _NOLIBC_ERRNO_H
|
||||
|
||||
#include <asm/errno.h>
|
||||
|
||||
/* this way it will be removed if unused */
|
||||
static int errno;
|
||||
|
||||
#ifndef NOLIBC_IGNORE_ERRNO
|
||||
#define SET_ERRNO(v) do { errno = (v); } while (0)
|
||||
#else
|
||||
#define SET_ERRNO(v) do { } while (0)
|
||||
#endif
|
||||
|
||||
|
||||
/* errno codes all ensure that they will not conflict with a valid pointer
|
||||
* because they all correspond to the highest addressable memory page.
|
||||
*/
|
||||
#define MAX_ERRNO 4095
|
||||
|
||||
#endif /* _NOLIBC_ERRNO_H */
|
||||
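MAX_ERRNO above marks the range of return values that denote an error: anything in -4095..-1 is a negated errno code, everything else is a valid result. The following sketch shows how a raw return value could be converted to the usual "-1 and errno" form. It uses the hosted libc `errno` rather than nolibc's own, and the names `SKETCH_MAX_ERRNO` and `fold_errno` are hypothetical.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_MAX_ERRNO 4095	/* same value as MAX_ERRNO above */

/* Convert a raw kernel-style return value into the libc convention. */
static long fold_errno(long ret)
{
	/* -4095..-1 map to the top 4095 values of the unsigned range */
	if ((unsigned long)ret >= (unsigned long)(-SKETCH_MAX_ERRNO)) {
		assert(-ret >= 1 && -ret <= SKETCH_MAX_ERRNO);
		errno = (int)-ret;
		return -1;
	}
	return ret;	/* plain result, possibly a large address */
}
```

The single unsigned comparison is what lets a large address-like value such as -4096 pass through untouched while -1 is still treated as an error.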
File diff suppressed because it is too large
22
tools/include/nolibc/signal.h
Normal file
@@ -0,0 +1,22 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* signal function definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_SIGNAL_H
|
||||
#define _NOLIBC_SIGNAL_H
|
||||
|
||||
#include "std.h"
|
||||
#include "arch.h"
|
||||
#include "types.h"
|
||||
#include "sys.h"
|
||||
|
||||
/* This one is not marked static as it's needed by libgcc for divide by zero */
|
||||
__attribute__((weak,unused,section(".text.nolibc_raise")))
|
||||
int raise(int signal)
|
||||
{
|
||||
return sys_kill(sys_getpid(), signal);
|
||||
}
|
||||
|
||||
#endif /* _NOLIBC_SIGNAL_H */
|
||||
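The raise() above is simply kill() applied to the caller's own pid. A hosted equivalent is sketched below using the libc `kill()` and `getpid()` instead of nolibc's `sys_kill()` and `sys_getpid()`. The function name `raise_via_kill` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L	/* for kill() under strict ISO C modes */

#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* Send sig to the calling process, like the nolibc raise() above. */
static int raise_via_kill(int sig)
{
	assert(sig >= 0);	/* signal 0 only performs the existence check */
	return kill(getpid(), sig);
}
```

Calling it with signal 0 is a harmless way to try it: nothing is delivered and the call returns 0 because the calling process exists.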
49
tools/include/nolibc/std.h
Normal file
@@ -0,0 +1,49 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* Standard definitions and types for NOLIBC
|
||||
* Copyright (C) 2017-2021 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_STD_H
|
||||
#define _NOLIBC_STD_H
|
||||
|
||||
/* Declare a few quite common macros and types that usually are in stdlib.h,
|
||||
* stdint.h, ctype.h, unistd.h and a few other common locations. Please place
|
||||
* integer type definitions and generic macros here, but avoid OS-specific and
|
||||
* syscall-specific stuff, as this file is expected to be included very early.
|
||||
*/
|
||||
|
||||
/* note: may already be defined */
|
||||
#ifndef NULL
|
||||
#define NULL ((void *)0)
|
||||
#endif
|
||||
|
||||
/* stdint types */
|
||||
typedef unsigned char uint8_t;
|
||||
typedef signed char int8_t;
|
||||
typedef unsigned short uint16_t;
|
||||
typedef signed short int16_t;
|
||||
typedef unsigned int uint32_t;
|
||||
typedef signed int int32_t;
|
||||
typedef unsigned long long uint64_t;
|
||||
typedef signed long long int64_t;
|
||||
typedef unsigned long size_t;
|
||||
typedef signed long ssize_t;
|
||||
typedef unsigned long uintptr_t;
|
||||
typedef signed long intptr_t;
|
||||
typedef signed long ptrdiff_t;
|
||||
|
||||
/* those are commonly provided by sys/types.h */
|
||||
typedef unsigned int dev_t;
|
||||
typedef unsigned long ino_t;
|
||||
typedef unsigned int mode_t;
|
||||
typedef signed int pid_t;
|
||||
typedef unsigned int uid_t;
|
||||
typedef unsigned int gid_t;
|
||||
typedef unsigned long nlink_t;
|
||||
typedef signed long off_t;
|
||||
typedef signed long blksize_t;
|
||||
typedef signed long blkcnt_t;
|
||||
typedef signed long time_t;
|
||||
|
||||
#endif /* _NOLIBC_STD_H */
|
||||
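The typedefs in std.h rely on the usual Linux data models: `int` is 32 bits, `long long` is 64 bits, and `long` is as wide as a pointer. A small sketch for checking those assumptions on a given target follows. The `sketch_*` type names and both function names are hypothetical stand-ins, chosen so that they do not clash with a hosted libc.

```c
#include <assert.h>
#include <limits.h>

/* Local stand-ins for three of the typedefs above. */
typedef unsigned int       sketch_u32;	/* uint32_t in std.h */
typedef unsigned long long sketch_u64;	/* uint64_t in std.h */
typedef unsigned long      sketch_uptr;	/* uintptr_t in std.h */

/* Returns 1 when the width assumptions hold on the current target. */
static int std_widths_hold(void)
{
	return sizeof(sketch_u32) * CHAR_BIT == 32 &&
	       sizeof(sketch_u64) * CHAR_BIT == 64 &&
	       sizeof(sketch_uptr) == sizeof(void *);
}

/* Aborts early when built for a target with a different data model. */
static void require_std_widths(void)
{
	assert(std_widths_hold());
}
```

The check is expected to pass on ILP32 and LP64 Linux targets. A target where `long` is narrower than a pointer would fail the last comparison.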
306
tools/include/nolibc/stdio.h
Normal file
@@ -0,0 +1,306 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* minimal stdio function definitions for NOLIBC
|
||||
* Copyright (C) 2017-2021 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_STDIO_H
|
||||
#define _NOLIBC_STDIO_H
|
||||
|
||||
#include <stdarg.h>
|
||||
|
||||
#include "std.h"
|
||||
#include "arch.h"
|
||||
#include "errno.h"
|
||||
#include "types.h"
|
||||
#include "sys.h"
|
||||
#include "stdlib.h"
|
||||
#include "string.h"
|
||||
|
||||
#ifndef EOF
|
||||
#define EOF (-1)
|
||||
#endif
|
||||
|
||||
/* just define FILE as a non-empty type */
|
||||
typedef struct FILE {
|
||||
char dummy[1];
|
||||
} FILE;
|
||||
|
||||
/* We define the 3 common stdio files as constant invalid pointers that
|
||||
* are easily recognized.
|
||||
*/
|
||||
static __attribute__((unused)) FILE* const stdin = (FILE*)-3;
|
||||
static __attribute__((unused)) FILE* const stdout = (FILE*)-2;
|
||||
static __attribute__((unused)) FILE* const stderr = (FILE*)-1;
|
||||
|
||||
/* getc(), fgetc(), getchar() */
|
||||
|
||||
#define getc(stream) fgetc(stream)
|
||||
|
||||
static __attribute__((unused))
|
||||
int fgetc(FILE* stream)
|
||||
{
|
||||
unsigned char ch;
|
||||
int fd;
|
||||
|
||||
if (stream < stdin || stream > stderr)
|
||||
return EOF;
|
||||
|
||||
fd = 3 + (long)stream;
|
||||
|
||||
if (read(fd, &ch, 1) <= 0)
|
||||
return EOF;
|
||||
return ch;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int getchar(void)
|
||||
{
|
||||
return fgetc(stdin);
|
||||
}
|
||||
|
||||
|
||||
/* putc(), fputc(), putchar() */
|
||||
|
||||
#define putc(c, stream) fputc(c, stream)
|
||||
|
||||
static __attribute__((unused))
|
||||
int fputc(int c, FILE* stream)
|
||||
{
|
||||
unsigned char ch = c;
|
||||
int fd;
|
||||
|
||||
if (stream < stdin || stream > stderr)
|
||||
return EOF;
|
||||
|
||||
fd = 3 + (long)stream;
|
||||
|
||||
if (write(fd, &ch, 1) <= 0)
|
||||
return EOF;
|
||||
return ch;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int putchar(int c)
|
||||
{
|
||||
return fputc(c, stdout);
|
||||
}
|
||||
|
||||
|
||||
/* fwrite(), puts(), fputs(). Note that puts() emits '\n' but not fputs(). */
|
||||
|
||||
/* internal fwrite()-like function which only takes a size and returns 0 on
|
||||
* success or EOF on error. It automatically retries on short writes.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int _fwrite(const void *buf, size_t size, FILE *stream)
|
||||
{
|
||||
ssize_t ret;
|
||||
int fd;
|
||||
|
||||
if (stream < stdin || stream > stderr)
|
||||
return EOF;
|
||||
|
||||
fd = 3 + (long)stream;
|
||||
|
||||
while (size) {
|
||||
ret = write(fd, buf, size);
|
||||
if (ret <= 0)
|
||||
return EOF;
|
||||
size -= ret;
|
||||
buf += ret;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
size_t fwrite(const void *s, size_t size, size_t nmemb, FILE *stream)
|
||||
{
|
||||
size_t written;
|
||||
|
||||
for (written = 0; written < nmemb; written++) {
|
||||
if (_fwrite(s, size, stream) != 0)
|
||||
break;
|
||||
s += size;
|
||||
}
|
||||
return written;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int fputs(const char *s, FILE *stream)
|
||||
{
|
||||
return _fwrite(s, strlen(s), stream);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int puts(const char *s)
|
||||
{
|
||||
if (fputs(s, stdout) == EOF)
|
||||
return EOF;
|
||||
return putchar('\n');
|
||||
}
|
||||
|
||||
|
||||
/* fgets() */
|
||||
static __attribute__((unused))
|
||||
char *fgets(char *s, int size, FILE *stream)
|
||||
{
|
||||
int ofs;
|
||||
int c;
|
||||
|
||||
for (ofs = 0; ofs + 1 < size;) {
|
||||
c = fgetc(stream);
|
||||
if (c == EOF)
|
||||
break;
|
||||
s[ofs++] = c;
|
||||
if (c == '\n')
|
||||
break;
|
||||
}
|
||||
if (ofs < size)
|
||||
s[ofs] = 0;
|
||||
return ofs ? s : NULL;
|
||||
}
|
||||
|
||||
|
||||
/* minimal vfprintf(). It supports the following formats:
|
||||
* - %[l*]{d,u,c,x,p}
|
||||
* - %s
|
||||
* - unknown modifiers are ignored.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int vfprintf(FILE *stream, const char *fmt, va_list args)
|
||||
{
|
||||
char escape, lpref, c;
|
||||
unsigned long long v;
|
||||
unsigned int written;
|
||||
size_t len, ofs;
|
||||
char tmpbuf[21];
|
||||
const char *outstr;
|
||||
|
||||
written = ofs = escape = lpref = 0;
|
||||
while (1) {
|
||||
c = fmt[ofs++];
|
||||
|
||||
if (escape) {
|
||||
/* we're in an escape sequence, ofs == 1 */
|
||||
escape = 0;
|
||||
if (c == 'c' || c == 'd' || c == 'u' || c == 'x' || c == 'p') {
|
||||
char *out = tmpbuf;
|
||||
|
||||
if (c == 'p')
|
||||
v = va_arg(args, unsigned long);
|
||||
else if (lpref) {
|
||||
if (lpref > 1)
|
||||
v = va_arg(args, unsigned long long);
|
||||
else
|
||||
v = va_arg(args, unsigned long);
|
||||
} else
|
||||
v = va_arg(args, unsigned int);
|
||||
|
||||
if (c == 'd') {
|
||||
/* sign-extend the value */
|
||||
if (lpref == 0)
|
||||
v = (long long)(int)v;
|
||||
else if (lpref == 1)
|
||||
v = (long long)(long)v;
|
||||
}
|
||||
|
||||
switch (c) {
|
||||
case 'c':
|
||||
out[0] = v;
|
||||
out[1] = 0;
|
||||
break;
|
||||
case 'd':
|
||||
i64toa_r(v, out);
|
||||
break;
|
||||
case 'u':
|
||||
u64toa_r(v, out);
|
||||
break;
|
||||
case 'p':
|
||||
*(out++) = '0';
|
||||
*(out++) = 'x';
|
||||
/* fall through */
|
||||
default: /* 'x' and 'p' above */
|
||||
u64toh_r(v, out);
|
||||
break;
|
||||
}
|
||||
outstr = tmpbuf;
|
||||
}
|
||||
else if (c == 's') {
|
||||
outstr = va_arg(args, char *);
|
||||
if (!outstr)
|
||||
outstr="(null)";
|
||||
}
|
||||
else if (c == '%') {
|
||||
/* queue it verbatim */
|
||||
continue;
|
||||
}
|
||||
else {
|
||||
/* modifiers or final 0 */
|
||||
if (c == 'l') {
|
||||
/* long format prefix, maintain the escape */
|
||||
lpref++;
|
||||
}
|
||||
escape = 1;
|
||||
goto do_escape;
|
||||
}
|
||||
len = strlen(outstr);
|
||||
goto flush_str;
|
||||
}
|
||||
|
||||
/* not an escape sequence */
|
||||
if (c == 0 || c == '%') {
|
||||
/* flush pending data on escape or end */
|
||||
escape = 1;
|
||||
lpref = 0;
|
||||
outstr = fmt;
|
||||
len = ofs - 1;
|
||||
flush_str:
|
||||
if (_fwrite(outstr, len, stream) != 0)
|
||||
break;
|
||||
|
||||
written += len;
|
||||
do_escape:
|
||||
if (c == 0)
|
||||
break;
|
||||
fmt += ofs;
|
||||
ofs = 0;
|
||||
continue;
|
||||
}
|
||||
|
||||
/* literal char, just queue it */
|
||||
}
|
||||
return written;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int fprintf(FILE *stream, const char *fmt, ...)
|
||||
{
|
||||
va_list args;
|
||||
int ret;
|
||||
|
||||
va_start(args, fmt);
|
||||
ret = vfprintf(stream, fmt, args);
|
||||
va_end(args);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int printf(const char *fmt, ...)
|
||||
{
|
||||
va_list args;
|
||||
int ret;
|
||||
|
||||
va_start(args, fmt);
|
||||
ret = vfprintf(stdout, fmt, args);
|
||||
va_end(args);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
void perror(const char *msg)
|
||||
{
|
||||
fprintf(stderr, "%s%serrno=%d\n", (msg && *msg) ? msg : "", (msg && *msg) ? ": " : "", errno);
|
||||
}
|
||||
|
||||
#endif /* _NOLIBC_STDIO_H */
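The fgets() loop in this file stops after a newline, keeps one byte free for the terminator, and returns NULL only when nothing at all was read. The sketch below replays that loop over an in-memory string so the behaviour can be checked without a FILE. The names demo_getc() and demo_fgets(), and the use of -1 as the end marker, are assumptions made for illustration; they are not part of nolibc.

```c
#include <stddef.h>

/* Hypothetical stand-in for fgetc(): reads from an in-memory cursor and
 * returns -1 (playing the role of EOF) once the terminating zero is reached.
 */
static int demo_getc(const char **cursor)
{
	if (**cursor == '\0')
		return -1;
	return (unsigned char)*(*cursor)++;
}

/* Same control flow as the fgets() above: at most size-1 characters are
 * stored, the newline is kept, and the buffer is always zero-terminated.
 */
static char *demo_fgets(char *s, int size, const char **cursor)
{
	int ofs;
	int c;

	for (ofs = 0; ofs + 1 < size;) {
		c = demo_getc(cursor);
		if (c == -1)
			break;
		s[ofs++] = c;
		if (c == '\n')
			break;
	}
	if (ofs < size)
		s[ofs] = 0;
	return ofs ? s : NULL;
}
```

Calling demo_fgets() repeatedly on the same cursor walks through the input one line at a time, and a NULL return signals that the input is exhausted.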
|
||||
423
tools/include/nolibc/stdlib.h
Normal file
@@ -0,0 +1,423 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* stdlib function definitions for NOLIBC
|
||||
* Copyright (C) 2017-2021 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_STDLIB_H
|
||||
#define _NOLIBC_STDLIB_H
|
||||
|
||||
#include "std.h"
|
||||
#include "arch.h"
|
||||
#include "types.h"
|
||||
#include "sys.h"
|
||||
#include "string.h"
|
||||
|
||||
struct nolibc_heap {
|
||||
size_t len;
|
||||
char user_p[] __attribute__((__aligned__));
|
||||
};
|
||||
|
||||
/* Buffer used to store int-to-ASCII conversions. Will only be implemented if
|
||||
* any of the related functions is implemented. The area is large enough to
|
||||
* store "18446744073709551615" or "-9223372036854775808" and the final zero.
|
||||
*/
|
||||
static __attribute__((unused)) char itoa_buffer[21];
|
||||
|
||||
/*
|
||||
* As much as possible, please keep functions alphabetically sorted.
|
||||
*/
|
||||
|
||||
/* must be exported, as it's used by libgcc for various divide functions */
|
||||
__attribute__((weak,unused,noreturn,section(".text.nolibc_abort")))
|
||||
void abort(void)
|
||||
{
|
||||
sys_kill(sys_getpid(), SIGABRT);
|
||||
for (;;);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
long atol(const char *s)
|
||||
{
|
||||
unsigned long ret = 0;
|
||||
unsigned long d;
|
||||
int neg = 0;
|
||||
|
||||
if (*s == '-') {
|
||||
neg = 1;
|
||||
s++;
|
||||
}
|
||||
|
||||
while (1) {
|
||||
d = (*s++) - '0';
|
||||
if (d > 9)
|
||||
break;
|
||||
ret *= 10;
|
||||
ret += d;
|
||||
}
|
||||
|
||||
return neg ? -ret : ret;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int atoi(const char *s)
|
||||
{
|
||||
return atol(s);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
void free(void *ptr)
|
||||
{
|
||||
struct nolibc_heap *heap;
|
||||
|
||||
if (!ptr)
|
||||
return;
|
||||
|
||||
heap = container_of(ptr, struct nolibc_heap, user_p);
|
||||
munmap(heap, heap->len);
|
||||
}
|
||||
|
||||
/* getenv() tries to find the environment variable named <name> in the
|
||||
* environment array pointed to by global variable "environ" which must be
|
||||
* declared as a char **, and must be terminated by a NULL (it is recommended
|
||||
* to set this variable to the "envp" argument of main()). If the requested
|
||||
* environment variable exists, its value is returned; otherwise NULL is
|
||||
* returned. getenv() is forcefully inlined so that the reference to "environ"
|
||||
* will be dropped if unused, even at -O0.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
char *_getenv(const char *name, char **environ)
|
||||
{
|
||||
int idx, i;
|
||||
|
||||
if (environ) {
|
||||
for (idx = 0; environ[idx]; idx++) {
|
||||
for (i = 0; name[i] && name[i] == environ[idx][i];)
|
||||
i++;
|
||||
if (!name[i] && environ[idx][i] == '=')
|
||||
return &environ[idx][i+1];
|
||||
}
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static inline __attribute__((unused,always_inline))
|
||||
char *getenv(const char *name)
|
||||
{
|
||||
extern char **environ;
|
||||
return _getenv(name, environ);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
void *malloc(size_t len)
|
||||
{
|
||||
struct nolibc_heap *heap;
|
||||
|
||||
/* Always allocate memory with size multiple of 4096. */
|
||||
len = sizeof(*heap) + len;
|
||||
len = (len + 4095UL) & -4096UL;
|
||||
heap = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE,
|
||||
-1, 0);
|
||||
if (__builtin_expect(heap == MAP_FAILED, 0))
|
||||
return NULL;
|
||||
|
||||
heap->len = len;
|
||||
return heap->user_p;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
void *calloc(size_t size, size_t nmemb)
|
||||
{
|
||||
void *orig;
|
||||
size_t res = 0;
|
||||
|
||||
if (__builtin_expect(__builtin_mul_overflow(nmemb, size, &res), 0)) {
|
||||
SET_ERRNO(ENOMEM);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
/*
|
||||
* No need to zero the heap, the MAP_ANONYMOUS in malloc()
|
||||
* already does it.
|
||||
*/
|
||||
return malloc(res);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
void *realloc(void *old_ptr, size_t new_size)
|
||||
{
|
||||
struct nolibc_heap *heap;
|
||||
size_t user_p_len;
|
||||
void *ret;
|
||||
|
||||
if (!old_ptr)
|
||||
return malloc(new_size);
|
||||
|
||||
heap = container_of(old_ptr, struct nolibc_heap, user_p);
|
||||
user_p_len = heap->len - sizeof(*heap);
|
||||
/*
|
||||
* Don't realloc() if @user_p_len >= @new_size, this block of
|
||||
* memory is still enough to handle the @new_size. Just return
|
||||
* the same pointer.
|
||||
*/
|
||||
if (user_p_len >= new_size)
|
||||
return old_ptr;
|
||||
|
||||
ret = malloc(new_size);
|
||||
if (__builtin_expect(!ret, 0))
|
||||
return NULL;
|
||||
|
||||
memcpy(ret, heap->user_p, user_p_len);
|
||||
munmap(heap, heap->len);
|
||||
return ret;
|
||||
}
|
||||
|
||||
/* Converts the unsigned long integer <in> to its hex representation into
|
||||
* buffer <buffer>, which must be long enough to store the number and the
|
||||
* trailing zero (17 bytes for "ffffffffffffffff" or 9 for "ffffffff"). The
|
||||
* buffer is filled from the first byte, and the number of characters emitted
|
||||
* (not counting the trailing zero) is returned. The function is constructed
|
||||
* in a way to optimize the code size and avoid any divide that could add a
|
||||
* dependency on large external functions.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int utoh_r(unsigned long in, char *buffer)
|
||||
{
|
||||
signed char pos = (~0UL > 0xfffffffful) ? 60 : 28;
|
||||
int digits = 0;
|
||||
int dig;
|
||||
|
||||
do {
|
||||
dig = in >> pos;
|
||||
in -= (uint64_t)dig << pos;
|
||||
pos -= 4;
|
||||
if (dig || digits || pos < 0) {
|
||||
if (dig > 9)
|
||||
dig += 'a' - '0' - 10;
|
||||
buffer[digits++] = '0' + dig;
|
||||
}
|
||||
} while (pos >= 0);
|
||||
|
||||
buffer[digits] = 0;
|
||||
return digits;
|
||||
}
|
||||
|
||||
/* converts unsigned long <in> to a hex string using the static itoa_buffer
|
||||
* and returns the pointer to that string.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *utoh(unsigned long in)
|
||||
{
|
||||
utoh_r(in, itoa_buffer);
|
||||
return itoa_buffer;
|
||||
}
|
||||
|
||||
/* Converts the unsigned long integer <in> to its string representation into
|
||||
* buffer <buffer>, which must be long enough to store the number and the
|
||||
* trailing zero (21 bytes for 18446744073709551615 in 64-bit, 11 for
|
||||
* 4294967295 in 32-bit). The buffer is filled from the first byte, and the
|
||||
* number of characters emitted (not counting the trailing zero) is returned.
|
||||
* The function is constructed in a way to optimize the code size and avoid
|
||||
* any divide that could add a dependency on large external functions.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int utoa_r(unsigned long in, char *buffer)
|
||||
{
|
||||
unsigned long lim;
|
||||
int digits = 0;
|
||||
int pos = (~0UL > 0xfffffffful) ? 19 : 9;
|
||||
int dig;
|
||||
|
||||
do {
|
||||
for (dig = 0, lim = 1; dig < pos; dig++)
|
||||
lim *= 10;
|
||||
|
||||
if (digits || in >= lim || !pos) {
|
||||
for (dig = 0; in >= lim; dig++)
|
||||
in -= lim;
|
||||
buffer[digits++] = '0' + dig;
|
||||
}
|
||||
} while (pos--);
|
||||
|
||||
buffer[digits] = 0;
|
||||
return digits;
|
||||
}
|
||||
|
||||
/* Converts the signed long integer <in> to its string representation into
|
||||
* buffer <buffer>, which must be long enough to store the number and the
|
||||
* trailing zero (21 bytes for -9223372036854775808 in 64-bit, 12 for
|
||||
* -2147483648 in 32-bit). The buffer is filled from the first byte, and the
|
||||
* number of characters emitted (not counting the trailing zero) is returned.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int itoa_r(long in, char *buffer)
|
||||
{
|
||||
char *ptr = buffer;
|
||||
int len = 0;
|
||||
|
||||
if (in < 0) {
|
||||
in = -in;
|
||||
*(ptr++) = '-';
|
||||
len++;
|
||||
}
|
||||
len += utoa_r(in, ptr);
|
||||
return len;
|
||||
}
|
||||
|
||||
/* for historical compatibility, same as above but returns the pointer to the
|
||||
* buffer.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *ltoa_r(long in, char *buffer)
|
||||
{
|
||||
itoa_r(in, buffer);
|
||||
return buffer;
|
||||
}
|
||||
|
||||
/* converts long integer <in> to a string using the static itoa_buffer and
|
||||
* returns the pointer to that string.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *itoa(long in)
|
||||
{
|
||||
itoa_r(in, itoa_buffer);
|
||||
return itoa_buffer;
|
||||
}
|
||||
|
||||
/* converts long integer <in> to a string using the static itoa_buffer and
|
||||
* returns the pointer to that string. Same as above, for compatibility.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *ltoa(long in)
|
||||
{
|
||||
itoa_r(in, itoa_buffer);
|
||||
return itoa_buffer;
|
||||
}
|
||||
|
||||
/* converts unsigned long integer <in> to a string using the static itoa_buffer
|
||||
* and returns the pointer to that string.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *utoa(unsigned long in)
|
||||
{
|
||||
utoa_r(in, itoa_buffer);
|
||||
return itoa_buffer;
|
||||
}
|
||||
|
||||
/* Converts the unsigned 64-bit integer <in> to its hex representation into
|
||||
* buffer <buffer>, which must be long enough to store the number and the
|
||||
* trailing zero (17 bytes for "ffffffffffffffff"). The buffer is filled from
|
||||
* the first byte, and the number of characters emitted (not counting the
|
||||
* trailing zero) is returned. The function is constructed in a way to optimize
|
||||
* the code size and avoid any divide that could add a dependency on large
|
||||
* external functions.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int u64toh_r(uint64_t in, char *buffer)
|
||||
{
|
||||
signed char pos = 60;
|
||||
int digits = 0;
|
||||
int dig;
|
||||
|
||||
do {
|
||||
if (sizeof(long) >= 8) {
|
||||
dig = (in >> pos) & 0xF;
|
||||
} else {
|
||||
/* 32-bit platforms: avoid a 64-bit shift */
|
||||
uint32_t d = (pos >= 32) ? (in >> 32) : in;
|
||||
dig = (d >> (pos & 31)) & 0xF;
|
||||
}
|
||||
if (dig > 9)
|
||||
dig += 'a' - '0' - 10;
|
||||
pos -= 4;
|
||||
if (dig || digits || pos < 0)
|
||||
buffer[digits++] = '0' + dig;
|
||||
} while (pos >= 0);
|
||||
|
||||
buffer[digits] = 0;
|
||||
return digits;
|
||||
}
|
||||
|
||||
/* converts uint64_t <in> to a hex string using the static itoa_buffer and
|
||||
* returns the pointer to that string.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *u64toh(uint64_t in)
|
||||
{
|
||||
u64toh_r(in, itoa_buffer);
|
||||
return itoa_buffer;
|
||||
}
|
||||
|
||||
/* Converts the unsigned 64-bit integer <in> to its string representation into
|
||||
* buffer <buffer>, which must be long enough to store the number and the
|
||||
* trailing zero (21 bytes for 18446744073709551615). The buffer is filled from
|
||||
* the first byte, and the number of characters emitted (not counting the
|
||||
* trailing zero) is returned. The function is constructed in a way to optimize
|
||||
* the code size and avoid any divide that could add a dependency on large
|
||||
* external functions.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int u64toa_r(uint64_t in, char *buffer)
|
||||
{
|
||||
unsigned long long lim;
|
||||
int digits = 0;
|
||||
int pos = 19; /* start with the highest possible digit */
|
||||
int dig;
|
||||
|
||||
do {
|
||||
for (dig = 0, lim = 1; dig < pos; dig++)
|
||||
lim *= 10;
|
||||
|
||||
if (digits || in >= lim || !pos) {
|
||||
for (dig = 0; in >= lim; dig++)
|
||||
in -= lim;
|
||||
buffer[digits++] = '0' + dig;
|
||||
}
|
||||
} while (pos--);
|
||||
|
||||
buffer[digits] = 0;
|
||||
return digits;
|
||||
}
|
||||
|
||||
/* Converts the signed 64-bit integer <in> to its string representation into
|
||||
* buffer <buffer>, which must be long enough to store the number and the
|
||||
* trailing zero (21 bytes for -9223372036854775808). The buffer is filled from
|
||||
* the first byte, and the number of characters emitted (not counting the
|
||||
* trailing zero) is returned.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
int i64toa_r(int64_t in, char *buffer)
|
||||
{
|
||||
char *ptr = buffer;
|
||||
int len = 0;
|
||||
|
||||
if (in < 0) {
|
||||
in = -in;
|
||||
*(ptr++) = '-';
|
||||
len++;
|
||||
}
|
||||
len += u64toa_r(in, ptr);
|
||||
return len;
|
||||
}
|
||||
|
||||
/* converts int64_t <in> to a string using the static itoa_buffer and returns
|
||||
* the pointer to that string.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *i64toa(int64_t in)
|
||||
{
|
||||
i64toa_r(in, itoa_buffer);
|
||||
return itoa_buffer;
|
||||
}
|
||||
|
||||
/* converts uint64_t <in> to a string using the static itoa_buffer and returns
|
||||
* the pointer to that string.
|
||||
*/
|
||||
static inline __attribute__((unused))
|
||||
char *u64toa(uint64_t in)
|
||||
{
|
||||
u64toa_r(in, itoa_buffer);
|
||||
return itoa_buffer;
|
||||
}
|
||||
|
||||
#endif /* _NOLIBC_STDLIB_H */
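The decimal conversions in this file avoid any division: for each digit position they rebuild the matching power of ten by multiplication, then count how many times it can be subtracted from the value. The following is a minimal standalone sketch of that technique, closely following u64toa_r(). The name demo_u64toa() is hypothetical and exists only so the loop can be compiled and tested outside nolibc.

```c
#include <stdint.h>

/* Divide-free conversion of a 64-bit unsigned value to decimal.
 * <buffer> must hold at least 21 bytes (20 digits plus the final zero).
 * Returns the number of digits written, not counting the final zero.
 */
static int demo_u64toa(uint64_t in, char *buffer)
{
	uint64_t lim;
	int digits = 0;
	int pos = 19;	/* highest possible digit position: 10^19 */
	int dig;

	do {
		/* lim = 10^pos, computed with multiplications only */
		for (dig = 0, lim = 1; dig < pos; dig++)
			lim *= 10;

		/* skip leading zeroes, but always emit the last digit */
		if (digits || in >= lim || !pos) {
			/* the digit is the number of subtractions that fit */
			for (dig = 0; in >= lim; dig++)
				in -= lim;
			buffer[digits++] = '0' + dig;
		}
	} while (pos--);

	buffer[digits] = 0;
	return digits;
}
```

The cost is a few hundred trivial operations per call at worst, which is the trade made here for not pulling in a 64-bit divide helper on 32-bit targets.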
|
||||
285
tools/include/nolibc/string.h
Normal file
@@ -0,0 +1,285 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* string function definitions for NOLIBC
|
||||
* Copyright (C) 2017-2021 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_STRING_H
|
||||
#define _NOLIBC_STRING_H
|
||||
|
||||
#include "std.h"
|
||||
|
||||
static void *malloc(size_t len);
|
||||
|
||||
/*
|
||||
* As much as possible, please keep functions alphabetically sorted.
|
||||
*/
|
||||
|
||||
static __attribute__((unused))
|
||||
int memcmp(const void *s1, const void *s2, size_t n)
|
||||
{
|
||||
size_t ofs = 0;
|
||||
char c1 = 0;
|
||||
|
||||
while (ofs < n && !(c1 = ((char *)s1)[ofs] - ((char *)s2)[ofs])) {
|
||||
ofs++;
|
||||
}
|
||||
return c1;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
void *_nolibc_memcpy_up(void *dst, const void *src, size_t len)
|
||||
{
|
||||
size_t pos = 0;
|
||||
|
||||
while (pos < len) {
|
||||
((char *)dst)[pos] = ((const char *)src)[pos];
|
||||
pos++;
|
||||
}
|
||||
return dst;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
void *_nolibc_memcpy_down(void *dst, const void *src, size_t len)
|
||||
{
|
||||
while (len) {
|
||||
len--;
|
||||
((char *)dst)[len] = ((const char *)src)[len];
|
||||
}
|
||||
return dst;
|
||||
}
|
||||
|
||||
/* might be ignored by the compiler without -ffreestanding, then found as
|
||||
* missing.
|
||||
*/
|
||||
__attribute__((weak,unused,section(".text.nolibc_memmove")))
|
||||
void *memmove(void *dst, const void *src, size_t len)
|
||||
{
|
||||
size_t dir, pos;
|
||||
|
||||
pos = len;
|
||||
dir = -1;
|
||||
|
||||
if (dst < src) {
|
||||
pos = -1;
|
||||
dir = 1;
|
||||
}
|
||||
|
||||
while (len) {
|
||||
pos += dir;
|
||||
((char *)dst)[pos] = ((const char *)src)[pos];
|
||||
len--;
|
||||
}
|
||||
return dst;
|
||||
}
|
||||
|
||||
/* must be exported, as it's used by libgcc on ARM */
|
||||
__attribute__((weak,unused,section(".text.nolibc_memcpy")))
|
||||
void *memcpy(void *dst, const void *src, size_t len)
|
||||
{
|
||||
return _nolibc_memcpy_up(dst, src, len);
|
||||
}
|
||||
|
||||
/* might be ignored by the compiler without -ffreestanding, then found as
|
||||
* missing.
|
||||
*/
|
||||
__attribute__((weak,unused,section(".text.nolibc_memset")))
|
||||
void *memset(void *dst, int b, size_t len)
|
||||
{
|
||||
char *p = dst;
|
||||
|
||||
while (len--)
|
||||
*(p++) = b;
|
||||
return dst;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
char *strchr(const char *s, int c)
|
||||
{
|
||||
while (*s) {
|
||||
if (*s == (char)c)
|
||||
return (char *)s;
|
||||
s++;
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int strcmp(const char *a, const char *b)
|
||||
{
|
||||
unsigned int c;
|
||||
int diff;
|
||||
|
||||
while (!(diff = (unsigned char)*a++ - (c = (unsigned char)*b++)) && c)
|
||||
;
|
||||
return diff;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
char *strcpy(char *dst, const char *src)
|
||||
{
|
||||
char *ret = dst;
|
||||
|
||||
while ((*dst++ = *src++));
|
||||
return ret;
|
||||
}
|
||||
|
||||
/* this function is only used with arguments that are not constants or when
|
||||
* it's not known because optimizations are disabled.
|
||||
*/
|
||||
static __attribute__((unused))
|
||||
size_t nolibc_strlen(const char *str)
|
||||
{
|
||||
size_t len;
|
||||
|
||||
for (len = 0; str[len]; len++);
|
||||
return len;
|
||||
}
|
||||
|
||||
/* do not trust __builtin_constant_p() at -O0, as clang will emit a test and
|
||||
* the two branches, then will rely on an external definition of strlen().
|
||||
*/
|
||||
#if defined(__OPTIMIZE__)
|
||||
#define strlen(str) ({ \
|
||||
__builtin_constant_p((str)) ? \
|
||||
__builtin_strlen((str)) : \
|
||||
nolibc_strlen((str)); \
|
||||
})
|
||||
#else
|
||||
#define strlen(str) nolibc_strlen((str))
|
||||
#endif
|
||||
|
||||
static __attribute__((unused))
|
||||
size_t strnlen(const char *str, size_t maxlen)
|
||||
{
|
||||
size_t len;
|
||||
|
||||
for (len = 0; (len < maxlen) && str[len]; len++);
|
||||
return len;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
char *strdup(const char *str)
|
||||
{
|
||||
size_t len;
|
||||
char *ret;
|
||||
|
||||
len = strlen(str);
|
||||
ret = malloc(len + 1);
|
||||
if (__builtin_expect(ret != NULL, 1))
|
||||
memcpy(ret, str, len + 1);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
char *strndup(const char *str, size_t maxlen)
|
||||
{
|
||||
size_t len;
|
||||
char *ret;
|
||||
|
||||
len = strnlen(str, maxlen);
|
||||
ret = malloc(len + 1);
|
||||
if (__builtin_expect(ret != NULL, 1)) {
|
||||
memcpy(ret, str, len);
|
||||
ret[len] = '\0';
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
size_t strlcat(char *dst, const char *src, size_t size)
|
||||
{
|
||||
size_t len;
|
||||
char c;
|
||||
|
||||
for (len = 0; dst[len]; len++)
|
||||
;
|
||||
|
||||
for (;;) {
|
||||
c = *src;
|
||||
if (len < size)
|
||||
dst[len] = c;
|
||||
if (!c)
|
||||
break;
|
||||
len++;
|
||||
src++;
|
||||
}
|
||||
|
||||
return len;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
size_t strlcpy(char *dst, const char *src, size_t size)
|
||||
{
|
||||
size_t len;
|
||||
char c;
|
||||
|
||||
for (len = 0;;) {
|
||||
c = src[len];
|
||||
if (len < size)
|
||||
dst[len] = c;
|
||||
if (!c)
|
||||
break;
|
||||
len++;
|
||||
}
|
||||
return len;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
char *strncat(char *dst, const char *src, size_t size)
|
||||
{
|
||||
char *orig = dst;
|
||||
|
||||
while (*dst)
|
||||
dst++;
|
||||
|
||||
while (size && (*dst = *src)) {
|
||||
src++;
|
||||
dst++;
|
||||
size--;
|
||||
}
|
||||
|
||||
*dst = 0;
|
||||
return orig;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int strncmp(const char *a, const char *b, size_t size)
|
||||
{
|
||||
unsigned int c;
|
||||
int diff = 0;
|
||||
|
||||
while (size-- &&
|
||||
!(diff = (unsigned char)*a++ - (c = (unsigned char)*b++)) && c)
|
||||
;
|
||||
|
||||
return diff;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
char *strncpy(char *dst, const char *src, size_t size)
|
||||
{
|
||||
size_t len;
|
||||
|
||||
for (len = 0; len < size; len++)
|
||||
if ((dst[len] = *src))
|
||||
src++;
|
||||
return dst;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
char *strrchr(const char *s, int c)
|
||||
{
|
||||
const char *ret = NULL;
|
||||
|
||||
while (*s) {
|
||||
if (*s == (char)c)
|
||||
ret = s;
|
||||
s++;
|
||||
}
|
||||
return (char *)ret;
|
||||
}
|
||||
|
||||
#endif /* _NOLIBC_STRING_H */
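memmove() in this file picks the copy direction from the relative position of the two pointers, so that overlapping regions are never overwritten before they are read. Below is a small sketch of the same idea written with two explicit loops instead of the shared pos/dir counter. The name demo_memmove() is an assumption for illustration, and the pointer comparison is only meaningful when both pointers refer to the same object.

```c
#include <stddef.h>

/* Overlap-safe copy: walk upwards when the destination is below the
 * source, downwards otherwise.
 */
static void *demo_memmove(void *dst, const void *src, size_t len)
{
	char *d = dst;
	const char *s = src;
	size_t i;

	if (d < s) {
		/* destination first: copy from the lowest address up */
		for (i = 0; i < len; i++)
			d[i] = s[i];
	} else {
		/* source first: copy from the highest address down */
		for (i = len; i > 0; i--)
			d[i - 1] = s[i - 1];
	}
	return dst;
}
```

Shifting "abcdef" right by two with demo_memmove(buf + 2, buf, 4) takes the downward path and yields "ababcd"; shifting left with demo_memmove(buf, buf + 2, 4) takes the upward path and yields "cdefef".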
|
||||
1247
tools/include/nolibc/sys.h
Normal file
File diff suppressed because it is too large
28
tools/include/nolibc/time.h
Normal file
@@ -0,0 +1,28 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* time function definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_TIME_H
|
||||
#define _NOLIBC_TIME_H
|
||||
|
||||
#include "std.h"
|
||||
#include "arch.h"
|
||||
#include "types.h"
|
||||
#include "sys.h"
|
||||
|
||||
static __attribute__((unused))
|
||||
time_t time(time_t *tptr)
|
||||
{
|
||||
struct timeval tv;
|
||||
|
||||
/* note, cannot fail here */
|
||||
sys_gettimeofday(&tv, NULL);
|
||||
|
||||
if (tptr)
|
||||
*tptr = tv.tv_sec;
|
||||
return tv.tv_sec;
|
||||
}
|
||||
|
||||
#endif /* _NOLIBC_TIME_H */
|
||||
205
tools/include/nolibc/types.h
Normal file
@@ -0,0 +1,205 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* Special types used by various syscalls for NOLIBC
|
||||
* Copyright (C) 2017-2021 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_TYPES_H
|
||||
#define _NOLIBC_TYPES_H
|
||||
|
||||
#include "std.h"
|
||||
#include <linux/time.h>
|
||||
|
||||
|
||||
/* Only the generic macros and types may be defined here. The arch-specific
|
||||
* ones such as the O_RDONLY and related macros used by fcntl() and open(), or
|
||||
* the layout of sys_stat_struct must not be defined here.
|
||||
*/
|
||||
|
||||
/* stat flags (WARNING, octal here) */
|
||||
#define S_IFDIR 0040000
|
||||
#define S_IFCHR 0020000
|
||||
#define S_IFBLK 0060000
|
||||
#define S_IFREG 0100000
|
||||
#define S_IFIFO 0010000
|
||||
#define S_IFLNK 0120000
|
||||
#define S_IFSOCK 0140000
|
||||
#define S_IFMT 0170000
|
||||
|
||||
#define S_ISDIR(mode) (((mode) & S_IFMT) == S_IFDIR)
|
||||
#define S_ISCHR(mode) (((mode) & S_IFMT) == S_IFCHR)
|
||||
#define S_ISBLK(mode) (((mode) & S_IFMT) == S_IFBLK)
|
||||
#define S_ISREG(mode) (((mode) & S_IFMT) == S_IFREG)
|
||||
#define S_ISFIFO(mode) (((mode) & S_IFMT) == S_IFIFO)
|
||||
#define S_ISLNK(mode) (((mode) & S_IFMT) == S_IFLNK)
|
||||
#define S_ISSOCK(mode) (((mode) & S_IFMT) == S_IFSOCK)
|
||||
|
||||
/* dirent types */
|
||||
#define DT_UNKNOWN 0x0
|
||||
#define DT_FIFO 0x1
|
||||
#define DT_CHR 0x2
|
||||
#define DT_DIR 0x4
|
||||
#define DT_BLK 0x6
|
||||
#define DT_REG 0x8
|
||||
#define DT_LNK 0xa
|
||||
#define DT_SOCK 0xc
|
||||
|
||||
/* commonly an fd_set represents 256 FDs */
|
||||
#ifndef FD_SETSIZE
|
||||
#define FD_SETSIZE 256
|
||||
#endif
|
||||
|
||||
/* PATH_MAX and MAXPATHLEN are often used and found with plenty of different
|
||||
* values.
|
||||
*/
|
||||
#ifndef PATH_MAX
|
||||
#define PATH_MAX 4096
|
||||
#endif
|
||||
|
||||
#ifndef MAXPATHLEN
|
||||
#define MAXPATHLEN (PATH_MAX)
|
||||
#endif
|
||||
|
||||
/* Special FD used by all the *at functions */
|
||||
#ifndef AT_FDCWD
|
||||
#define AT_FDCWD (-100)
|
||||
#endif
|
||||
|
||||
/* whence values for lseek() */
|
||||
#define SEEK_SET 0
|
||||
#define SEEK_CUR 1
|
||||
#define SEEK_END 2
|
||||
|
||||
/* cmd for reboot() */
|
||||
#define LINUX_REBOOT_MAGIC1 0xfee1dead
|
||||
#define LINUX_REBOOT_MAGIC2 0x28121969
|
||||
#define LINUX_REBOOT_CMD_HALT 0xcdef0123
|
||||
#define LINUX_REBOOT_CMD_POWER_OFF 0x4321fedc
|
||||
#define LINUX_REBOOT_CMD_RESTART 0x01234567
|
||||
#define LINUX_REBOOT_CMD_SW_SUSPEND 0xd000fce2
|
||||
|
||||
/* Macros used on waitpid()'s return status */
|
||||
#define WEXITSTATUS(status) (((status) & 0xff00) >> 8)
|
||||
#define WIFEXITED(status) (((status) & 0x7f) == 0)
|
||||
|
||||
/* waitpid() flags */
|
||||
#define WNOHANG 1
|
||||
|
||||
/* standard exit() codes */
|
||||
#define EXIT_SUCCESS 0
|
||||
#define EXIT_FAILURE 1
|
||||
|
||||
/* for select() */
|
||||
typedef struct {
|
||||
uint32_t fd32[(FD_SETSIZE + 31) / 32];
|
||||
} fd_set;
|
||||
|
||||
#define FD_CLR(fd, set) do { \
|
||||
fd_set *__set = (set); \
|
||||
int __fd = (fd); \
|
||||
if (__fd >= 0) \
|
||||
__set->fd32[__fd / 32] &= ~(1U << (__fd & 31)); \
|
||||
} while (0)
|
||||
|
||||
#define FD_SET(fd, set) do { \
|
||||
fd_set *__set = (set); \
|
||||
int __fd = (fd); \
|
||||
if (__fd >= 0) \
|
||||
__set->fd32[__fd / 32] |= 1U << (__fd & 31); \
|
||||
} while (0)
|
||||
|
||||
#define FD_ISSET(fd, set) ({ \
|
||||
fd_set *__set = (set); \
|
||||
int __fd = (fd); \
|
||||
int __r = 0; \
|
||||
if (__fd >= 0) \
|
||||
__r = !!(__set->fd32[__fd / 32] & 1U << (__fd & 31)); \
|
||||
__r; \
|
||||
})
|
||||
|
||||
#define FD_ZERO(set) do { \
|
||||
fd_set *__set = (set); \
|
||||
int __idx; \
|
||||
for (__idx = 0; __idx < (FD_SETSIZE+31) / 32; __idx ++) \
|
||||
__set->fd32[__idx] = 0; \
|
||||
} while (0)
|
||||
|
||||
/* for poll() */
|
||||
#define POLLIN 0x0001
|
||||
#define POLLPRI 0x0002
|
||||
#define POLLOUT 0x0004
|
||||
#define POLLERR 0x0008
|
||||
#define POLLHUP 0x0010
|
||||
#define POLLNVAL 0x0020
|
||||
|
||||
struct pollfd {
|
||||
int fd;
|
||||
short int events;
|
||||
short int revents;
|
||||
};
|
||||
|
||||
/* for getdents64() */
|
||||
struct linux_dirent64 {
|
||||
uint64_t d_ino;
|
||||
int64_t d_off;
|
||||
unsigned short d_reclen;
|
||||
unsigned char d_type;
|
||||
char d_name[];
|
||||
};
|
||||
|
||||
/* needed by wait4() */
|
||||
struct rusage {
|
||||
struct timeval ru_utime;
|
||||
struct timeval ru_stime;
|
||||
long ru_maxrss;
|
||||
long ru_ixrss;
|
||||
long ru_idrss;
|
||||
long ru_isrss;
|
||||
long ru_minflt;
|
||||
long ru_majflt;
|
||||
long ru_nswap;
|
||||
long ru_inblock;
|
||||
long ru_oublock;
|
||||
long ru_msgsnd;
|
||||
long ru_msgrcv;
|
||||
long ru_nsignals;
|
||||
long ru_nvcsw;
|
||||
long ru_nivcsw;
|
||||
};
|
||||
|
||||
/* The format of the struct as returned by the libc to the application, which
|
||||
* significantly differs from the format returned by the stat() syscall flavours.
|
||||
*/
|
||||
struct stat {
|
||||
dev_t st_dev; /* ID of device containing file */
|
||||
ino_t st_ino; /* inode number */
|
||||
mode_t st_mode; /* protection */
|
||||
nlink_t st_nlink; /* number of hard links */
|
||||
uid_t st_uid; /* user ID of owner */
|
||||
gid_t st_gid; /* group ID of owner */
|
||||
dev_t st_rdev; /* device ID (if special file) */
|
||||
off_t st_size; /* total size, in bytes */
|
||||
blksize_t st_blksize; /* blocksize for file system I/O */
|
||||
blkcnt_t st_blocks; /* number of 512B blocks allocated */
|
||||
time_t st_atime; /* time of last access */
|
||||
time_t st_mtime; /* time of last modification */
|
||||
time_t st_ctime; /* time of last status change */
|
||||
};
|
||||
|
||||
/* WARNING, it only deals with the 4096 first majors and 256 first minors */
|
||||
#define makedev(major, minor) ((dev_t)((((major) & 0xfff) << 8) | ((minor) & 0xff)))
|
||||
#define major(dev) ((unsigned int)(((dev) >> 8) & 0xfff))
|
||||
#define minor(dev) ((unsigned int)(((dev) & 0xff)))
|
||||
|
||||
#ifndef offsetof
|
||||
#define offsetof(TYPE, FIELD) ((size_t) &((TYPE *)0)->FIELD)
|
||||
#endif
|
||||
|
||||
#ifndef container_of
|
||||
#define container_of(PTR, TYPE, FIELD) ({ \
|
||||
__typeof__(((TYPE *)0)->FIELD) *__FIELD_PTR = (PTR); \
|
||||
(TYPE *)((char *) __FIELD_PTR - offsetof(TYPE, FIELD)); \
|
||||
})
|
||||
#endif
|
||||
|
||||
#endif /* _NOLIBC_TYPES_H */
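The fd_set in this file is a plain bitmap: descriptor fd lives in 32-bit word fd / 32, at bit fd & 31. The sketch below restates the FD_ZERO/FD_SET/FD_CLR/FD_ISSET macros as functions so the indexing can be exercised directly. All demo_ names are hypothetical, and the upper bound check against DEMO_SETSIZE is an addition of this sketch; the macros above only reject negative descriptors.

```c
#include <stdint.h>

#define DEMO_SETSIZE 256	/* same default as FD_SETSIZE above */

struct demo_fd_set {
	uint32_t fd32[(DEMO_SETSIZE + 31) / 32];
};

/* Added in this sketch: also reject descriptors beyond the bitmap. */
static int demo_fd_valid(int fd)
{
	return fd >= 0 && fd < DEMO_SETSIZE;
}

static void demo_fd_zero(struct demo_fd_set *set)
{
	int idx;

	for (idx = 0; idx < (DEMO_SETSIZE + 31) / 32; idx++)
		set->fd32[idx] = 0;
}

static void demo_fd_set_bit(struct demo_fd_set *set, int fd)
{
	if (demo_fd_valid(fd))
		set->fd32[fd / 32] |= 1U << (fd & 31);
}

static void demo_fd_clr_bit(struct demo_fd_set *set, int fd)
{
	if (demo_fd_valid(fd))
		set->fd32[fd / 32] &= ~(1U << (fd & 31));
}

static int demo_fd_isset(const struct demo_fd_set *set, int fd)
{
	if (!demo_fd_valid(fd))
		return 0;
	return !!(set->fd32[fd / 32] & (1U << (fd & 31)));
}
```

For instance, descriptor 33 maps to word 1, bit 1, so setting it on a zeroed set leaves fd32[1] equal to 2.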
|
||||
54
tools/include/nolibc/unistd.h
Normal file
@@ -0,0 +1,54 @@
|
||||
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
|
||||
/*
|
||||
* unistd function definitions for NOLIBC
|
||||
* Copyright (C) 2017-2022 Willy Tarreau <w@1wt.eu>
|
||||
*/
|
||||
|
||||
#ifndef _NOLIBC_UNISTD_H
|
||||
#define _NOLIBC_UNISTD_H
|
||||
|
||||
#include "std.h"
|
||||
#include "arch.h"
|
||||
#include "types.h"
|
||||
#include "sys.h"
|
||||
|
||||
|
||||
static __attribute__((unused))
|
||||
int msleep(unsigned int msecs)
|
||||
{
|
||||
struct timeval my_timeval = { msecs / 1000, (msecs % 1000) * 1000 };
|
||||
|
||||
if (sys_select(0, 0, 0, 0, &my_timeval) < 0)
|
||||
return (my_timeval.tv_sec * 1000) +
|
||||
(my_timeval.tv_usec / 1000) +
|
||||
!!(my_timeval.tv_usec % 1000);
|
||||
else
|
||||
return 0;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
unsigned int sleep(unsigned int seconds)
|
||||
{
|
||||
struct timeval my_timeval = { seconds, 0 };
|
||||
|
||||
if (sys_select(0, 0, 0, 0, &my_timeval) < 0)
|
||||
return my_timeval.tv_sec + !!my_timeval.tv_usec;
|
||||
else
|
||||
return 0;
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int usleep(unsigned int usecs)
|
||||
{
|
||||
struct timeval my_timeval = { usecs / 1000000, usecs % 1000000 };
|
||||
|
||||
return sys_select(0, 0, 0, 0, &my_timeval);
|
||||
}
|
||||
|
||||
static __attribute__((unused))
|
||||
int tcsetpgrp(int fd, pid_t pid)
|
||||
{
|
||||
return ioctl(fd, TIOCSPGRP, &pid);
|
||||
}
|
||||
|
||||
#endif /* _NOLIBC_UNISTD_H */
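msleep() above splits a millisecond count into a seconds/microseconds pair for select(), and when the call is interrupted it converts the remaining time back to milliseconds, rounding any leftover fraction up. The arithmetic can be sketched on its own as follows. struct demo_timeval and both function names are assumptions made to keep the example free of system headers; they only mirror the expressions used in msleep().

```c
/* Hypothetical stand-in for struct timeval. */
struct demo_timeval {
	long tv_sec;
	long tv_usec;
};

/* Split milliseconds into whole seconds and microseconds. */
static struct demo_timeval demo_ms_to_timeval(unsigned int msecs)
{
	struct demo_timeval tv = { msecs / 1000, (msecs % 1000) * 1000 };

	return tv;
}

/* Convert what is left of a timeout back to milliseconds; a partial
 * millisecond counts as a full one so the caller never under-sleeps.
 */
static unsigned int demo_remaining_ms(struct demo_timeval tv)
{
	return (tv.tv_sec * 1000) +
	       (tv.tv_usec / 1000) +
	       !!(tv.tv_usec % 1000);
}
```

With these helpers, 1500 ms becomes { 1, 500000 }, and a remainder of a single microsecond is reported as 1 ms rather than 0.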
|
||||
@@ -54,7 +54,8 @@ klitmus7 Compatibility Table
|
||||
-- 4.14 7.48 --
|
||||
4.15 -- 4.19 7.49 --
|
||||
4.20 -- 5.5 7.54 --
|
||||
5.6 -- 7.56 --
|
||||
5.6 -- 5.16 7.56 --
|
||||
5.17 -- 7.56.1 --
|
||||
============ ==========
|
||||
|
||||
|
||||
|
||||
@@ -301,7 +301,7 @@ specify_qemu_cpus () {
|
||||
echo $2 -smp $3
|
||||
;;
|
||||
qemu-system-ppc64)
|
||||
nt="`lscpu | grep '^NUMA node0' | sed -e 's/^[^,]*,\([0-9]*\),.*$/\1/'`"
|
||||
nt="`lscpu | sed -n 's/^Thread(s) per core:\s*//p'`"
|
||||
echo $2 -smp cores=`expr \( $3 + $nt - 1 \) / $nt`,threads=$nt
|
||||
;;
|
||||
esac
|
||||
|
@@ -36,7 +36,7 @@ do
 then
 egrep "error:|warning:|^ld: .*undefined reference to" < $i > $i.diags
 files="$files $i.diags $i"
-elif ! test -f ${scenariobasedir}/vmlinux
+elif ! test -f ${scenariobasedir}/vmlinux && ! test -f "${rundir}/re-run"
 then
 echo No ${scenariobasedir}/vmlinux file > $i.diags
 files="$files $i.diags $i"

@@ -33,7 +33,12 @@ do
 TORTURE_SUITE="`cat $i/../torture_suite`"
 configfile=`echo $i | sed -e 's,^.*/,,'`
 rm -f $i/console.log.*.diags
-kvm-recheck-${TORTURE_SUITE}.sh $i
+case "${TORTURE_SUITE}" in
+X*)
+;;
+*)
+kvm-recheck-${TORTURE_SUITE}.sh $i
+esac
 if test -f "$i/qemu-retval" && test "`cat $i/qemu-retval`" -ne 0 && test "`cat $i/qemu-retval`" -ne 137
 then
 echo QEMU error, output:

@@ -138,14 +138,14 @@ chmod +x $T/bin/kvm-remote-*.sh
 # Check first to avoid the need for cleanup for system-name typos
 for i in $systems
 do
-ncpus="`ssh $i getconf _NPROCESSORS_ONLN 2> /dev/null`"
-echo $i: $ncpus CPUs " " `date` | tee -a "$oldrun/remote-log"
+ncpus="`ssh -o BatchMode=yes $i getconf _NPROCESSORS_ONLN 2> /dev/null`"
 ret=$?
 if test "$ret" -ne 0
 then
 echo System $i unreachable, giving up. | tee -a "$oldrun/remote-log"
 exit 4
 fi
+echo $i: $ncpus CPUs " " `date` | tee -a "$oldrun/remote-log"
 done

 # Download and expand the tarball on all systems.
@@ -153,14 +153,14 @@ echo Build-products tarball: `du -h $T/binres.tgz` | tee -a "$oldrun/remote-log"
 for i in $systems
 do
 echo Downloading tarball to $i `date` | tee -a "$oldrun/remote-log"
-cat $T/binres.tgz | ssh $i "cd /tmp; tar -xzf -"
+cat $T/binres.tgz | ssh -o BatchMode=yes $i "cd /tmp; tar -xzf -"
 ret=$?
 tries=0
 while test "$ret" -ne 0
 do
 echo Unable to download $T/binres.tgz to system $i, waiting and then retrying. $tries prior retries. | tee -a "$oldrun/remote-log"
 sleep 60
-cat $T/binres.tgz | ssh $i "cd /tmp; tar -xzf -"
+cat $T/binres.tgz | ssh -o BatchMode=yes $i "cd /tmp; tar -xzf -"
 ret=$?
 if test "$ret" -ne 0
 then
@@ -185,7 +185,7 @@ checkremotefile () {

 while :
 do
-ssh $1 "test -f \"$2\""
+ssh -o BatchMode=yes $1 "test -f \"$2\""
 ret=$?
 if test "$ret" -eq 255
 then
@@ -228,7 +228,7 @@ startbatches () {
 then
 continue # System still running last test, skip.
 fi
-ssh "$i" "cd \"$resdir/$ds\"; touch remote.run; PATH=\"$T/bin:$PATH\" nohup kvm-remote-$curbatch.sh > kvm-remote-$curbatch.sh.out 2>&1 &" 1>&2
+ssh -o BatchMode=yes "$i" "cd \"$resdir/$ds\"; touch remote.run; PATH=\"$T/bin:$PATH\" nohup kvm-remote-$curbatch.sh > kvm-remote-$curbatch.sh.out 2>&1 &" 1>&2
 ret=$?
 if test "$ret" -ne 0
 then
@@ -267,7 +267,7 @@ do
 sleep 30
 done
 echo " ---" Collecting results from $i `date` | tee -a "$oldrun/remote-log"
-( cd "$oldrun"; ssh $i "cd $rundir; tar -czf - kvm-remote-*.sh.out */console.log */kvm-test-1-run*.sh.out */qemu[_-]pid */qemu-retval */qemu-affinity; rm -rf $T > /dev/null 2>&1" | tar -xzf - )
+( cd "$oldrun"; ssh -o BatchMode=yes $i "cd $rundir; tar -czf - kvm-remote-*.sh.out */console.log */kvm-test-1-run*.sh.out */qemu[_-]pid */qemu-retval */qemu-affinity; rm -rf $T > /dev/null 2>&1" | tar -xzf - )
 done

 ( kvm-end-run-stats.sh "$oldrun" "$starttime"; echo $? > $T/exitcode ) | tee -a "$oldrun/remote-log"

@@ -44,6 +44,7 @@ TORTURE_KCONFIG_KASAN_ARG=""
 TORTURE_KCONFIG_KCSAN_ARG=""
 TORTURE_KMAKE_ARG=""
 TORTURE_QEMU_MEM=512
+torture_qemu_mem_default=1
 TORTURE_REMOTE=
 TORTURE_SHUTDOWN_GRACE=180
 TORTURE_SUITE=rcu
@@ -86,7 +87,7 @@ usage () {
 echo " --remote"
 echo " --results absolute-pathname"
 echo " --shutdown-grace seconds"
-echo " --torture lock|rcu|rcuscale|refscale|scf"
+echo " --torture lock|rcu|rcuscale|refscale|scf|X*"
 echo " --trust-make"
 exit 1
 }
@@ -180,6 +181,10 @@ do
 ;;
 --kasan)
 TORTURE_KCONFIG_KASAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KASAN=y"; export TORTURE_KCONFIG_KASAN_ARG
+if test -n "$torture_qemu_mem_default"
+then
+TORTURE_QEMU_MEM=2G
+fi
 ;;
 --kconfig|--kconfigs)
 checkarg --kconfig "(Kconfig options)" $# "$2" '^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\)*$' '^error$'
@@ -202,6 +207,7 @@ do
 --memory)
 checkarg --memory "(memory size)" $# "$2" '^[0-9]\+[MG]\?$' error
 TORTURE_QEMU_MEM=$2
+torture_qemu_mem_default=
 shift
 ;;
 --no-initrd)
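Taken together, the `--kasan` and `--memory` hunks above track whether the memory size is still the built-in default, so that `--kasan` raises it to 2G only when the user has not chosen a size. A minimal standalone sketch of that flag-tracking idea follows. The function name `pick_mem` and its local variable names are hypothetical and not part of the patch.

```shell
#!/bin/sh
# Hypothetical sketch: print the memory size that would be selected
# for a given argument list.
pick_mem () {
	mem=512
	mem_is_default=1		# non-empty while the user has not set --memory
	while test $# -gt 0
	do
		case "$1" in
		--kasan)
			# Raise the default only; never override an explicit choice.
			if test -n "$mem_is_default"
			then
				mem=2G
			fi
			;;
		--memory)
			mem=$2
			mem_is_default=	# an explicit size now wins from here on
			shift
			;;
		esac
		shift
	done
	echo "$mem"
}

pick_mem --kasan
pick_mem --memory 1G --kasan
```

In this sketch an explicit `--memory` wins whether it appears before or after `--kasan`, because a later `--memory` overwrites the size and an earlier one clears the flag.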
@@ -231,7 +237,7 @@ do
 shift
 ;;
 --torture)
-checkarg --torture "(suite name)" "$#" "$2" '^\(lock\|rcu\|rcuscale\|refscale\|scf\)$' '^--'
+checkarg --torture "(suite name)" "$#" "$2" '^\(lock\|rcu\|rcuscale\|refscale\|scf\|X.*\)$' '^--'
 TORTURE_SUITE=$2
 TORTURE_MOD="`echo $TORTURE_SUITE | sed -e 's/^\(lock\|rcu\|scf\)$/\1torture/'`"
 shift

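The widened pattern in the hunk above accepts any suite name beginning with `X` in addition to the fixed list, while the `sed` mapping to a module name is unchanged. The sketch below exercises both regular expressions in isolation. The function names `valid_suite` and `suite_module` are hypothetical, and the sketch assumes the pattern is applied with `grep`, which this diff does not show. Both expressions rely on the `\|` alternation extension to basic regular expressions, as supported by GNU `grep` and `sed`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 matches the new --torture pattern.
valid_suite () {
	echo "$1" | grep -q '^\(lock\|rcu\|rcuscale\|refscale\|scf\|X.*\)$'
}

# Hypothetical helper: apply the same sed mapping as the TORTURE_MOD line.
suite_module () {
	echo "$1" | sed -e 's/^\(lock\|rcu\|scf\)$/\1torture/'
}

for s in rcu refscale Xmytest bogus
do
	if valid_suite "$s"
	then
		echo "$s: accepted, module `suite_module "$s"`"
	else
		echo "$s: rejected"
	fi
done
```

Only `lock`, `rcu`, and `scf` gain the `torture` suffix; other names, including the `X*` ones, pass through the mapping unchanged.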
@@ -54,6 +54,7 @@ do_kvfree=yes
 do_kasan=yes
 do_kcsan=no
 do_clocksourcewd=yes
+do_rt=yes

 # doyesno - Helper function for yes/no arguments
 function doyesno () {
@@ -82,6 +83,7 @@ usage () {
 echo " --do-rcuscale / --do-no-rcuscale"
 echo " --do-rcutorture / --do-no-rcutorture"
 echo " --do-refscale / --do-no-refscale"
+echo " --do-rt / --do-no-rt"
 echo " --do-scftorture / --do-no-scftorture"
 echo " --duration [ <minutes> | <hours>h | <days>d ]"
 echo " --kcsan-kmake-arg kernel-make-arguments"
@@ -118,6 +120,7 @@ do
 do_scftorture=yes
 do_rcuscale=yes
 do_refscale=yes
+do_rt=yes
 do_kvfree=yes
 do_kasan=yes
 do_kcsan=yes
@@ -148,6 +151,7 @@ do
 do_scftorture=no
 do_rcuscale=no
 do_refscale=no
+do_rt=no
 do_kvfree=no
 do_kasan=no
 do_kcsan=no
@@ -162,6 +166,9 @@ do
 --do-refscale|--do-no-refscale)
 do_refscale=`doyesno "$1" --do-refscale`
 ;;
+--do-rt|--do-no-rt)
+do_rt=`doyesno "$1" --do-rt`
+;;
 --do-scftorture|--do-no-scftorture)
 do_scftorture=`doyesno "$1" --do-scftorture`
 ;;
@@ -322,6 +329,7 @@ then
 echo " --- make clean" > "$amcdir/Make.out" 2>&1
 make -j$MAKE_ALLOTED_CPUS clean >> "$amcdir/Make.out" 2>&1
 echo " --- make allmodconfig" >> "$amcdir/Make.out" 2>&1
+cp .config $amcdir
 make -j$MAKE_ALLOTED_CPUS allmodconfig >> "$amcdir/Make.out" 2>&1
 echo " --- make " >> "$amcdir/Make.out" 2>&1
 make -j$MAKE_ALLOTED_CPUS >> "$amcdir/Make.out" 2>&1
@@ -350,8 +358,19 @@ fi

 if test "$do_scftorture" = "yes"
 then
-torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot"
-torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 1G --trust-make
+torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
+torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
 fi

+if test "$do_rt" = "yes"
+then
+# With all post-boot grace periods forced to normal.
+torture_bootargs="rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=30000 rcupdate.rcu_normal=1"
+torture_set "rcurttorture" tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration "$duration_rcutorture" --configs "TREE03" --trust-make
+
+# With all post-boot grace periods forced to expedited.
+torture_bootargs="rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=30000 rcupdate.rcu_expedited=1"
+torture_set "rcurttorture-exp" tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration "$duration_rcutorture" --configs "TREE03" --trust-make
+fi
+
 if test "$do_refscale" = yes
@@ -363,7 +382,7 @@ fi
 for prim in $primlist
 do
 torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
-torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
+torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
 done

 if test "$do_rcuscale" = yes
@@ -375,13 +394,13 @@ fi
 for prim in $primlist
 do
 torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
-torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
+torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
 done

 if test "$do_kvfree" = "yes"
 then
 torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
-torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 1G --trust-make
+torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
 fi

 if test "$do_clocksourcewd" = "yes"

@@ -8,3 +8,5 @@ CONFIG_DEBUG_LOCK_ALLOC=y
 CONFIG_PROVE_LOCKING=y
 #CHECK#CONFIG_PROVE_RCU=y
 CONFIG_RCU_EXPERT=y
+CONFIG_FORCE_TASKS_RUDE_RCU=y
+#CHECK#CONFIG_TASKS_RUDE_RCU=y

@@ -6,3 +6,5 @@ CONFIG_PREEMPT_NONE=y
 CONFIG_PREEMPT_VOLUNTARY=n
 CONFIG_PREEMPT=n
 #CHECK#CONFIG_RCU_EXPERT=n
+CONFIG_KPROBES=n
+CONFIG_FTRACE=n

@@ -7,4 +7,5 @@ CONFIG_PREEMPT=y
 CONFIG_DEBUG_LOCK_ALLOC=y
 CONFIG_PROVE_LOCKING=y
 #CHECK#CONFIG_PROVE_RCU=y
+CONFIG_TASKS_RCU=y
 CONFIG_RCU_EXPERT=y

@@ -2,3 +2,7 @@ CONFIG_SMP=n
 CONFIG_PREEMPT_NONE=y
 CONFIG_PREEMPT_VOLUNTARY=n
 CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
+#CHECK#CONFIG_TASKS_RCU=y
+CONFIG_FORCE_TASKS_RCU=y
+CONFIG_RCU_EXPERT=y

@@ -1 +1,2 @@
 rcutorture.torture_type=tasks
+rcutorture.stat_interval=60

@@ -7,3 +7,5 @@ CONFIG_HZ_PERIODIC=n
 CONFIG_NO_HZ_IDLE=n
 CONFIG_NO_HZ_FULL=y
 #CHECK#CONFIG_RCU_EXPERT=n
+CONFIG_TASKS_RCU=y
+CONFIG_RCU_EXPERT=y

@@ -4,8 +4,11 @@ CONFIG_HOTPLUG_CPU=y
 CONFIG_PREEMPT_NONE=y
 CONFIG_PREEMPT_VOLUNTARY=n
 CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
 CONFIG_DEBUG_LOCK_ALLOC=n
 CONFIG_PROVE_LOCKING=n
 #CHECK#CONFIG_PROVE_RCU=n
+CONFIG_FORCE_TASKS_TRACE_RCU=y
+#CHECK#CONFIG_TASKS_TRACE_RCU=y
 CONFIG_TASKS_TRACE_RCU_READ_MB=y
 CONFIG_RCU_EXPERT=y

@@ -7,5 +7,7 @@ CONFIG_PREEMPT=y
 CONFIG_DEBUG_LOCK_ALLOC=y
 CONFIG_PROVE_LOCKING=y
 #CHECK#CONFIG_PROVE_RCU=y
+CONFIG_FORCE_TASKS_TRACE_RCU=y
+#CHECK#CONFIG_TASKS_TRACE_RCU=y
 CONFIG_TASKS_TRACE_RCU_READ_MB=n
 CONFIG_RCU_EXPERT=y

@@ -1,8 +1,9 @@
 CONFIG_SMP=y
 CONFIG_NR_CPUS=8
-CONFIG_PREEMPT_NONE=y
-CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=y
 CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
 #CHECK#CONFIG_TREE_RCU=y
 CONFIG_HZ_PERIODIC=n
 CONFIG_NO_HZ_IDLE=n

@@ -3,6 +3,7 @@ CONFIG_NR_CPUS=16
 CONFIG_PREEMPT_NONE=y
 CONFIG_PREEMPT_VOLUNTARY=n
 CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
 #CHECK#CONFIG_TREE_RCU=y
 CONFIG_HZ_PERIODIC=n
 CONFIG_NO_HZ_IDLE=n

@@ -13,3 +13,5 @@ CONFIG_DEBUG_LOCK_ALLOC=n
 CONFIG_RCU_BOOST=n
 CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
 #CHECK#CONFIG_RCU_EXPERT=n
+CONFIG_KPROBES=n
+CONFIG_FTRACE=n

@@ -3,6 +3,7 @@ CONFIG_NR_CPUS=56
 CONFIG_PREEMPT_NONE=y
 CONFIG_PREEMPT_VOLUNTARY=n
 CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
 #CHECK#CONFIG_TREE_RCU=y
 CONFIG_HZ_PERIODIC=n
 CONFIG_NO_HZ_IDLE=y

@@ -9,7 +9,7 @@

 # rcutorture_param_n_barrier_cbs bootparam-string
 #
-# Adds n_barrier_cbs rcutorture module parameter to kernels having it.
+# Adds n_barrier_cbs rcutorture module parameter if not already specified.
 rcutorture_param_n_barrier_cbs () {
 if echo $1 | grep -q "rcutorture\.n_barrier_cbs"
 then
@@ -30,13 +30,25 @@ rcutorture_param_onoff () {
 fi
 }

+# rcutorture_param_stat_interval bootparam-string
+#
+# Adds stat_interval rcutorture module parameter if not already specified.
+rcutorture_param_stat_interval () {
+if echo $1 | grep -q "rcutorture\.stat_interval"
+then
+:
+else
+echo rcutorture.stat_interval=15
+fi
+}
+
 # per_version_boot_params bootparam-string config-file seconds
 #
 # Adds per-version torture-module parameters to kernels supporting them.
 per_version_boot_params () {
 echo $1 `rcutorture_param_onoff "$1" "$2"` \
 `rcutorture_param_n_barrier_cbs "$1"` \
-rcutorture.stat_interval=15 \
+`rcutorture_param_stat_interval "$1"` \
 rcutorture.shutdown_secs=$3 \
 rcutorture.test_no_idle_hz=1 \
 rcutorture.verbose=1

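The `rcutorture_param_stat_interval` helper added above appends a default only when the caller's boot-parameter string does not already set that parameter, which is what lets a per-scenario `rcutorture.stat_interval=60` take effect. A generalized sketch of the same pattern follows. The function name `default_param` is hypothetical and not part of the patch, and it uses `grep -F` for a literal match where the patch escapes the dot in a regular expression.

```shell
#!/bin/sh
# Hypothetical helper: print "name=value" unless the boot-parameter
# string in $1 already mentions name.
# $1 = existing boot parameters, $2 = parameter name, $3 = default value.
default_param () {
	if echo "$1" | grep -q -F "$2"
	then
		:	# Already specified by the caller: add nothing.
	else
		echo "$2=$3"
	fi
}

# No stat_interval given: the default is appended.
echo "rcutorture.torture_type=tasks" \
	`default_param "rcutorture.torture_type=tasks" rcutorture.stat_interval 15`

# stat_interval already given: the caller's value is left alone.
echo "rcutorture.stat_interval=60" \
	`default_param "rcutorture.stat_interval=60" rcutorture.stat_interval 15`
```

Because the helper prints nothing when the parameter is present, it can be dropped into a backquoted argument list, as `per_version_boot_params` does, without producing a duplicate setting.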
Some files were not shown because too many files have changed in this diff.