Expose cr_faults and cr_fault_failures counters to the user space. This
allows a user app to keep track of how many fault the application is
causing with the completion record (CR) and also the number of failures
of the CR writeback. Having a high number of cr_fault_failures is bad as
the app is submitting descriptors with the CR addresses that are bad. User
monitoring daemon may want to consider killing the application as it may be
malicious and attempting to flood the device event log.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-15-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add event log processing for faulting of user batch descriptor completion
record.
When encountering an event log entry for a page fault on a completion
record, the driver is expected to do the following:
1. If the "first error in batch" bit in event log entry error info is
set, discard any previously recorded errors associated with the
"batch identifier".
2. Fix the page fault according to the fault address in the event log. If
successful, write the completion record to the fault address in user space.
3. If an error is encountered while writing the completion record and it is
associated to a descriptor in the batch, the driver associates the error
with the batch identifier of the event log entry and tracks it until the
event log entry for the corresponding batch desc is encountered.
While processing an event log entry for a batch descriptor with error
indicating that one or more descs in the batch had event log entries,
the driver will do the following before writing the batch completion
record:
1. If the status field of the completion record is 0x1, the driver will
change it to error code 0x5 (one or more operations in batch completed
with status not successful) and changes the result field to 1.
2. If the status is error code 0x6 (page fault on batch descriptor list
address), change the result field to 1.
3. If status is any other value, the completion record is not changed.
4. Clear the recorded error in preparation for next batch with same batch
identifier.
The result field is for user software to determine whether to set the
"Batch Error" flag bit in the descriptor for continuation of partial
batch descriptor completion. See DSA spec 2.0 for additional information.
If no error has been recorded for the batch, the batch completion record is
written to user space as is.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
DSA supports page fault handling through PRS. However, the DMA engine
that's processing the descriptor is blocked until the PRS response is
received. Other workqueues sharing the engine are also blocked.
Page fault handing by the driver with PRS disabled can be used to
mitigate the stalling.
With PRS disabled while ATS remain enabled, DSA handles page faults on
a completion record by reporting an event in the event log. In this
instance, the descriptor is completed and the event log contains the
completion record address and the contents of the completion record. Add
support to the event log handling code to fault in the completion record
and copy the content of the completion record to user memory.
A bitmap is introduced to keep track of discarded event log entries. When
the user process initiates ->release() of the char device, it no longer is
interested in any remaining event log entries tied to the relevant wq and
PASID. The driver will mark the event log entry index in the bitmap. Upon
encountering the entries during processing, the event log handler will just
clear the bitmap bit and skip the entry rather than attempt to process the
event log entry.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Define idxd_copy_cr() to copy completion record to fault address in
user address that is found by work queue (wq) and PASID.
It will be used to write the user's completion record that the hardware
device is not able to write due to user completion record page fault.
An xarray is added to associate the PASID and mm with the
struct idxd_user_context so mm can be found by PASID and wq.
It is called when handling the completion record fault in a kernel thread
context. Switch to the mm using kthread_use_vm() and copy the
completion record to the mm via copy_to_user(). Once the copy is
completed, switch back to the current mm using kthread_unuse_mm().
Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-9-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
An event log interrupt is raised in the misc interrupt INTCAUSE register
when an event is written by the hardware. Add basic event log processing
support to the interrupt handler. The event log is a ring where the
hardware owns the tail and the software owns the head. The hardware will
advance the tail index when an additional event has been pushed to memory.
The software will process the log entry and then advances the head. The
log is full when (tail + 1) % log_size = head. The hardware will stop
writing when the log is full. The user is expected to create a log size
large enough to handle all the expected events.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-5-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add setup of event log feature for supported device. Event log addresses
error reporting that was lacking in gen 1 DSA devices where a second error
event does not get reported when a first event is pending software
handling. The event log allows a circular buffer that the device can push
error events to. It is up to the user to create a large enough event log
ring in order to capture the expected events. The evl size can be set in
the device sysfs attribute. By default 64 entries are supported as minimal
when event log is enabled.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add support for changing of the event log size. Event log is a
feature added to DSA 2.0 hardware to improve error reporting.
It supersedes the SWERROR register on DSA 1.0 hardware and hope
to prevent loss of reported errors.
The error log size determines how many error entries supported for
the device. It can be configured by the user via sysfs attribute.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Current code continuously processes the interrupt as long as the hardware
is setting the status bit. There's no reason to do that since the threaded
handler will get called again if another interrupt is asserted.
Also through testing, it has shown that if a misprogrammed (or malicious)
agent can continuously submit descriptors with bad completion record and
causes errors to be reported via the misc interrupt. Continuous processing
by the thread can cause software hang watchdog to kick off since the thread
isn't giving up the CPU.
Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-2-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In case the system suspends to a deep sleep state where power to DMA
controller is cut-off we need to restore the content of GRWS register.
This is a write only register and writing bit X tells the controller
to suspend read and write requests for channel X. Thus set GRWS before
restoring the content of GE (Global Enable) regiter.
Fixes: e1f7c9eee7 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-5-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In case there are DMA channels not paused by consumers in suspend
process (valid on AT91 SoCs for serial driver when no_console_suspend) the
driver pauses them (using at_xdmac_device_pause() which is also the same
function called by dmaengine_pause()) and then in the resume process the
driver resumes them calling at_xdmac_device_resume() which is the same
function called by dmaengine_resume()). This is good for DMA channels
not paused by consumers but for drivers that calls
dmaengine_pause()/dmaegine_resume() on suspend/resume path this may lead to
DMA channel being enabled before the IP is enabled. For IPs that needs
strict ordering with regards to DMA channel enablement this will lead to
wrong behavior. To fix this add a new set of functions
at_xdmac_device_pause_internal()/at_xdmac_device_resume_internal() to be
called only on suspend/resume.
Fixes: e1f7c9eee7 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-4-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In case there are channels not paused during suspend (which on AT91 case
is valid for serial driver when no_console_suspend boot argument is used)
the at_xdmac_runtime_suspend_descriptors() was called more than
one time due to at_xdmac_off(). To fix this add a new argument to
at_xdmac_off() to specify if runtime PM reference counter needs to be
decremented for queued active descriptors. Along with it moved the
at_xdmac_runtime_suspend_descriptors() call under at_xdmac_chan_is_paused()
check on suspend path as for the rest of channels the suspend is delayed
by atmel_xdmac_prepare() in case channel is enabled. Same approach has
been applied on resume path.
Fixes: 650b0e990c ("dmaengine: at_xdmac: add runtime pm support")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-3-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Runtime PM APIs for at_xdmac just plays with clk_enable()/clk_disable()
letting aside the clk_prepare()/clk_unprepare() that needs to be
executed as the clock is also prepared on probe. Thus instead of using
runtime PM force suspend/resume APIs use
clk_disable_unprepare() + pm_runtime_put_noidle() on suspend and
clk_prepare_enable() + pm_runtime_get_noresume() on resume. This
approach as been chosen instead of using runtime PM force suspend/resume
with clk_unprepare()/clk_prepare() as it looks simpler and the final
code is better.
While at it added the missing pm_runtime_mark_last_busy() on suspend before
decrementing the reference counter.
Fixes: 650b0e990c ("dmaengine: at_xdmac: add runtime pm support")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-2-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The bit DMAC_CHEN[0] is automatically cleared by hardware to disable the
channel after the last AMBA transfer of the DMA transfer to the
destination has completed. Software can therefore poll this bit to
determine when this channel is free for a new DMA transfer.
This time requires at least 40 milliseconds on JH7110 SoC, otherwise an
error message 'failed to stop' will be reported.
Signed-off-by: Walker Chen <walker.chen@starfivetech.com>
Link: https://lore.kernel.org/r/20230322094820.24738-4-walker.chen@starfivetech.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Make the flask.h target depend on the genheaders binary instead of
classmap.h to ensure that it is rebuilt if any of the dependencies of
genheaders are changed.
Notably this fixes flask.h not being rebuilt when
initial_sid_to_string.h is modified.
Fixes: 8753f6bec3 ("selinux: generate flask headers during kernel build")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The dw-edma driver stops after processing a DMA request even if a request
remains in the issued queue, which is not the expected behavior. The DMA
engine API requires continuous processing.
Add a trigger to start after one processing finished if there are requests
remain.
Fixes: e63d79d1ff ("dmaengine: Add Synopsys eDMA IP core driver")
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Link: https://lore.kernel.org/r/20230411101758.438472-1-mie@igel.co.jp
Signed-off-by: Vinod Koul <vkoul@kernel.org>
After commit 6376ce56feb6 ("iov_iter: import single vector iovecs as
ITER_UBUF"), GCC does an inter-procedural compiler optimization which
moves the user_access_begin() out of copy_compat_iovec_from_user() and
into its callers:
lib/iov_iter.o: warning: objtool: .altinstr_replacement+0x0: redundant UACCESS disable
lib/iov_iter.o: warning: objtool: iovec_from_user.part.0+0xc7: call to copy_compat_iovec_from_user.part.0() with UACCESS enabled
lib/iov_iter.o: warning: objtool: __import_iovec+0x21d: call to copy_compat_iovec_from_user.part.0() with UACCESS enabled
Enforce the "no UACCESS enable across function boundaries" rule by
disabling cloning for copy_compat_iovec_from_user().
Fixes: 6376ce56feb6 ("iov_iter: import single vector iovecs as ITER_UBUF")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
https://lkml.kernel.org/lkml/20230327120017.6bb826d7@canb.auug.org.au
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
DPHY RX module has a common module reset (RSTB_CMN) which is expected
to be released during configuration. In J721E SR1.0 the RSTB_CMN is
internally tied to CSI_RX_RST and is hardware controlled, for all
other newer platforms the common module reset is software controlled.
Add support to control common module reset during configuration and
also skip common module reset based on soc_device_match() for J721E SR1.0.
Signed-off-by: Sinthu Raja <sinthu.raja@ti.com>
Co-developed-by: Vaishnav Achath <vaishnav.a@ti.com>
Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
Reviewed-by: Jai Luthra <j-luthra@ti.com>
Link: https://lore.kernel.org/r/20230314073137.2153-1-vaishnav.a@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
`UniqueArc::try_new_uninit` calls `Arc::try_new(MaybeUninit::uninit())`.
This results in the uninitialized memory being placed on the stack,
which may be arbitrarily large due to the generic `T` and thus could
cause a stack overflow for large types.
Change the implementation to use the pin-init API which enables in-place
initialization. In particular it avoids having to first construct and
then move the uninitialized memory from the stack into the final location.
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com>
Link: https://lore.kernel.org/r/20230408122429.1103522-15-y86-dev@protonmail.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Add the following initializer macros:
- `#[pin_data]` to annotate structurally pinned fields of structs,
needed for `pin_init!` and `try_pin_init!` to select the correct
initializer of fields.
- `pin_init!` create a pin-initializer for a struct with the
`Infallible` error type.
- `try_pin_init!` create a pin-initializer for a struct with a custom
error type (`kernel::error::Error` is the default).
- `init!` create an in-place-initializer for a struct with the
`Infallible` error type.
- `try_init!` create an in-place-initializer for a struct with a custom
error type (`kernel::error::Error` is the default).
Also add their needed internal helper traits and structs.
Co-developed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com>
Link: https://lore.kernel.org/r/20230408122429.1103522-8-y86-dev@protonmail.com
[ Fixed three typos. ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
This API is used to facilitate safe pinned initialization of structs. It
replaces cumbersome `unsafe` manual initialization with elegant safe macro
invocations.
Due to the size of this change it has been split into six commits:
1. This commit introducing the basic public interface: traits and
functions to represent and create initializers.
2. Adds the `#[pin_data]`, `pin_init!`, `try_pin_init!`, `init!` and
`try_init!` macros along with their internal types.
3. Adds the `InPlaceInit` trait that allows using an initializer to create
an object inside of a `Box<T>` and other smart pointers.
4. Adds the `PinnedDrop` trait and adds macro support for it in
the `#[pin_data]` macro.
5. Adds the `stack_pin_init!` macro allowing to pin-initialize a struct on
the stack.
6. Adds the `Zeroable` trait and `init::zeroed` function to initialize
types that have `0x00` in all bytes as a valid bit pattern.
--
In this section the problem that the new pin-init API solves is outlined.
This message describes the entirety of the API, not just the parts
introduced in this commit. For a more granular explanation and additional
information on pinning and this issue, view [1].
Pinning is Rust's way of enforcing the address stability of a value. When a
value gets pinned it will be impossible for safe code to move it to another
location. This is done by wrapping pointers to said object with `Pin<P>`.
This wrapper prevents safe code from creating mutable references to the
object, preventing mutable access, which is needed to move the value.
`Pin<P>` provides `unsafe` functions to circumvent this and allow
modifications regardless. It is then the programmer's responsibility to
uphold the pinning guarantee.
Many kernel data structures require a stable address, because there are
foreign pointers to them which would get invalidated by moving the
structure. Since these data structures are usually embedded in structs to
use them, this pinning property propagates to the container struct.
Resulting in most structs in both Rust and C code needing to be pinned.
So if we want to have a `mutex` field in a Rust struct, this struct also
needs to be pinned, because a `mutex` contains a `list_head`. Additionally
initializing a `list_head` requires already having the final memory
location available, because it is initialized by pointing it to itself. But
this presents another challenge in Rust: values have to be initialized at
all times. There is the `MaybeUninit<T>` wrapper type, which allows
handling uninitialized memory, but this requires using the `unsafe` raw
pointers and a casting the type to the initialized variant.
This problem gets exacerbated when considering encapsulation and the normal
safety requirements of Rust code. The fields of the Rust `Mutex<T>` should
not be accessible to normal driver code. After all if anyone can modify
the fields, there is no way to ensure the invariants of the `Mutex<T>` are
upheld. But if the fields are inaccessible, then initialization of a
`Mutex<T>` needs to be somehow achieved via a function or a macro. Because
the `Mutex<T>` must be pinned in memory, the function cannot return it by
value. It also cannot allocate a `Box` to put the `Mutex<T>` into, because
that is an unnecessary allocation and indirection which would hurt
performance.
The solution in the rust tree (e.g. this commit: [2]) that is replaced by
this API is to split this function into two parts:
1. A `new` function that returns a partially initialized `Mutex<T>`,
2. An `init` function that requires the `Mutex<T>` to be pinned and that
fully initializes the `Mutex<T>`.
Both of these functions have to be marked `unsafe`, since a call to `new`
needs to be accompanied with a call to `init`, otherwise using the
`Mutex<T>` could result in UB. And because calling `init` twice also is not
safe. While `Mutex<T>` initialization cannot fail, other structs might
also have to allocate memory, which would result in conditional successful
initialization requiring even more manual accommodation work.
Combine this with the problem of pin-projections -- the way of accessing
fields of a pinned struct -- which also have an `unsafe` API, pinned
initialization is riddled with `unsafe` resulting in very poor ergonomics.
Not only that, but also having to call two functions possibly multiple
lines apart makes it very easy to forget it outright or during refactoring.
Here is an example of the current way of initializing a struct with two
synchronization primitives (see [3] for the full example):
struct SharedState {
state_changed: CondVar,
inner: Mutex<SharedStateInner>,
}
impl SharedState {
fn try_new() -> Result<Arc<Self>> {
let mut state = Pin::from(UniqueArc::try_new(Self {
// SAFETY: `condvar_init!` is called below.
state_changed: unsafe { CondVar::new() },
// SAFETY: `mutex_init!` is called below.
inner: unsafe {
Mutex::new(SharedStateInner { token_count: 0 })
},
})?);
// SAFETY: `state_changed` is pinned when `state` is.
let pinned = unsafe {
state.as_mut().map_unchecked_mut(|s| &mut s.state_changed)
};
kernel::condvar_init!(pinned, "SharedState::state_changed");
// SAFETY: `inner` is pinned when `state` is.
let pinned = unsafe {
state.as_mut().map_unchecked_mut(|s| &mut s.inner)
};
kernel::mutex_init!(pinned, "SharedState::inner");
Ok(state.into())
}
}
The pin-init API of this patch solves this issue by providing a
comprehensive solution comprised of macros and traits. Here is the example
from above using the pin-init API:
#[pin_data]
struct SharedState {
#[pin]
state_changed: CondVar,
#[pin]
inner: Mutex<SharedStateInner>,
}
impl SharedState {
fn new() -> impl PinInit<Self> {
pin_init!(Self {
state_changed <- new_condvar!("SharedState::state_changed"),
inner <- new_mutex!(
SharedStateInner { token_count: 0 },
"SharedState::inner",
),
})
}
}
Notably the way the macro is used here requires no `unsafe` and thus comes
with the usual Rust promise of safe code not introducing any memory
violations. Additionally it is now up to the caller of `new()` to decide
the memory location of the `SharedState`. They can choose at the moment
`Arc<T>`, `Box<T>` or the stack.
--
The API has the following architecture:
1. Initializer traits `PinInit<T, E>` and `Init<T, E>` that act like
closures.
2. Macros to create these initializer traits safely.
3. Functions to allow manually writing initializers.
The initializers (an `impl PinInit<T, E>`) receive a raw pointer pointing
to uninitialized memory and their job is to fully initialize a `T` at that
location. If initialization fails, they return an error (`E`) by value.
This way of initializing cannot be safely exposed to the user, since it
relies upon these properties outside of the control of the trait:
- the memory location (slot) needs to be valid memory,
- if initialization fails, the slot should not be read from,
- the value in the slot should be pinned, so it cannot move and the memory
cannot be deallocated until the value is dropped.
This is why using an initializer is facilitated by another trait that
ensures these requirements.
These initializers can be created manually by just supplying a closure that
fulfills the same safety requirements as `PinInit<T, E>`. But this is an
`unsafe` operation. To allow safe initializer creation, the `pin_init!` is
provided along with three other variants: `try_pin_init!`, `try_init!` and
`init!`. These take a modified struct initializer as a parameter and
generate a closure that initializes the fields in sequence.
The macros take great care in upholding the safety requirements:
- A shadowed struct type is used as the return type of the closure instead
of `()`. This is to prevent early returns, as these would prevent full
initialization.
- To ensure every field is only initialized once, a normal struct
initializer is placed in unreachable code. The type checker will emit
errors if a field is missing or specified multiple times.
- When initializing a field fails, the whole initializer will fail and
automatically drop fields that have been initialized earlier.
- Only the correct initializer type is allowed for unpinned fields. You
cannot use a `impl PinInit<T, E>` to initialize a structurally not pinned
field.
To ensure the last point, an additional macro `#[pin_data]` is needed. This
macro annotates the struct itself and the user specifies structurally
pinned and not pinned fields.
Because dropping a pinned struct is also not allowed to break the pinning
invariants, another macro attribute `#[pinned_drop]` is needed. This
macro is introduced in a following commit.
These two macros also have mechanisms to ensure the overall safety of the
API. Additionally, they utilize a combined proc-macro, declarative macro
design: first a proc-macro enables the outer attribute syntax `#[...]` and
does some important pre-parsing. Notably this prepares the generics such
that the declarative macro can handle them using token trees. Then the
actual parsing of the structure and the emission of code is handled by a
declarative macro.
For pin-projections the crates `pin-project` [4] and `pin-project-lite` [5]
had been considered, but were ultimately rejected:
- `pin-project` depends on `syn` [6] which is a very big dependency, around
50k lines of code.
- `pin-project-lite` is a more reasonable 5k lines of code, but contains a
very complex declarative macro to parse generics. On top of that it
would require modification that would need to be maintained
independently.
Link: https://rust-for-linux.com/the-safe-pinned-initialization-problem [1]
Link: 0a04dc4ddd [2]
Link: f509ede33f/samples/rust/rust_miscdev.rs [3]
Link: https://crates.io/crates/pin-project [4]
Link: https://crates.io/crates/pin-project-lite [5]
Link: https://crates.io/crates/syn [6]
Co-developed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com>
Link: https://lore.kernel.org/r/20230408122429.1103522-7-y86-dev@protonmail.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>