linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-11 13:27:06 +09:00

Author	SHA1	Message	Date
Dave Jiang	244009b07e	dmaengine: idxd: expose fault counters to sysfs Expose cr_faults and cr_fault_failures counters to the user space. This allows a user app to keep track of how many fault the application is causing with the completion record (CR) and also the number of failures of the CR writeback. Having a high number of cr_fault_failures is bad as the app is submitting descriptors with the CR addresses that are bad. User monitoring daemon may want to consider killing the application as it may be malicious and attempting to flood the device event log. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-15-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:46 +05:30
Dave Jiang	e6fd6d7e5f	dmaengine: idxd: add a device to represent the file opened Embed a struct device for the user file context in order to export sysfs attributes related with the opened file. Tie the lifetime of the file context to the device. The sysfs entry will be added under the char device. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-14-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	fecae134ee	dmaengine: idxd: add per file user counters for completion record faults Add counters per opened file for the char device in order to keep track how many completion record faults occurred and how many of those faults failed the writeback by the driver after attempt to fault in the page. The counters are managed by xarray that associates the PASID with struct idxd_user_context. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-13-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	2442b7473a	dmaengine: idxd: process batch descriptor completion record faults Add event log processing for faulting of user batch descriptor completion record. When encountering an event log entry for a page fault on a completion record, the driver is expected to do the following: 1. If the "first error in batch" bit in event log entry error info is set, discard any previously recorded errors associated with the "batch identifier". 2. Fix the page fault according to the fault address in the event log. If successful, write the completion record to the fault address in user space. 3. If an error is encountered while writing the completion record and it is associated to a descriptor in the batch, the driver associates the error with the batch identifier of the event log entry and tracks it until the event log entry for the corresponding batch desc is encountered. While processing an event log entry for a batch descriptor with error indicating that one or more descs in the batch had event log entries, the driver will do the following before writing the batch completion record: 1. If the status field of the completion record is 0x1, the driver will change it to error code 0x5 (one or more operations in batch completed with status not successful) and changes the result field to 1. 2. If the status is error code 0x6 (page fault on batch descriptor list address), change the result field to 1. 3. If status is any other value, the completion record is not changed. 4. Clear the recorded error in preparation for next batch with same batch identifier. The result field is for user software to determine whether to set the "Batch Error" flag bit in the descriptor for continuation of partial batch descriptor completion. See DSA spec 2.0 for additional information. If no error has been recorded for the batch, the batch completion record is written to user space as is. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	6926987185	dmaengine: idxd: add descs_completed field for completion record The descs_completed field for a completion record is part of a batch descriptor completion record. It takes the same location as bytes_completed in a normal descriptor field. Add to expose to user. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-11-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	c40bd7d973	dmaengine: idxd: process user page faults for completion record DSA supports page fault handling through PRS. However, the DMA engine that's processing the descriptor is blocked until the PRS response is received. Other workqueues sharing the engine are also blocked. Page fault handing by the driver with PRS disabled can be used to mitigate the stalling. With PRS disabled while ATS remain enabled, DSA handles page faults on a completion record by reporting an event in the event log. In this instance, the descriptor is completed and the event log contains the completion record address and the contents of the completion record. Add support to the event log handling code to fault in the completion record and copy the content of the completion record to user memory. A bitmap is introduced to keep track of discarded event log entries. When the user process initiates ->release() of the char device, it no longer is interested in any remaining event log entries tied to the relevant wq and PASID. The driver will mark the event log entry index in the bitmap. Upon encountering the entries during processing, the event log handler will just clear the bitmap bit and skip the entry rather than attempt to process the event log entry. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Fenghua Yu	b022f59725	dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling Define idxd_copy_cr() to copy completion record to fault address in user address that is found by work queue (wq) and PASID. It will be used to write the user's completion record that the hardware device is not able to write due to user completion record page fault. An xarray is added to associate the PASID and mm with the struct idxd_user_context so mm can be found by PASID and wq. It is called when handling the completion record fault in a kernel thread context. Switch to the mm using kthread_use_vm() and copy the completion record to the mm via copy_to_user(). Once the copy is completed, switch back to the current mm using kthread_unuse_mm(). Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-9-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	c2f156bf16	dmaengine: idxd: create kmem cache for event log fault items Add a kmem cache per device for allocating event log fault context. The context allows an event log entry to be copied and passed to a software workqueue to be processed. Due to each device can have different sized event log entry depending on device type, it's not possible to have a global kmem cache. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-8-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	2f30decd2f	dmaengine: idxd: add per DSA wq workqueue for processing cr faults Add a workqueue for user submitted completion record fault processing. The workqueue creation and destruction lifetime will be tied to the user sub-driver since it will only be used when the wq is a user type. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-7-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	5fbe6503b5	dmanegine: idxd: add debugfs for event log dump Add debugfs entry to dump the content of the event log for debugging. The function will dump all non-zero entries in the event log. It will note which entries are processed and which entries are still pending processing at the time of the dump. The entries may not always be in chronological order due to the log is a circular buffer. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-6-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	2f431ba908	dmaengine: idxd: add interrupt handling for event log An event log interrupt is raised in the misc interrupt INTCAUSE register when an event is written by the hardware. Add basic event log processing support to the interrupt handler. The event log is a ring where the hardware owns the tail and the software owns the head. The hardware will advance the tail index when an additional event has been pushed to memory. The software will process the log entry and then advances the head. The log is full when (tail + 1) % log_size = head. The hardware will stop writing when the log is full. The user is expected to create a log size large enough to handle all the expected events. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-5-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	244da66cda	dmaengine: idxd: setup event log configuration Add setup of event log feature for supported device. Event log addresses error reporting that was lacking in gen 1 DSA devices where a second error event does not get reported when a first event is pending software handling. The event log allows a circular buffer that the device can push error events to. It is up to the user to create a large enough event log ring in order to capture the expected events. The evl size can be set in the device sysfs attribute. By default 64 entries are supported as minimal when event log is enabled. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	1649091f91	dmaengine: idxd: add event log size sysfs attribute Add support for changing of the event log size. Event log is a feature added to DSA 2.0 hardware to improve error reporting. It supersedes the SWERROR register on DSA 1.0 hardware and hope to prevent loss of reported errors. The error log size determines how many error entries supported for the device. It can be configured by the user via sysfs attribute. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Dave Jiang	0c40bfb4c2	dmaengine: idxd: make misc interrupt one shot Current code continuously processes the interrupt as long as the hardware is setting the status bit. There's no reason to do that since the threaded handler will get called again if another interrupt is asserted. Also through testing, it has shown that if a misprogrammed (or malicious) agent can continuously submit descriptors with bad completion record and causes errors to be reported via the misc interrupt. Continuous processing by the thread can cause software hang watchdog to kick off since the thread isn't giving up the CPU. Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-2-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Walker Chen	c9566127f0	dt-bindings: dma: snps,dw-axi-dmac: constrain the items of resets for JH7110 dma The DMA controller needs two reset items to work properly on JH7110 SoC, so there is need to constrain the items' value to 2, other platforms have 1 reset item at most. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Walker Chen <walker.chen@starfivetech.com> Link: https://lore.kernel.org/r/20230322094820.24738-2-walker.chen@starfivetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Rob Herring	894abe0dcf	dt-bindings: dma: Drop unneeded quotes Cleanup bindings dropping unneeded quotes. Once all these are fixed, checking for this can be enabled in yamllint. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230331182141.1900348-1-robh@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Claudiu Beznea	09ebe227c2	dmaengine: at_xdmac: align declaration of ret with the rest of variables Align the declaration of ret in atmel_xdmac_resume() with the rest of variables. Do this by adding ret to the line with declaration for i variable. Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230214151827.1050280-8-claudiu.beznea@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Claudiu Beznea	5056eae6c3	dmaengine: at_xdmac: add a warning message regarding for unpaused channels Add a warning message on suspend to let the user that there are channels not paused by their consumers. Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230214151827.1050280-7-claudiu.beznea@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Claudiu Beznea	f8435befd8	dmaengine: at_xdmac: do not enable all cyclic channels Do not global enable all the cyclic channels in at_xdmac_resume(). Instead save the global status in at_xdmac_suspend() and re-enable the cyclic channel only if it was active before suspend. Fixes: `e1f7c9eee7` ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230214151827.1050280-6-claudiu.beznea@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Claudiu Beznea	7c5eb63d16	dmaengine: at_xdmac: restore the content of grws register In case the system suspends to a deep sleep state where power to DMA controller is cut-off we need to restore the content of GRWS register. This is a write only register and writing bit X tells the controller to suspend read and write requests for channel X. Thus set GRWS before restoring the content of GE (Global Enable) regiter. Fixes: `e1f7c9eee7` ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230214151827.1050280-5-claudiu.beznea@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Claudiu Beznea	44fe8440bd	dmaengine: at_xdmac: do not resume channels paused by consumers In case there are DMA channels not paused by consumers in suspend process (valid on AT91 SoCs for serial driver when no_console_suspend) the driver pauses them (using at_xdmac_device_pause() which is also the same function called by dmaengine_pause()) and then in the resume process the driver resumes them calling at_xdmac_device_resume() which is the same function called by dmaengine_resume()). This is good for DMA channels not paused by consumers but for drivers that calls dmaengine_pause()/dmaegine_resume() on suspend/resume path this may lead to DMA channel being enabled before the IP is enabled. For IPs that needs strict ordering with regards to DMA channel enablement this will lead to wrong behavior. To fix this add a new set of functions at_xdmac_device_pause_internal()/at_xdmac_device_resume_internal() to be called only on suspend/resume. Fixes: `e1f7c9eee7` ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230214151827.1050280-4-claudiu.beznea@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Claudiu Beznea	e53957e1ec	dmaengine: at_xdmac: fix imbalanced runtime PM reference counter In case there are channels not paused during suspend (which on AT91 case is valid for serial driver when no_console_suspend boot argument is used) the at_xdmac_runtime_suspend_descriptors() was called more than one time due to at_xdmac_off(). To fix this add a new argument to at_xdmac_off() to specify if runtime PM reference counter needs to be decremented for queued active descriptors. Along with it moved the at_xdmac_runtime_suspend_descriptors() call under at_xdmac_chan_is_paused() check on suspend path as for the rest of channels the suspend is delayed by atmel_xdmac_prepare() in case channel is enabled. Same approach has been applied on resume path. Fixes: `650b0e990c` ("dmaengine: at_xdmac: add runtime pm support") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230214151827.1050280-3-claudiu.beznea@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Claudiu Beznea	2de5ddb5e6	dmaengine: at_xdmac: disable/enable clock directly on suspend/resume Runtime PM APIs for at_xdmac just plays with clk_enable()/clk_disable() letting aside the clk_prepare()/clk_unprepare() that needs to be executed as the clock is also prepared on probe. Thus instead of using runtime PM force suspend/resume APIs use clk_disable_unprepare() + pm_runtime_put_noidle() on suspend and clk_prepare_enable() + pm_runtime_get_noresume() on resume. This approach as been chosen instead of using runtime PM force suspend/resume with clk_unprepare()/clk_prepare() as it looks simpler and the final code is better. While at it added the missing pm_runtime_mark_last_busy() on suspend before decrementing the reference counter. Fixes: `650b0e990c` ("dmaengine: at_xdmac: add runtime pm support") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230214151827.1050280-2-claudiu.beznea@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:43 +05:30
Walker Chen	ce62432cb8	dmaengine: dw-axi-dmac: Increase polling time to DMA transmission completion status The bit DMAC_CHEN[0] is automatically cleared by hardware to disable the channel after the last AMBA transfer of the DMA transfer to the destination has completed. Software can therefore poll this bit to determine when this channel is free for a new DMA transfer. This time requires at least 40 milliseconds on JH7110 SoC, otherwise an error message 'failed to stop' will be reported. Signed-off-by: Walker Chen <walker.chen@starfivetech.com> Link: https://lore.kernel.org/r/20230322094820.24738-4-walker.chen@starfivetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:43 +05:30
Walker Chen	790f3c8b8f	dmaengine: dw-axi-dmac: Add support for StarFive JH7110 DMA Add DMA reset operation in device probe and use different configuration on CH_CFG registers according to match data. Update all uses of of_device_is_compatible with of_device_get_match_data. Signed-off-by: Walker Chen <walker.chen@starfivetech.com> Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Link: https://lore.kernel.org/r/20230322094820.24738-3-walker.chen@starfivetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:43 +05:30
Ian Rogers	1f94479edb	libperf: Make perf_cpu_map__alloc() available as an internal function for tools/perf to use We had the open coded equivalent in perf_cpu_map__empty_new(), so reuse what is in libperf. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/20230407230405.2931830-3-irogers@google.com [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-12 14:44:24 -03:00
Ian Rogers	7bb1d048bd	perf cpumap: Use perf_cpu_map__nr(cpus) to access cpus->nr So that we can have a single point where to refcount check 'struct perf_cpu_map' instances for use after free, etc. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/20230407230405.2931830-3-irogers@google.com [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-12 14:44:11 -03:00
Ondrej Mosnacek	bcab1adeaa	selinux: fix Makefile dependencies of flask.h Make the flask.h target depend on the genheaders binary instead of classmap.h to ensure that it is rebuilt if any of the dependencies of genheaders are changed. Notably this fixes flask.h not being rebuilt when initial_sid_to_string.h is modified. Fixes: `8753f6bec3` ("selinux: generate flask headers during kernel build") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2023-04-12 13:34:20 -04:00
Tetsuo Handa	57dcd64c7e	cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex syzbot is reporting circular locking dependency between cpu_hotplug_lock and freezer_mutex, for commit `f5d39b0208` ("freezer,sched: Rewrite core freezer logic") replaced atomic_inc() in freezer_apply_state() with static_branch_inc() which holds cpu_hotplug_lock. cpu_hotplug_lock => cgroup_threadgroup_rwsem => freezer_mutex cgroup_file_write() { cgroup_procs_write() { __cgroup_procs_write() { cgroup_procs_write_start() { cgroup_attach_lock() { cpus_read_lock() { percpu_down_read(&cpu_hotplug_lock); } percpu_down_write(&cgroup_threadgroup_rwsem); } } cgroup_attach_task() { cgroup_migrate() { cgroup_migrate_execute() { freezer_attach() { mutex_lock(&freezer_mutex); (...snipped...) } } } } (...snipped...) } } } freezer_mutex => cpu_hotplug_lock cgroup_file_write() { freezer_write() { freezer_change_state() { mutex_lock(&freezer_mutex); freezer_apply_state() { static_branch_inc(&freezer_active) { static_key_slow_inc() { cpus_read_lock(); static_key_slow_inc_cpuslocked(); cpus_read_unlock(); } } } mutex_unlock(&freezer_mutex); } } } Swap locking order by moving cpus_read_lock() in freezer_apply_state() to before mutex_lock(&freezer_mutex) in freezer_change_state(). Reported-by: syzbot <syzbot+c39682e86c9d84152f93@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=c39682e86c9d84152f93 Suggested-by: Hillf Danton <hdanton@sina.com> Fixes: `f5d39b0208` ("freezer,sched: Rewrite core freezer logic") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2023-04-12 07:30:28 -10:00
Alexei Starovoitov	10fd5f70c3	bpf: Handle NULL in bpf_local_storage_free. During OOM bpf_local_storage_alloc() may fail to allocate 'storage' and call to bpf_local_storage_free() with NULL pointer will cause a crash like: [ 271718.917646] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 271719.019620] RIP: 0010:call_rcu+0x2d/0x240 [ 271719.216274] bpf_local_storage_alloc+0x19e/0x1e0 [ 271719.250121] bpf_local_storage_update+0x33b/0x740 Fixes: `7e30a8477b` ("bpf: Add bpf_local_storage_free()") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230412171252.15635-1-alexei.starovoitov@gmail.com	2023-04-12 10:27:50 -07:00
Shunsuke Mie	970b17dfe2	dmaengine: dw-edma: Fix to enable to issue dma request on DMA processing The issue_pending request is ignored while driver is processing a DMA request. Fix to issue the pending requests on any dma channel status. Fixes: `e63d79d1ff` ("dmaengine: Add Synopsys eDMA IP core driver") Signed-off-by: Shunsuke Mie <mie@igel.co.jp> Link: https://lore.kernel.org/r/20230411101758.438472-2-mie@igel.co.jp Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 22:44:49 +05:30
Shunsuke Mie	a251994a44	dmaengine: dw-edma: Fix to change for continuous transfer The dw-edma driver stops after processing a DMA request even if a request remains in the issued queue, which is not the expected behavior. The DMA engine API requires continuous processing. Add a trigger to start after one processing finished if there are requests remain. Fixes: `e63d79d1ff` ("dmaengine: Add Synopsys eDMA IP core driver") Signed-off-by: Shunsuke Mie <mie@igel.co.jp> Link: https://lore.kernel.org/r/20230411101758.438472-1-mie@igel.co.jp Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 22:44:49 +05:30
Rob Herring	619d8ea96d	dmaengine: qcom_hidma: Add explicit platform_device.h and of_device.h includes qcom_hidma uses of_dma_configure() which is declared in of_device.h. platform_device.h and of_device.h get implicitly included by of_platform.h, but that is going to be removed soon. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Sinan Kaya <okaya@kernel.org> Link: https://lore.kernel.org/r/20230410232654.1561462-1-robh@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 22:43:08 +05:30
Dmitry Baryshkov	91d6a468e3	dma: gpi: remove spurious unlock in gpi_ch_init gpi_ch_init() doesn't lock the ctrl_lock mutex, so there is no need to unlock it too. Instead the mutex is handled by the function gpi_alloc_chan_resources(), which properly locks and unlocks the mutex. ===================================== WARNING: bad unlock balance detected! 6.3.0-rc5-00253-g99792582ded1-dirty #15 Not tainted ------------------------------------- kworker/u16:0/9 is trying to release lock (&gpii->ctrl_lock) at: [<ffffb99d04e1284c>] gpi_alloc_chan_resources+0x108/0x5bc but there are no more locks to release! other info that might help us debug this: 6 locks held by kworker/u16:0/9: #0: ffff575740010938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x220/0x594 #1: ffff80000809bdd0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x220/0x594 #2: ffff575740f2a0f8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x38/0x188 #3: ffff57574b5570f8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x38/0x188 #4: ffffb99d06a2f180 (of_dma_lock){+.+.}-{3:3}, at: of_dma_request_slave_channel+0x138/0x280 #5: ffffb99d06a2ee20 (dma_list_mutex){+.+.}-{3:3}, at: dma_get_slave_channel+0x28/0x10c stack backtrace: CPU: 7 PID: 9 Comm: kworker/u16:0 Not tainted 6.3.0-rc5-00253-g99792582ded1-dirty #15 Hardware name: Google Pixel 3 (DT) Workqueue: events_unbound deferred_probe_work_func Call trace: dump_backtrace+0xa0/0xfc show_stack+0x18/0x24 dump_stack_lvl+0x60/0xac dump_stack+0x18/0x24 print_unlock_imbalance_bug+0x130/0x148 lock_release+0x270/0x300 __mutex_unlock_slowpath+0x48/0x2cc mutex_unlock+0x20/0x2c gpi_alloc_chan_resources+0x108/0x5bc dma_chan_get+0x84/0x188 dma_get_slave_channel+0x5c/0x10c gpi_of_dma_xlate+0x110/0x1a0 of_dma_request_slave_channel+0x174/0x280 dma_request_chan+0x3c/0x2d4 geni_i2c_probe+0x544/0x63c platform_probe+0x68/0xc4 really_probe+0x148/0x2ac __driver_probe_device+0x78/0xe0 driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 device_add+0x60c/0x7d8 of_device_add+0x44/0x60 of_platform_device_create_pdata+0x90/0x124 of_platform_bus_create+0x15c/0x3c8 of_platform_populate+0x58/0xf8 devm_of_platform_populate+0x58/0xbc geni_se_probe+0xf0/0x164 platform_probe+0x68/0xc4 really_probe+0x148/0x2ac __driver_probe_device+0x78/0xe0 driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 deferred_probe_work_func+0x8c/0xc8 process_one_work+0x2bc/0x594 worker_thread+0x228/0x438 kthread+0x108/0x10c ret_from_fork+0x10/0x20 Fixes: `5d0c3533a1` ("dmaengine: qcom: Add GPI dma driver") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20230409233355.453741-1-dmitry.baryshkov@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 22:42:27 +05:30
Josh Poimboeuf	50f9a76ef1	iov_iter: Mark copy_compat_iovec_from_user() noinline After commit 6376ce56feb6 ("iov_iter: import single vector iovecs as ITER_UBUF"), GCC does an inter-procedural compiler optimization which moves the user_access_begin() out of copy_compat_iovec_from_user() and into its callers: lib/iov_iter.o: warning: objtool: .altinstr_replacement+0x0: redundant UACCESS disable lib/iov_iter.o: warning: objtool: iovec_from_user.part.0+0xc7: call to copy_compat_iovec_from_user.part.0() with UACCESS enabled lib/iov_iter.o: warning: objtool: __import_iovec+0x21d: call to copy_compat_iovec_from_user.part.0() with UACCESS enabled Enforce the "no UACCESS enable across function boundaries" rule by disabling cloning for copy_compat_iovec_from_user(). Fixes: 6376ce56feb6 ("iov_iter: import single vector iovecs as ITER_UBUF") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> https://lkml.kernel.org/lkml/20230327120017.6bb826d7@canb.auug.org.au Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-04-12 10:46:48 -06:00
Sinthu Raja	a010613237	phy: cadence: cdns-dphy-rx: Add common module reset support DPHY RX module has a common module reset (RSTB_CMN) which is expected to be released during configuration. In J721E SR1.0 the RSTB_CMN is internally tied to CSI_RX_RST and is hardware controlled, for all other newer platforms the common module reset is software controlled. Add support to control common module reset during configuration and also skip common module reset based on soc_device_match() for J721E SR1.0. Signed-off-by: Sinthu Raja <sinthu.raja@ti.com> Co-developed-by: Vaishnav Achath <vaishnav.a@ti.com> Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com> Reviewed-by: Jai Luthra <j-luthra@ti.com> Link: https://lore.kernel.org/r/20230314073137.2153-1-vaishnav.a@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 22:16:16 +05:30
Siddharth Vadapalli	ec318c51b6	phy: ti: j721e-wiz: Add SGMII support in WIZ driver for J721E Enable full rate divider configuration support for J721E_WIZ_16G for SGMII. Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/20230309092434.443550-1-s-vadapalli@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 22:15:27 +05:30
Benno Lossin	1944caa8e8	rust: sync: add functions for initializing `UniqueArc<MaybeUninit<T>>` Add two functions `init_with` and `pin_init_with` to `UniqueArc<MaybeUninit<T>>` to initialize the memory of already allocated `UniqueArc`s. This is useful when you want to allocate memory check some condition inside of a context where allocation is forbidden and then conditionally initialize an object. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-16-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	701608bd03	rust: sync: reduce stack usage of `UniqueArc::try_new_uninit` `UniqueArc::try_new_uninit` calls `Arc::try_new(MaybeUninit::uninit())`. This results in the uninitialized memory being placed on the stack, which may be arbitrarily large due to the generic `T` and thus could cause a stack overflow for large types. Change the implementation to use the pin-init API which enables in-place initialization. In particular it avoids having to first construct and then move the uninitialized memory from the stack into the final location. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-15-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	692e8935e2	rust: types: add `Opaque::ffi_init` This function allows to easily initialize `Opaque` with the pin-init API. `Opaque::ffi_init` takes a closure and returns a pin-initializer. This pin-initiailizer calls the given closure with a pointer to the inner `T`. Co-developed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20230408122429.1103522-14-y86-dev@protonmail.com [ Fixed typo. ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	8586f1acd3	rust: prelude: add `pin-init` API items to prelude Add `pin-init` API macros and traits to the prelude. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-13-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	38cde0bd7b	rust: init: add `Zeroable` trait and `init::zeroed` function Add the `Zeroable` trait which marks types that can be initialized by writing `0x00` to every byte of the type. Also add the `init::zeroed` function that creates an initializer for a `Zeroable` type that writes `0x00` to every byte. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-12-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	6841d45a30	rust: init: add `stack_pin_init!` macro The `stack_pin_init!` macro allows pin-initializing a value on the stack. It accepts a `impl PinInit<T, E>` to initialize a `T`. It allows propagating any errors via `?` or handling it normally via `match`. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://lore.kernel.org/r/20230408122429.1103522-11-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	d0fdc39612	rust: init: add `PinnedDrop` trait and macros The `PinnedDrop` trait that facilitates destruction of pinned types. It has to be implemented via the `#[pinned_drop]` macro, since the `drop` function should not be called by normal code, only by other destructors. It also only works on structs that are annotated with `#[pin_data(PinnedDrop)]`. Co-developed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-10-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	92c4a1e7e8	rust: init/sync: add `InPlaceInit` trait to pin-initialize smart pointers The `InPlaceInit` trait that provides two functions, for initializing using `PinInit<T, E>` and `Init<T>`. It is implemented by `Arc<T>`, `UniqueArc<T>` and `Box<T>`. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-9-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	fc6c6baa1f	rust: init: add initialization macros Add the following initializer macros: - `#[pin_data]` to annotate structurally pinned fields of structs, needed for `pin_init!` and `try_pin_init!` to select the correct initializer of fields. - `pin_init!` create a pin-initializer for a struct with the `Infallible` error type. - `try_pin_init!` create a pin-initializer for a struct with a custom error type (`kernel::error::Error` is the default). - `init!` create an in-place-initializer for a struct with the `Infallible` error type. - `try_init!` create an in-place-initializer for a struct with a custom error type (`kernel::error::Error` is the default). Also add their needed internal helper traits and structs. Co-developed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-8-y86-dev@protonmail.com [ Fixed three typos. ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	90e53c5e70	rust: add pin-init API core This API is used to facilitate safe pinned initialization of structs. It replaces cumbersome `unsafe` manual initialization with elegant safe macro invocations. Due to the size of this change it has been split into six commits: 1. This commit introducing the basic public interface: traits and functions to represent and create initializers. 2. Adds the `#[pin_data]`, `pin_init!`, `try_pin_init!`, `init!` and `try_init!` macros along with their internal types. 3. Adds the `InPlaceInit` trait that allows using an initializer to create an object inside of a `Box<T>` and other smart pointers. 4. Adds the `PinnedDrop` trait and adds macro support for it in the `#[pin_data]` macro. 5. Adds the `stack_pin_init!` macro allowing to pin-initialize a struct on the stack. 6. Adds the `Zeroable` trait and `init::zeroed` function to initialize types that have `0x00` in all bytes as a valid bit pattern. -- In this section the problem that the new pin-init API solves is outlined. This message describes the entirety of the API, not just the parts introduced in this commit. For a more granular explanation and additional information on pinning and this issue, view [1]. Pinning is Rust's way of enforcing the address stability of a value. When a value gets pinned it will be impossible for safe code to move it to another location. This is done by wrapping pointers to said object with `Pin<P>`. This wrapper prevents safe code from creating mutable references to the object, preventing mutable access, which is needed to move the value. `Pin<P>` provides `unsafe` functions to circumvent this and allow modifications regardless. It is then the programmer's responsibility to uphold the pinning guarantee. Many kernel data structures require a stable address, because there are foreign pointers to them which would get invalidated by moving the structure. Since these data structures are usually embedded in structs to use them, this pinning property propagates to the container struct. Resulting in most structs in both Rust and C code needing to be pinned. So if we want to have a `mutex` field in a Rust struct, this struct also needs to be pinned, because a `mutex` contains a `list_head`. Additionally initializing a `list_head` requires already having the final memory location available, because it is initialized by pointing it to itself. But this presents another challenge in Rust: values have to be initialized at all times. There is the `MaybeUninit<T>` wrapper type, which allows handling uninitialized memory, but this requires using the `unsafe` raw pointers and a casting the type to the initialized variant. This problem gets exacerbated when considering encapsulation and the normal safety requirements of Rust code. The fields of the Rust `Mutex<T>` should not be accessible to normal driver code. After all if anyone can modify the fields, there is no way to ensure the invariants of the `Mutex<T>` are upheld. But if the fields are inaccessible, then initialization of a `Mutex<T>` needs to be somehow achieved via a function or a macro. Because the `Mutex<T>` must be pinned in memory, the function cannot return it by value. It also cannot allocate a `Box` to put the `Mutex<T>` into, because that is an unnecessary allocation and indirection which would hurt performance. The solution in the rust tree (e.g. this commit: [2]) that is replaced by this API is to split this function into two parts: 1. A `new` function that returns a partially initialized `Mutex<T>`, 2. An `init` function that requires the `Mutex<T>` to be pinned and that fully initializes the `Mutex<T>`. Both of these functions have to be marked `unsafe`, since a call to `new` needs to be accompanied with a call to `init`, otherwise using the `Mutex<T>` could result in UB. And because calling `init` twice also is not safe. While `Mutex<T>` initialization cannot fail, other structs might also have to allocate memory, which would result in conditional successful initialization requiring even more manual accommodation work. Combine this with the problem of pin-projections -- the way of accessing fields of a pinned struct -- which also have an `unsafe` API, pinned initialization is riddled with `unsafe` resulting in very poor ergonomics. Not only that, but also having to call two functions possibly multiple lines apart makes it very easy to forget it outright or during refactoring. Here is an example of the current way of initializing a struct with two synchronization primitives (see [3] for the full example): struct SharedState { state_changed: CondVar, inner: Mutex<SharedStateInner>, } impl SharedState { fn try_new() -> Result<Arc<Self>> { let mut state = Pin::from(UniqueArc::try_new(Self { // SAFETY: `condvar_init!` is called below. state_changed: unsafe { CondVar::new() }, // SAFETY: `mutex_init!` is called below. inner: unsafe { Mutex::new(SharedStateInner { token_count: 0 }) }, })?); // SAFETY: `state_changed` is pinned when `state` is. let pinned = unsafe { state.as_mut().map_unchecked_mut(\|s\| &mut s.state_changed) }; kernel::condvar_init!(pinned, "SharedState::state_changed"); // SAFETY: `inner` is pinned when `state` is. let pinned = unsafe { state.as_mut().map_unchecked_mut(\|s\| &mut s.inner) }; kernel::mutex_init!(pinned, "SharedState::inner"); Ok(state.into()) } } The pin-init API of this patch solves this issue by providing a comprehensive solution comprised of macros and traits. Here is the example from above using the pin-init API: #[pin_data] struct SharedState { #[pin] state_changed: CondVar, #[pin] inner: Mutex<SharedStateInner>, } impl SharedState { fn new() -> impl PinInit<Self> { pin_init!(Self { state_changed <- new_condvar!("SharedState::state_changed"), inner <- new_mutex!( SharedStateInner { token_count: 0 }, "SharedState::inner", ), }) } } Notably the way the macro is used here requires no `unsafe` and thus comes with the usual Rust promise of safe code not introducing any memory violations. Additionally it is now up to the caller of `new()` to decide the memory location of the `SharedState`. They can choose at the moment `Arc<T>`, `Box<T>` or the stack. -- The API has the following architecture: 1. Initializer traits `PinInit<T, E>` and `Init<T, E>` that act like closures. 2. Macros to create these initializer traits safely. 3. Functions to allow manually writing initializers. The initializers (an `impl PinInit<T, E>`) receive a raw pointer pointing to uninitialized memory and their job is to fully initialize a `T` at that location. If initialization fails, they return an error (`E`) by value. This way of initializing cannot be safely exposed to the user, since it relies upon these properties outside of the control of the trait: - the memory location (slot) needs to be valid memory, - if initialization fails, the slot should not be read from, - the value in the slot should be pinned, so it cannot move and the memory cannot be deallocated until the value is dropped. This is why using an initializer is facilitated by another trait that ensures these requirements. These initializers can be created manually by just supplying a closure that fulfills the same safety requirements as `PinInit<T, E>`. But this is an `unsafe` operation. To allow safe initializer creation, the `pin_init!` is provided along with three other variants: `try_pin_init!`, `try_init!` and `init!`. These take a modified struct initializer as a parameter and generate a closure that initializes the fields in sequence. The macros take great care in upholding the safety requirements: - A shadowed struct type is used as the return type of the closure instead of `()`. This is to prevent early returns, as these would prevent full initialization. - To ensure every field is only initialized once, a normal struct initializer is placed in unreachable code. The type checker will emit errors if a field is missing or specified multiple times. - When initializing a field fails, the whole initializer will fail and automatically drop fields that have been initialized earlier. - Only the correct initializer type is allowed for unpinned fields. You cannot use a `impl PinInit<T, E>` to initialize a structurally not pinned field. To ensure the last point, an additional macro `#[pin_data]` is needed. This macro annotates the struct itself and the user specifies structurally pinned and not pinned fields. Because dropping a pinned struct is also not allowed to break the pinning invariants, another macro attribute `#[pinned_drop]` is needed. This macro is introduced in a following commit. These two macros also have mechanisms to ensure the overall safety of the API. Additionally, they utilize a combined proc-macro, declarative macro design: first a proc-macro enables the outer attribute syntax `#[...]` and does some important pre-parsing. Notably this prepares the generics such that the declarative macro can handle them using token trees. Then the actual parsing of the structure and the emission of code is handled by a declarative macro. For pin-projections the crates `pin-project` [4] and `pin-project-lite` [5] had been considered, but were ultimately rejected: - `pin-project` depends on `syn` [6] which is a very big dependency, around 50k lines of code. - `pin-project-lite` is a more reasonable 5k lines of code, but contains a very complex declarative macro to parse generics. On top of that it would require modification that would need to be maintained independently. Link: https://rust-for-linux.com/the-safe-pinned-initialization-problem [1] Link: `0a04dc4ddd` [2] Link: `f509ede33f/samples/rust/rust_miscdev.rs` [3] Link: https://crates.io/crates/pin-project [4] Link: https://crates.io/crates/pin-project-lite [5] Link: https://crates.io/crates/syn [6] Co-developed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Wedson Almeida Filho <wedsonaf@gmail.com> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20230408122429.1103522-7-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	3ff6e785ad	rust: types: add `Opaque::raw_get` This function mirrors `UnsafeCell::raw_get`. It avoids creating a reference and allows solely using raw pointers. The `pin-init` API will be using this, since uninitialized memory requires raw pointers. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20230408122429.1103522-6-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Benno Lossin	d6dbca3592	rust: sync: change error type of constructor functions Change the error type of the constructors of `Arc` and `UniqueArc` to be `AllocError` instead of `Error`. This makes the API more clear as to what can go wrong when calling `try_new` or its variants. Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://lore.kernel.org/r/20230408122429.1103522-4-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00
Gary Guo	70a21e54a4	rust: macros: add `quote!` macro Add the `quote!` macro for creating `TokenStream`s directly via the given Rust tokens. It also supports repetitions using iterators. It will be used by the pin-init API proc-macros to generate code. Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20230408122429.1103522-3-y86-dev@protonmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2023-04-12 18:41:05 +02:00

... 84 85 86 87 88 ...

1185157 Commits