linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-03 17:51:57 +09:00

Author	SHA1	Message	Date
Gao Xiang	cec6e93bea	erofs: support parsing big pcluster compress indexes When INCOMPAT_BIG_PCLUSTER sb feature is enabled, legacy compress indexes will also have the same on-disk header compact indexes to keep per-file configurations instead of leaving it zeroed. If ADVISE_BIG_PCLUSTER is set for a file, CBLKCNT will be loaded for each pcluster in this file by parsing 1st non-head lcluster. Link: https://lore.kernel.org/r/20210407043927.10623-8-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-10 03:20:18 +08:00
Gao Xiang	4fea63f7d7	erofs: adjust per-CPU buffers according to max_pclusterblks Adjust per-CPU buffers on demand since big pcluster definition is available. Also, bail out unsupported pcluster size according to Z_EROFS_PCLUSTER_MAX_SIZE. Link: https://lore.kernel.org/r/20210407043927.10623-7-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-10 03:20:17 +08:00
Gao Xiang	5404c33010	erofs: add big physical cluster definition Big pcluster indicates the size of compressed data for each physical pcluster is no longer fixed as block size, but could be more than 1 block (more accurately, 1 logical pcluster) When big pcluster feature is enabled for head0/1, delta0 of the 1st non-head lcluster index will keep block count of this pcluster in lcluster size instead of 1. Or, the compressed size of pcluster should be 1 lcluster if pcluster has no non-head lcluster index. Also note that BIG_PCLUSTER feature reuses COMPR_CFGS feature since it depends on COMPR_CFGS and will be released together. Link: https://lore.kernel.org/r/20210407043927.10623-6-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-10 03:20:17 +08:00
Gao Xiang	81382f5f5c	erofs: fix up inplace I/O pointer for big pcluster When picking up inplace I/O pages, it should be traversed in reverse order in aligned with the traversal order of file-backed online pages. Also, index should be updated together when preloading compressed pages. Previously, only page-sized pclustersize was supported so no problem at all. Also rename `compressedpages' to `icpage_ptr' to reflect its functionality. Link: https://lore.kernel.org/r/20210407043927.10623-5-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-10 03:20:16 +08:00
Gao Xiang	9f6cc76e6f	erofs: introduce physical cluster slab pools Since multiple pcluster sizes could be used at once, the number of compressed pages will become a variable factor. It's necessary to introduce slab pools rather than a single slab cache now. This limits the pclustersize to 1M (Z_EROFS_PCLUSTER_MAX_SIZE), and get rid of the obsolete EROFS_FS_CLUSTER_PAGE_LIMIT, which has no use now. Link: https://lore.kernel.org/r/20210407043927.10623-4-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-10 03:20:16 +08:00
Gao Xiang	524887347f	erofs: introduce multipage per-CPU buffers To deal with the cases in which inplace decompression is infeasible for some inplace I/O, per-CPU buffers were introduced to get rid of page allocation latency and thrash for low-latency decompression algorithms such as lz4. For the big pcluster feature, introduce multipage per-CPU buffers to keep such inplace I/O pclusters temporarily as well, but note that per-CPU pages are only virtually consecutive. When a new big pcluster fs is mounted, its max pclustersize will be read and per-CPU buffers can be grown if needed. Shrinking adjustable per-CPU buffers is more complex (because we don't know if such a size is still being used), so currently just release them all when unloading. Link: https://lore.kernel.org/r/20210409190630.19569-1-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-10 03:19:59 +08:00
Gao Xiang	54e0b6c873	erofs: reserve physical_clusterbits[] The formal big pcluster design is actually more powerful / flexible than the previous idea, whose pclustersize was fixed as power-of-2 blocks, which was obviously inefficient and space-wasting. Instead, pclustersize can now be set independently for each pcluster, so various pcluster sizes can also be used together in one file if mkfs wants (for example, according to data type and/or compression ratio). Let's get rid of previous physical_clusterbits[] setting (also notice that corresponding on-disk fields are still 0 for now). Therefore, head1/2 can be used for at most 2 different algorithms in one file and again pclustersize is now independent of these. Link: https://lore.kernel.org/r/20210407043927.10623-2-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-07 12:41:22 +08:00
Ruiqi Gong	fe6adcce7e	erofs: Clean up spelling mistakes found in fs/erofs zmap.c: s/correspoinding/corresponding zdata.c: s/endding/ending Link: https://lore.kernel.org/r/20210331093920.31923-1-gongruiqi1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-03 12:23:47 +08:00
Gao Xiang	14373711dd	erofs: add on-disk compression configurations Add a bitmap for available compression algorithms and a variable-sized on-disk table for compression options in preparation for upcoming big pcluster and LZMA algorithm, which follows the end of super block. To parse the compression options, the bitmap is scanned one by one. For each available algorithm, there is data followed by 2-byte `length' correspondingly (it's enough for most cases, or entire fs blocks should be used.) With such available algorithm bitmap, kernel itself can also refuse to mount such filesystem if any unsupported compression algorithm exists. Note that COMPR_CFGS feature will be enabled with BIG_PCLUSTER. Link: https://lore.kernel.org/r/20210329100012.12980-1-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 18:01:42 +08:00
Gao Xiang	46249cded1	erofs: introduce on-disk lz4 fs configurations Introduce z_erofs_lz4_cfgs to store all lz4 configurations. Currently it's only max_distance, but will be used for new features later. Link: https://lore.kernel.org/r/20210329012308.28743-4-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:24:58 +08:00
Huang Jianan	5d50538fc5	erofs: support adjust lz4 history window size lz4 uses LZ4_DISTANCE_MAX to record history preservation. When using rolling decompression, a block with a higher compression ratio will cause a larger memory allocation (up to 64k). It may cause a large resource burden in extreme cases on devices with small memory and a large number of concurrent IOs. So appropriately reducing this value can improve performance. Decreasing this value will reduce the compression ratio (except when input_size <LZ4_DISTANCE_MAX). But considering that erofs currently only supports 4k output, reducing this value will not significantly reduce the compression benefits. The maximum value of LZ4_DISTANCE_MAX defined by lz4 is 64k, and we can only reduce this value. For the old kernel, it just can't reduce the memory allocation during rolling decompression without affecting the decompression result. Link: https://lore.kernel.org/r/20210329012308.28743-3-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Guo Weichao <guoweichao@oppo.com> [ Gao Xiang: introduce struct erofs_sb_lz4_info for configurations. ] Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:24:58 +08:00
Gao Xiang	de06a6a375	erofs: introduce erofs_sb_has_xxx() helpers Introduce erofs_sb_has_xxx() to make long checks short, especially for later big pcluster & LZMA features. Link: https://lore.kernel.org/r/20210329012308.28743-2-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:24:57 +08:00
Gao Xiang	24a806d849	erofs: add unsupported inode i_format check If any unknown i_format fields are set (may be of some new incompat inode features), mark such inode as unsupported. Just in case of any new incompat i_format fields added in the future. Link: https://lore.kernel.org/r/20210329003614.6583-1-hsiangkao@aol.com Fixes: `431339ba90` ("staging: erofs: add inode operations") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:20:45 +08:00
Yue Hu	8137824edd	erofs: don't use erofs_map_blocks() any more Currently, erofs_map_blocks() will be called only from erofs_{bmap, read_raw_page} which are all for uncompressed files. So, the compression branch in erofs_map_blocks() is pointless. Let's remove it and use erofs_map_blocks_flatmode() directly. Also update related comments. Link: https://lore.kernel.org/r/20210325071008.573-1-zbestahu@gmail.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:19:53 +08:00
Gao Xiang	0b964600d3	erofs: complete a missing case for inplace I/O Add a missing case which could cause unnecessary page allocation but not directly use inplace I/O instead, which increases runtime extra memory footprint. The detail is, considering an online file-backed page, the right half of the page is chosen to be cached (e.g. the end page of a readahead request) and some of its data doesn't exist in managed cache, so the pcluster will be definitely kept in the submission chain. (IOWs, it cannot be decompressed without I/O, e.g., due to the bypass queue). Currently, DELAYEDALLOC/TRYALLOC cases can be downgraded as NOINPLACE, and stop online pages from inplace I/O. After this patch, unneeded page allocations won't be observed in pickup_page_for_submission() then. Link: https://lore.kernel.org/r/20210321183227.5182-1-hsiangkao@aol.com Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:18:01 +08:00
Huang Jianan	30048cdac4	erofs: use sync decompression for atomic contexts only Sync decompression was introduced to get rid of additional kworker scheduling overhead. But there is no such overhead in non-atomic contexts. Therefore, it should be better to turn off sync decompression to avoid the current thread waiting in z_erofs_runqueue. Link: https://lore.kernel.org/r/20210317035448.13921-3-huangjianan@oppo.com Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Guo Weichao <guoweichao@oppo.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:18:01 +08:00
Huang Jianan	648f2de053	erofs: use workqueue decompression for atomic contexts only z_erofs_decompressqueue_endio may not be executed in the atomic context, for example, when dm-verity is turned on. In this scenario, data can be decompressed directly to get rid of additional kworker scheduling overhead. Link: https://lore.kernel.org/r/20210317035448.13921-2-huangjianan@oppo.com Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Guo Weichao <guoweichao@oppo.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:18:00 +08:00
Huang Jianan	b4892fa3e7	erofs: avoid memory allocation failure during rolling decompression Currently, err would be treated as io error. Therefore, it'd be better to ensure memory allocation during rolling decompression to avoid such io error. In the long term, we might consider adding another !Uptodate case for such case. Link: https://lore.kernel.org/r/20210316031515.90954-1-huangjianan@oppo.com Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Guo Weichao <guoweichao@oppo.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:18:00 +08:00
Linus Torvalds	a5e13c6df0	Linux 5.12-rc5	2021-03-28 15:48:16 -07:00
Linus Torvalds	f9e2bb42cf	Merge tag 'perf-tools-fixes-for-v5.12-2020-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tooling fixes from Arnaldo Carvalho de Melo: - Avoid write of uninitialized memory when generating PERF_RECORD_MMAP* records. - Fix 'perf top' BPF support related crash with perf_event_paranoid=3 + kptr_restrict. - Validate raw event with sysfs exported format bits. - Fix waipid on SIGCHLD delivery bugs in 'perf daemon'. - Change to use bash for daemon test on Debian, where the default is dash and thus fails for use of bashisms in this test. - Fix memory leak in vDSO found using ASAN. - Remove now useless (due to the fact that BPF now supports static vars) failing sub test "BPF relocation checker". - Fix auxtrace queue conflict. - Sync linux/kvm.h with the kernel sources. * tag 'perf-tools-fixes-for-v5.12-2020-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf test: Change to use bash for daemon test perf record: Fix memory leak in vDSO found using ASAN perf test: Remove now useless failing sub test "BPF relocation checker" perf daemon: Return from kill functions perf daemon: Force waipid for all session on SIGCHLD delivery perf top: Fix BPF support related crash with perf_event_paranoid=3 + kptr_restrict perf pmu: Validate raw event with sysfs exported format bits perf synthetic events: Avoid write of uninitialized memory when generating PERF_RECORD_MMAP* records tools headers UAPI: Sync linux/kvm.h with the kernel sources perf synthetic-events: Fix uninitialized 'kernel_thread' variable perf auxtrace: Fix auxtrace queue conflict	2021-03-28 13:22:54 -07:00
Linus Torvalds	3fef15f872	Merge tag 'auxdisplay-for-linus-v5.12-rc6' of git://github.com/ojeda/linux Pull auxdisplay fix from Miguel Ojeda: "Remove in_interrupt() usage (Sebastian Andrzej Siewior)" * tag 'auxdisplay-for-linus-v5.12-rc6' of git://github.com/ojeda/linux: auxdisplay: Remove in_interrupt() usage.	2021-03-28 13:20:38 -07:00
Linus Torvalds	36a14638f7	Merge tag 'x86-urgent-2021-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two fixes: - Fix build failure on Ubuntu with new GCC packages that turn on -fcf-protection - Fix SME memory encryption PTE encoding bug - AFAICT the code worked on 4K page sizes (level 1) but had the wrong shift at higher page level orders (level 2 and higher)" * tag 'x86-urgent-2021-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Turn off -fcf-protection for realmode targets x86/mem_encrypt: Correct physical address calculation in __set_clr_pte_enc()	2021-03-28 12:19:16 -07:00
Linus Torvalds	47fbbc94da	Merge tag 'locking-urgent-2021-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix the non-debug mutex_lock_io_nested() method to map to mutex_lock_io() instead of mutex_lock(). Right now nothing uses this API explicitly, but this is an accident waiting to happen" * tag 'locking-urgent-2021-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/mutex: Fix non debug version of mutex_lock_io_nested()	2021-03-28 12:12:22 -07:00
Linus Torvalds	81b1d39fd3	Merge tag '5.12-rc4-smb3' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Five cifs/smb3 fixes, two for stable. Includes an important fix for encryption and an ACL fix, as well as a fix for possible reflink data corruption" * tag '5.12-rc4-smb3' of git://git.samba.org/sfrench/cifs-2.6: smb3: fix cached file size problems in duplicate extents (reflink) cifs: Silently ignore unknown oplock break handle cifs: revalidate mapping when we open files for SMB1 POSIX cifs: Fix chmod with modefromsid when an older ACE already exists. cifs: Adjust key sizes and key generation routines for AES256 encryption	2021-03-28 12:06:21 -07:00
Linus Torvalds	b44d1ddcf8	Merge tag 'io_uring-5.12-2021-03-27' of git://git.kernel.dk/linux-block Pull io_uring fixes from Jens Axboe: - Use thread info versions of flag testing, as discussed last week. - The series enabling PF_IO_WORKER to just take signals, instead of needing to special case that they do not in a bunch of places. Ends up being pretty trivial to do, and then we can revert all the special casing we're currently doing. - Kill dead pointer assignment - Fix hashed part of async work queue trace - Fix sign extension issue for IORING_OP_PROVIDE_BUFFERS - Fix a link completion ordering regression in this merge window - Cancellation fixes * tag 'io_uring-5.12-2021-03-27' of git://git.kernel.dk/linux-block: io_uring: remove unsued assignment to pointer io io_uring: don't cancel extra on files match io_uring: don't cancel-track common timeouts io_uring: do post-completion chore on t-out cancel io_uring: fix timeout cancel return code Revert "signal: don't allow STOP on PF_IO_WORKER threads" Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing" Revert "kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals" Revert "signal: don't allow sending any signals to PF_IO_WORKER threads" kernel: stop masking signals in create_io_thread() io_uring: handle signals for IO threads like a normal thread kernel: don't call do_exit() for PF_IO_WORKER threads io_uring: maintain CQE order of a failed link io-wq: fix race around pending work on teardown io_uring: do ctx sqd ejection in a clear context io_uring: fix provide_buffers sign extension io_uring: don't skip file_end_write() on reissue io_uring: correct io_queue_async_work() traces io_uring: don't use {test,clear}_tsk_thread_flag() for current	2021-03-28 11:42:05 -07:00
Linus Torvalds	abed516ecd	Merge tag 'block-5.12-2021-03-27' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - Fix regression from this merge window with the xarray partition change, which allowed partition counts that overflow the u8 that holds the partition number (Ming) - Fix zone append warning (Johannes) - Segmentation count fix for multipage bvecs (David) - Partition scan fix (Chris) * tag 'block-5.12-2021-03-27' of git://git.kernel.dk/linux-block: block: don't create too many partitions block: support zone append bvecs block: recalculate segment count for multi-segment discards correctly block: clear GD_NEED_PART_SCAN later in bdev_disk_changed	2021-03-28 11:37:42 -07:00
Linus Torvalds	e8cfe8fa22	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Seven fixes, all in drivers (qla2xxx, mkt3sas, qedi, target, ibmvscsi). The most serious are the target pscsi oom and the qla2xxx revert which can otherwise cause a use after free" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: pscsi: Clean up after failure in pscsi_map_sg() scsi: target: pscsi: Avoid OOM in pscsi_map_sg() scsi: mpt3sas: Fix error return code of mpt3sas_base_attach() scsi: qedi: Fix error return code of qedi_alloc_global_queues() scsi: Revert "qla2xxx: Make sure that aborted commands are freed" scsi: ibmvfc: Make ibmvfc_wait_for_ops() MQ aware scsi: ibmvfc: Fix potential race in ibmvfc_wait_for_ops()	2021-03-28 11:34:47 -07:00
Colin Ian King	2b8ed1c941	io_uring: remove unsued assignment to pointer io There is an assignment to io that is never read after the assignment, the assignment is redundant and can be removed. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:11 -06:00
Pavel Begunkov	78d9d7c2a3	io_uring: don't cancel extra on files match As tasks always wait and kill their io-wq on exec/exit, files are of no more concern to us, so we don't need to specifically cancel them by hand in those cases. Moreover we should not, because io_match_task() looks at req->task->files now, which is always true and so leads to extra cancellations, that wasn't a case before per-task io-wq. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0566c1de9b9dd417f5de345c817ca953580e0e2e.1616696997.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:11 -06:00
Pavel Begunkov	2482b58ffb	io_uring: don't cancel-track common timeouts Don't account usual timeouts (i.e. not linked) as REQ_F_INFLIGHT but keep behaviour prior to `dd59a3d595` ("io_uring: reliably cancel linked timeouts"). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/104441ef5d97e3932113d44501fda0df88656b83.1616696997.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:11 -06:00
Pavel Begunkov	80c4cbdb5e	io_uring: do post-completion chore on t-out cancel Don't forget about io_commit_cqring() + io_cqring_ev_posted() after exit/exec cancelling timeouts. Both functions declared only after io_kill_timeouts(), so to avoid tons of forward declarations move it down. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/72ace588772c0f14834a6a4185d56c445a366fb4.1616696997.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:11 -06:00
Pavel Begunkov	1ee4160c73	io_uring: fix timeout cancel return code When we cancel a timeout we should emit a sensible return code, like -ECANCELED but not 0, otherwise it may trick users. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7b0ad1065e3bd1994722702bd0ba9e7bc9b0683b.1616696997.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:11 -06:00
Jens Axboe	1e4cf0d3d0	Revert "signal: don't allow STOP on PF_IO_WORKER threads" This reverts commit `4db4b1a0d1`. The IO threads allow and handle SIGSTOP now, so don't special case them anymore in task_set_jobctl_pending(). Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:11 -06:00
Jens Axboe	d3dc04cd81	Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing" This reverts commit `15b2219fac`. Before IO threads accepted signals, the freezer using take signals to wake up an IO thread would cause them to loop without any way to clear the pending signal. That is no longer the case, so stop special casing PF_IO_WORKER in the freezer. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:10 -06:00
Jens Axboe	e8b33b8cfa	Revert "kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals" This reverts commit `6fb8f43ced`. The IO threads do allow signals now, including SIGSTOP, and we can allow ptrace attach. Attaching won't reveal anything interesting for the IO threads, but it will allow eg gdb to attach to a task with io_urings and IO threads without complaining. And once attached, it will allow the usual introspection into regular threads. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:10 -06:00
Jens Axboe	5a842a7448	Revert "signal: don't allow sending any signals to PF_IO_WORKER threads" This reverts commit `5be28c8f85`. IO threads now take signals just fine, so there's no reason to limit them specifically. Revert the change that prevented that from happening. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:10 -06:00
Jens Axboe	b16b3855d8	kernel: stop masking signals in create_io_thread() This is racy - move the blocking into when the task is created and we're marking it as PF_IO_WORKER anyway. The IO threads are now prepared to handle signals like SIGSTOP as well, so clear that from the mask to allow proper stopping of IO threads. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:10 -06:00
Jens Axboe	dbe1bdbb39	io_uring: handle signals for IO threads like a normal thread We go through various hoops to disallow signals for the IO threads, but there's really no reason why we cannot just allow them. The IO threads never return to userspace like a normal thread, and hence don't go through normal signal processing. Instead, just check for a pending signal as part of the work loop, and call get_signal() to handle it for us if anything is pending. With that, we can support receiving signals, including special ones like SIGSTOP. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 14:09:07 -06:00
Ming Lei	e82fc78557	block: don't create too many partitions Commit `a33df75c63` ("block: use an xarray for disk->part_tbl") drops the check on max supported number of partitions, and allows partitions with bigger partition numbers to be added. However, ->bd_partno is defined as u8, so partition index of xarray table may not match with ->bd_partno. Then delete_partition() may delete one unmatched partition, and cause use-after-free. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reported-by: syzbot+8fede7e30c7cee0de139@syzkaller.appspotmail.com Fixes: `a33df75c63` ("block: use an xarray for disk->part_tbl") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-27 09:22:18 -06:00
Steve French	cfc63fc812	smb3: fix cached file size problems in duplicate extents (reflink) There were two problems (one of which could cause data corruption) that were noticed with duplicate extents (ie reflink) when debugging why various xfstests were being incorrectly skipped (e.g. generic/138, generic/140, generic/142). First, we were not updating the file size locally in the cache when extending a file due to reflink (it would refresh after actimeo expires) but xfstest was checking the size immediately which was still 0 so caused the test to be skipped. Second, we were setting the target file size (which could shrink the file) in all cases to the end of the reflinked range rather than only setting the target file size when reflink would extend the file. CC: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-03-26 18:41:55 -05:00
Vincent Whitchurch	219481a8f9	cifs: Silently ignore unknown oplock break handle Make SMB2 not print out an error when an oplock break is received for an unknown handle, similar to SMB1. The debug message which is printed for these unknown handles may also be misleading, so fix that too. The SMB2 lease break path is not affected by this patch. Without this, a program which writes to a file from one thread, and opens, reads, and writes the same file from another thread triggers the below errors several times a minute when run against a Samba server configured with "smb2 leases = no". CIFS: VFS: \\192.168.0.1 No task to wake, unknown frame received! NumMids 2 00000000: 424d53fe 00000040 00000000 00000012 .SMB@........... 00000010: 00000001 00000000 ffffffff ffffffff ................ 00000020: 00000000 00000000 00000000 00000000 ................ 00000030: 00000000 00000000 00000000 00000000 ................ Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Reviewed-by: Tom Talpey <tom@talpey.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-03-26 18:05:26 -05:00
Ronnie Sahlberg	cee8f4f6fc	cifs: revalidate mapping when we open files for SMB1 POSIX RHBZ: 1933527 Under SMB1 + POSIX, if an inode is reused on a server after we have read and cached a part of a file, when we then open the new file with the re-cycled inode there is a chance that we may serve the old data out of cache to the application. This only happens for SMB1 (deprecated) and when posix are used. The simplest solution to avoid this race is to force a revalidate on smb1-posix open. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-03-26 18:04:58 -05:00
Shyam Prasad N	3bffbe9e0b	cifs: Fix chmod with modefromsid when an older ACE already exists. My recent fixes to cifsacl to maintain inherited ACEs had regressed modefromsid when an older ACL already exists. Found testing xfstest 495 with modefromsid mount option Fixes: `f506550889` ("cifs: Retain old ACEs when converting between mode bits and ACL") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-03-26 18:04:35 -05:00
Jens Axboe	10442994ba	kernel: don't call do_exit() for PF_IO_WORKER threads Right now we're never calling get_signal() from PF_IO_WORKER threads, but in preparation for doing so, don't handle a fatal signal for them. The workers have state they need to cleanup when exiting, so just return instead of calling do_exit() on their behalf. The threads themselves will detect a fatal signal and do proper shutdown. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-03-26 16:10:14 -06:00
Linus Torvalds	0f4498cef9	Merge tag 'for-5.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM verity target's optional argument processing. - Fix DM core's zoned model and zone sectors checks. - Fix spurious "detected capacity change" pr_info() when creating new DM device. - Fix DM ioctl out of bounds array access in handling of DM_LIST_DEVICES_CMD when no devices exist. * tag 'for-5.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm ioctl: fix out of bounds array access when no devices dm: don't report "detected capacity change" on device creation dm table: Fix zoned model check and zone sectors check dm verity: fix DM_VERITY_OPTS_MAX value	2021-03-26 12:21:05 -07:00
Mikulas Patocka	4edbe1d7bc	dm ioctl: fix out of bounds array access when no devices If there are not any dm devices, we need to zero the "dev" argument in the first structure dm_name_list. However, this can cause out of bounds write, because the "needed" variable is zero and len may be less than eight. Fix this bug by reporting DM_BUFFER_FULL_FLAG if the result buffer is too small to hold the "nl->dev" value. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-03-26 14:51:50 -04:00
Linus Torvalds	7931c531fc	Merge tag 'acpi-5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a memory management regression in ACPICA, repair an ACPI blacklist entry damaged inadvertently during the 5.11 cycle and fix the bookkeeping of devices with the same primary device ID in the ACPI core. Specifics: - Make ACPICA use the same object cache consistently when allocating and freeing objects (Vegard Nossum) - Add a callback pointer removed inadvertently during the 5.11 cycle to the ACPI backlight blacklist entry for Sony VPCEH3U1E (Chris Chiu) - Make the ACPI device enumeration core use IDA for creating names of ACPI device objects with the same primary device ID to avoid using duplicate device object names in some cases (Andy Shevchenko)" * tag 'acpi-5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPICA: Always create namespace nodes using acpi_ns_create_node() ACPI: scan: Use unique number for instance_no ACPI: video: Add missing callback back for Sony VPCEH3U1E	2021-03-26 11:33:39 -07:00
Linus Torvalds	8a3cbdda18	Merge tag 'pm-5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix an issue related to device links in the runtime PM framework and debugfs usage in the Energy Model code. Specifics: - Modify the runtime PM device suspend to avoid suspending supplier devices before the consumer device's status changes to RPM_SUSPENDED (Rafael Wysocki) - Change the Energy Model code to prevent it from attempting to create its main debugfs directory too early (Lukasz Luba)" * tag 'pm-5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: EM: postpone creating the debugfs dir till fs_initcall PM: runtime: Defer suspending suppliers	2021-03-26 11:29:36 -07:00
Linus Torvalds	eb3991ef2c	Merge tag 'soc-fixes-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "Too many fixes have accumulated in the soc tree, so this is a fairly large set. As usual, most of the fixes are for devicetree files, but there are also notable code changes for imx and omap regressions as well as some maintainer file updates. imx: - Fix an Ethernet issue on imx6ul-14x14-evk board that is caused by independent PHY reset. - Add missing `dma-coherent` property for LayerScape device trees to fix a kernel BUG report. - Use IRQCHIP_DECLARE for AVIC driver to fix a boot issue on i.MX25 with fw_devlink=on. - Add missing I2C pinctrl entry for imx8mp-phyboard-pollux-rdk board to fix the broken I2C GPIO recovery support. - Add `fsl,use-minimum-ecc` property for imx6ull-myir-mys-6ulx-eval device tree to fix UBI filesystem mount failure. at91: - wrong phy address that blocks Ethernet use on boards with sama5d27 SoM1 - restrictive pin possibilities for sam9x60 omap: - Fix ocp interconnect bus access error reporting for omap_l3_noc by setting IRQF_NO_THREAD - Fix changed mmc slot order regression by adding mmc aliases for am335x - Fix dra7 reboot regression caused by invalid pcie reset map - Fix smartreflex init regression caused by dropped legacy data - Fix ti-sysc driver warning on unbind if reset is not deasserted - Fix flakey reset deassert for dra7 iva stm32: - MAINTAINER file updates broadcom: - brcmstb SoC ID build fix - MAINTAINER file updates" * tag 'soc-fixes-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: MAINTAINERS: Add Alain Volmat as STM32 I2C/SMBUS maintainer MAINTAINERS: Remove Vincent Abriou for STM/STI DRM drivers. MAINTAINERS: Update some st.com email addresses to foss.st.com ARM: dts: imx6ull: fix ubi filesystem mount failed ARM: imx6ul-14x14-evk: Do not reset the Ethernet PHYs independently arm64: dts: imx8mp-phyboard-pollux-rdk: Add missing pinctrl entry arm64: dts: ls1012a: mark crypto engine dma coherent arm64: dts: ls1043a: mark crypto engine dma coherent arm64: dts: ls1046a: mark crypto engine dma coherent ARM: imx: avic: Convert to using IRQCHIP_DECLARE ARM: dts: at91: sam9x60: fix mux-mask to match product's datasheet ARM: dts: at91: sam9x60: fix mux-mask for PA7 so it can be set to A, B and C ARM: dts: at91-sama5d27_som1: fix phy address to 7 soc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva bus: ti-sysc: Fix warning on unbind if reset is not deasserted ARM: OMAP2+: Fix smartreflex init regression after dropping legacy data soc: ti: omap-prm: Fix reboot issue with invalid pcie reset map for dra7 MAINTAINERS: rectify BROADCOM PMB (POWER MANAGEMENT BUS) DRIVER ARM: dts: am33xx: add aliases for mmc interfaces bus: omap_l3_noc: mark l3 irqs as IRQF_NO_THREAD	2021-03-26 11:19:38 -07:00
Linus Torvalds	6c20f6df61	Merge tag 'for-linus-5.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains a small series with a more elegant fix of a problem which was originally fixed in rc2" * tag 'for-linus-5.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Revert "xen: fix p2m size in dom0 for disabled memory hotplug case" xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUG	2021-03-26 11:15:25 -07:00

997556 Commits