linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-17 19:12:35 +09:00

Author	SHA1	Message	Date
Chao Yu	b5f4684b5f	f2fs: remove redundant compress inode check due to f2fs_post_read_required() has did that. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-05-08 06:55:56 -07:00
Eric Biggers	3c57f75182	f2fs: use strcmp() in parse_options() Remove the pointless string length checks. Just use strcmp(). Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-05-08 06:55:56 -07:00
Nishad Kamdar	d29fbcdb05	f2fs: Use the correct style for SPDX License Identifier This patch corrects the SPDX License Identifier style in header files related to F2FS File System support. For C header files Documentation/process/license-rules.rst mandates C-like comments (opposed to C source files where C++ style should be used). Changes made by using a script provided by Joe Perches here: https://lkml.org/lkml/2019/2/7/46. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-05-08 06:55:55 -07:00
Andreas Gruenbacher	aa83da7f47	gfs2: More gfs2_find_jhead fixes It turns out that when extending an existing bio, gfs2_find_jhead fails to check if the block number is consecutive, which leads to incorrect reads for fragmented journals. In addition, limit the maximum bio size to an arbitrary value of 2 megabytes: since commit `07173c3ec2` ("block: enable multipage bvecs"), if we just keep adding pages until bio_add_page fails, bios will grow much larger than useful, which pins more memory than necessary with barely any additional performance gains. Fixes: `f4686c26ec` ("gfs2: read journal in large chunks") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2020-05-08 15:15:12 +02:00
Andreas Gruenbacher	566a2ab3c9	gfs2: Another gfs2_walk_metadata fix Make sure we don't walk past the end of the metadata in gfs2_walk_metadata: the inode holds fewer pointers than indirect blocks. Slightly clean up gfs2_iomap_get. Fixes: `a27a0c9b6a` ("gfs2: gfs2_walk_metadata fix") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2020-05-08 15:15:12 +02:00
Bob Peterson	d22f69a08d	gfs2: Fix use-after-free in gfs2_logd after withdraw When the gfs2_logd daemon withdrew, the withdraw sequence called into make_fs_ro() to make the file system read-only. That caused the journal descriptors to be freed. However, those journal descriptors were used by gfs2_logd's call to gfs2_ail_flush_reqd(). This caused a use-after free and NULL pointer dereference. This patch changes function gfs2_logd() so that it stops all logd work until the thread is told to stop. Once a withdraw is done, it only does an interruptible sleep. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-05-08 15:15:12 +02:00
Bob Peterson	53af80ce0e	gfs2: Fix BUG during unmount after file system withdraw Before this patch, when the logd daemon was forced to withdraw, it would try to request its journal be recovered by another cluster node. However, in single-user cases with lock_nolock, there are no other nodes to recover the journal. Function signal_our_withdraw() was recognizing the lock_nolock situation, but not until after it had evicted its journal inode. Since the journal descriptor that points to the inode was never removed from the master list, when the unmount occurred, it did another iput on the evicted inode, which resulted in a BUG_ON(inode->i_state & I_CLEAR). This patch moves the check for this situation earlier in function signal_our_withdraw(), which avoids the extra iput, so the unmount may happen normally. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-05-08 15:13:27 +02:00
Bob Peterson	a8b7528b69	gfs2: Fix error exit in do_xmote Before this patch, if an error was detected from glock function go_sync by function do_xmote, it would return. But the function had temporarily unlocked the gl_lockref spin_lock, and it never re-locked it. When the caller of do_xmote tried to unlock it again, it was already unlocked, which resulted in a corrupted spin_lock value. This patch makes sure the gl_lockref spin_lock is re-locked after it is unlocked. Thanks to Wu Bo <wubo40@huawei.com> for reporting this problem. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-05-08 14:45:38 +02:00
Gustavo A. R. Silva	374ad001f7	fanotify: Replace zero-length array with flexible-array The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Link: https://lore.kernel.org/r/20200507185230.GA14229@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>	2020-05-08 10:38:12 +02:00
Eric Biggers	2aaba014b5	crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h <linux/cryptohash.h> sounds very generic and important, like it's the header to include if you're doing cryptographic hashing in the kernel. But actually it only includes the library implementation of the SHA-1 compression function (not even the full SHA-1). This should basically never be used anymore; SHA-1 is no longer considered secure, and there are much better ways to do cryptographic hashing in the kernel. Most files that include this header don't actually need it. So in preparation for removing it, remove all these unneeded includes of it. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2020-05-08 15:32:17 +10:00
Eric Biggers	f80df38512	ubifs: use crypto_shash_tfm_digest() Instead of manually allocating a 'struct shash_desc' on the stack and calling crypto_shash_digest(), switch to using the new helper function crypto_shash_tfm_digest() which does this for us. Cc: linux-mtd@lists.infradead.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2020-05-08 15:32:15 +10:00
Eric Biggers	ea794db264	nfsd: use crypto_shash_tfm_digest() Instead of manually allocating a 'struct shash_desc' on the stack and calling crypto_shash_digest(), switch to using the new helper function crypto_shash_tfm_digest() which does this for us. Cc: linux-nfs@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2020-05-08 15:32:15 +10:00
Eric Biggers	1979811388	ecryptfs: use crypto_shash_tfm_digest() Instead of manually allocating a 'struct shash_desc' on the stack and calling crypto_shash_digest(), switch to using the new helper function crypto_shash_tfm_digest() which does this for us. Cc: ecryptfs@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2020-05-08 15:32:15 +10:00
Eric Biggers	3e185a56eb	fscrypt: use crypto_shash_tfm_digest() Instead of manually allocating a 'struct shash_desc' on the stack and calling crypto_shash_digest(), switch to using the new helper function crypto_shash_tfm_digest() which does this for us. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2020-05-08 15:32:14 +10:00
Roman Penyaev	412895f03c	epoll: atomically remove wait entry on wake up This patch does two things: - fixes a lost wakeup introduced by commit `339ddb53d3` ("fs/epoll: remove unnecessary wakeups of nested epoll") - improves performance for events delivery. The description of the problem is the following: if N (>1) threads are waiting on ep->wq for new events and M (>1) events come, it is quite likely that >1 wakeups hit the same wait queue entry, because there is quite a big window between __add_wait_queue_exclusive() and the following __remove_wait_queue() calls in ep_poll() function. This can lead to lost wakeups, because thread, which was woken up, can handle not all the events in ->rdllist. (in better words the problem is described here: https://lkml.org/lkml/2019/10/7/905) The idea of the current patch is to use init_wait() instead of init_waitqueue_entry(). Internally init_wait() sets autoremove_wake_function as a callback, which removes the wait entry atomically (under the wq locks) from the list, thus the next coming wakeup hits the next wait entry in the wait queue, thus preventing lost wakeups. Problem is very well reproduced by the epoll60 test case [1]. Wait entry removal on wakeup has also performance benefits, because there is no need to take a ep->lock and remove wait entry from the queue after the successful wakeup. Here is the timing output of the epoll60 test case: With explicit wakeup from ep_scan_ready_list() (the state of the code prior `339ddb53d3`): real 0m6.970s user 0m49.786s sys 0m0.113s After this patch: real 0m5.220s user 0m36.879s sys 0m0.019s The other testcase is the stress-epoll [2], where one thread consumes all the events and other threads produce many events: With explicit wakeup from ep_scan_ready_list() (the state of the code prior `339ddb53d3`): threads events/ms run-time ms 8 5427 1474 16 6163 2596 32 6824 4689 64 7060 9064 128 6991 18309 After this patch: threads events/ms run-time ms 8 5598 1429 16 7073 2262 32 7502 4265 64 7640 8376 128 7634 16767 (number of "events/ms" represents event bandwidth, thus higher is better; number of "run-time ms" represents overall time spent doing the benchmark, thus lower is better) [1] tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c [2] https://github.com/rouming/test-tools/blob/master/stress-epoll.c Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jason Baron <jbaron@akamai.com> Cc: Khazhismel Kumykov <khazhy@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Heiher <r@hev.cc> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200430130326.1368509-2-rpenyaev@suse.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-05-07 19:27:21 -07:00
Khazhismel Kumykov	0c54a6a44b	eventpoll: fix missing wakeup for ovflist in ep_poll_callback In the event that we add to ovflist, before commit `339ddb53d3` ("fs/epoll: remove unnecessary wakeups of nested epoll") we would be woken up by ep_scan_ready_list, and did no wakeup in ep_poll_callback. With that wakeup removed, if we add to ovflist here, we may never wake up. Rather than adding back the ep_scan_ready_list wakeup - which was resulting in unnecessary wakeups, trigger a wake-up in ep_poll_callback. We noticed that one of our workloads was missing wakeups starting with `339ddb53d3` and upon manual inspection, this wakeup seemed missing to me. With this patch added, we no longer see missing wakeups. I haven't yet tried to make a small reproducer, but the existing kselftests in filesystem/epoll passed for me with this patch. [khazhy@google.com: use if/elif instead of goto + cleanup suggested by Roman] Link: http://lkml.kernel.org/r/20200424190039.192373-1-khazhy@google.com Fixes: `339ddb53d3` ("fs/epoll: remove unnecessary wakeups of nested epoll") Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Roman Penyaev <rpenyaev@suse.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Roman Penyaev <rpenyaev@suse.de> Cc: Heiher <r@hev.cc> Cc: Jason Baron <jbaron@akamai.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200424025057.118641-1-khazhy@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-05-07 19:27:20 -07:00
Eric W. Biederman	2388777a0a	exec: Rename flush_old_exec begin_new_exec There is and has been for a very long time been a lot more going on in flush_old_exec than just flushing the old state. After the movement of code from setup_new_exec there is a whole lot more going on than just flushing the old executables state. Rename flush_old_exec to begin_new_exec to more accurately reflect what this function does. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-07 16:55:47 -05:00
Eric W. Biederman	df9e4d2c4a	exec: Move most of setup_new_exec into flush_old_exec The current idiom for the callers is: flush_old_exec(bprm); set_personality(...); setup_new_exec(bprm); In 2010 Linus split flush_old_exec into flush_old_exec and setup_new_exec. With the intention that setup_new_exec be what is called after the processes new personality is set. Move the code that doesn't depend upon the personality from setup_new_exec into flush_old_exec. This is to facilitate future changes by having as much code together in one function as possible. To see why it is safe to move this code please note that effectively this change moves the personality setting in the binfmt and the following three lines of code after everything except unlocking the mutexes: arch_pick_mmap_layout arch_setup_new_exec mm->task_size = TASK_SIZE The function arch_pick_mmap_layout at most sets: mm->get_unmapped_area mm->mmap_base mm->mmap_legacy_base mm->mmap_compat_base mm->mmap_compat_legacy_base which nothing in flush_old_exec or setup_new_exec depends on. The function arch_setup_new_exec only sets architecture specific state and the rest of the functions only deal in state that applies to all architectures. The last line just sets mm->task_size and again nothing in flush_old_exec or setup_new_exec depend on task_size. Ref: `221af7f87b` ("Split 'flush_old_exec' into two functions") Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-07 16:55:47 -05:00
Eric W. Biederman	7d503feba0	exec: In setup_new_exec cache current in the local variable me At least gcc 8.3 when generating code for x86_64 has a hard time consolidating multiple calls to current aka get_current(), and winds up unnecessarily rereading %gs:current_task several times in setup_new_exec. Caching the value of current in the local variable of me generates slightly better and shorter assembly. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-07 16:55:47 -05:00
Eric W. Biederman	96ecee29b0	exec: Merge install_exec_creds into setup_new_exec The two functions are now always called one right after the other so merge them together to make future maintenance easier. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-07 16:55:47 -05:00
Eric W. Biederman	1507b7a30a	exec: Rename the flag called_exec_mmap point_of_no_return Update the comments and make the code easier to understand by renaming this flag. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-07 16:55:47 -05:00
Eric W. Biederman	89826cce37	exec: Make unlocking exec_update_mutex explict With install_exec_creds updated to follow immediately after setup_new_exec, the failure of unshare_sighand is the only code path where exec_update_mutex is held but not explicitly unlocked. Update that code path to explicitly unlock exec_update_mutex. Remove the unlocking of exec_update_mutex from free_bprm. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-07 16:55:47 -05:00
Eric W. Biederman	e7f7785449	binfmt: Move install_exec_creds after setup_new_exec to match binfmt_elf In 2016 Linus moved install_exec_creds immediately after setup_new_exec, in binfmt_elf as a cleanup and as part of closing a potential information leak. Perform the same cleanup for the other binary formats. Different binary formats doing the same things the same way makes exec easier to reason about and easier to maintain. Greg Ungerer reports: > I tested the the whole series on non-MMU m68k and non-MMU arm > (exercising binfmt_flat) and it all tested out with no problems, > so for the binfmt_flat changes: Tested-by: Greg Ungerer <gerg@linux-m68k.org> Ref: `9f834ec18d` ("binfmt_elf: switch to new creds when switching to new mm") Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-07 16:54:27 -05:00
Jens Axboe	63ff822358	io_uring: don't use 'fd' for openat/openat2/statx We currently make some guesses as when to open this fd, but in reality we have no business (or need) to do so at all. In fact, it makes certain things fail, like O_PATH. Remove the fd lookup from these opcodes, we're just passing the 'fd' to generic helpers anyway. With that, we can also remove the special casing of fd values in io_req_needs_file(), and the 'fd_non_neg' check that we have. And we can ensure that we only read sqe->fd once. This fixes O_PATH usage with openat/openat2, and ditto statx path side oddities. Cc: stable@vger.kernel.org: # v5.6 Reported-by: Max Kellermann <mk@cm4all.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-05-07 14:56:15 -06:00
Linus Torvalds	de268ccb42	Merge tag 'configfs-for-5.7' of git://git.infradead.org/users/hch/configfs Pull configfs fix from Christoph Hellwig: "Fix a refcount leak in configfs_rmdir (Xiyu Yang)" * tag 'configfs-for-5.7' of git://git.infradead.org/users/hch/configfs: configfs: fix config_item refcnt leak in configfs_rmdir()	2020-05-07 09:48:37 -07:00
Pavel Begunkov	90da2e3f25	splice: move f_mode checks to do_{splice,tee}() do_splice() is used by io_uring, as will be do_tee(). Move f_mode checks from sys_{splice,tee}() to do_{splice,tee}(), so they're enforced for io_uring as well. Fixes: `7d67af2c01` ("io_uring: add splice(2) support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-05-07 09:45:07 -06:00
Brian Foster	c199507993	xfs: remove unused iget_flags param from xfs_imap_to_bp() iget_flags is unused in xfs_imap_to_bp(). Remove the parameter and fix up the callers. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:49 -07:00
Brian Foster	28d8462079	xfs: remove unused shutdown types Both types control shutdown messaging and neither is used in the current codebase. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:48 -07:00
Brian Foster	7376d74547	xfs: random buffer write failure errortag Introduce an error tag to randomly fail async buffer writes. This is primarily to facilitate testing of the XFS error configuration mechanism. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:48 -07:00
Brian Foster	88fc187984	xfs: remove unused iflush stale parameter The stale parameter was used to control the now unused shutdown parameter of xfs_trans_ail_remove(). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:48 -07:00
Brian Foster	2b3cf09356	xfs: combine xfs_trans_ail_[remove\|delete]() Now that the functions and callers of xfs_trans_ail_[remove\|delete]() have been fixed up appropriately, the only difference between the two is the shutdown behavior. There are only a few callers of the _remove() variant, so make the shutdown conditional on the parameter and combine the two functions. Suggested-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:48 -07:00
Brian Foster	6af0479d8b	xfs: drop unused shutdown parameter from xfs_trans_ail_remove() The shutdown parameter of xfs_trans_ail_remove() is no longer used. The remaining callers use it for items that legitimately might not be in the AIL or from contexts where AIL state has already been checked. Remove the unnecessary parameter and fix up the callers. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:47 -07:00
Brian Foster	655879290c	xfs: use delete helper for items expected to be in AIL Various intent log items call xfs_trans_ail_remove() with a log I/O error shutdown type, but this helper historically checks whether an item is in the AIL before calling xfs_trans_ail_delete(). This means the shutdown check is essentially a no-op for users of xfs_trans_ail_remove(). It is possible that some items might not be AIL resident when the AIL remove attempt occurs, but this should be isolated to cases where the filesystem has already shutdown. For example, this includes abort of the transaction committing the intent and I/O error of the iclog buffer committing the intent to the log. Therefore, update these callsites to use xfs_trans_ail_delete() to provide AIL state validation for the common path of items being released and removed when associated done items commit to the physical log. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:47 -07:00
Brian Foster	849274c103	xfs: acquire ->ail_lock from xfs_trans_ail_delete() Several callers acquire the lock just prior to the call. Callers that require ->ail_lock for other purposes already check IN_AIL state and thus don't require the additional shutdown check in the helper. Push the lock down into xfs_trans_ail_delete(), open code the instances that still acquire it, and remove the unnecessary ailp parameter. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:47 -07:00
Brian Foster	b707fffda6	xfs: abort consistently on dquot flush failure The dquot flush handler effectively aborts the dquot flush if the filesystem is already shut down, but doesn't actually shut down if the flush fails. Update xfs_qm_dqflush() to consistently abort the dquot flush and shutdown the fs if the flush fails with an unexpected error. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:47 -07:00
Brian Foster	629dcb38dc	xfs: fix duplicate verification from xfs_qm_dqflush() The pre-flush dquot verification in xfs_qm_dqflush() duplicates the read verifier by checking the dquot in the on-disk buffer. Instead, verify the in-core variant before it is flushed to the buffer. Fixes: `7224fa482a` ("xfs: add full xfs_dqblk verifier") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:47 -07:00
Brian Foster	61948b6fb2	xfs: ratelimit unmount time per-buffer I/O error alert At unmount time, XFS emits an alert for every in-core buffer that might have undergone a write error. In practice this behavior is probably reasonable given that the filesystem is likely short lived once I/O errors begin to occur consistently. Under certain test or otherwise expected error conditions, this can spam the logs and slow down the unmount. Now that we have a ratelimit mechanism specifically for buffer alerts, reuse it for the per-buffer alerts in xfs_wait_buftarg(). Also lift the final repair message out of the loop so it always prints and assert that the metadata error handling code has shut down the fs. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:46 -07:00
Brian Foster	f9bccfcc3b	xfs: refactor ratelimited buffer error messages into helper XFS has some inconsistent log message rate limiting with respect to buffer alerts. The metadata I/O error notification uses the generic ratelimited alert, the buffer push code uses a custom rate limit and the similar quiesce time failure checks are not rate limited at all (when they should be). The custom rate limit defined in the buf item code is specifically crafted for buffer alerts. It is more aggressive than generic rate limiting code because it must accommodate a high frequency of I/O error events in a relative short timeframe. Factor out the custom rate limit state from the buf item code into a per-buftarg rate limit so various alerts are limited based on the target. Define a buffer alert helper function and use it for the buffer alerts that are already ratelimited. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:46 -07:00
Brian Foster	b6983e80b0	xfs: reset buffer write failure state on successful completion The buffer write failure flag is intended to control the internal write retry that XFS has historically implemented to help mitigate the severity of transient I/O errors. The flag is set when a buffer is resubmitted from the I/O completion path due to a previous failure. It is checked on subsequent I/O completions to skip the internal retry and fall through to the higher level configurable error handling mechanism. The flag is cleared in the synchronous and delwri submission paths and also checked in various places to log write failure messages. There are a couple minor problems with the current usage of this flag. One is that we issue an internal retry after every submission from xfsaild due to how delwri submission clears the flag. This results in double the expected or configured number of write attempts when under sustained failures. Another more subtle issue is that the flag is never cleared on successful I/O completion. This can cause xfs_wait_buftarg() to suggest that dirty buffers are being thrown away due to the existence of the flag, when the reality is that the flag might still be set because the write succeeded on the retry. Clear the write failure flag on successful I/O completion to address both of these problems. This means that the internal retry attempt occurs once since the last time a buffer write failed and that various other contexts only see the flag set when the immediately previous write attempt has failed. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:46 -07:00
Brian Foster	15fab3b9be	xfs: remove unnecessary shutdown check from xfs_iflush() The shutdown check in xfs_iflush() duplicates checks down in the buffer code. If the fs is shut down, xfs_trans_read_buf_map() always returns an error and falls into the same error path. Remove the unnecessary check along with the warning in xfs_imap_to_bp() that generates excessive noise in the log if the fs is shut down. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:46 -07:00
Brian Foster	f20192991d	xfs: simplify inode flush error handling The inode flush code has several layers of error handling between the inode and cluster flushing code. If the inode flush fails before acquiring the backing buffer, the inode flush is aborted. If the cluster flush fails, the current inode flush is aborted and the cluster buffer is failed to handle the initial inode and any others that might have been attached before the error. Since xfs_iflush() is the only caller of xfs_iflush_cluster(), the error handling between the two can be condensed in the top-level function. If we update xfs_iflush_int() to always fall through to the log item update and attach the item completion handler to the buffer, any errors that occur after the first call to xfs_iflush_int() can be handled with a buffer I/O failure. Lift the error handling from xfs_iflush_cluster() into xfs_iflush() and consolidate with the existing error handling. This also replaces the need to release the buffer because failing the buffer with XBF_ASYNC drops the current reference. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:45 -07:00
Brian Foster	54b3b1f619	xfs: factor out buffer I/O failure code We use the same buffer I/O failure code in a few different places. It's not much code, but it's not necessarily self-explanatory. Factor it into a helper and document it in one place. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:45 -07:00
Brian Foster	cb6ad0993e	xfs: refactor failed buffer resubmission into xfsaild Flush locked log items whose underlying buffers fail metadata writeback are tagged with a special flag to indicate that the flush lock is already held. This is currently implemented in the type specific ->iop_push() callback, but the processing required for such items is not type specific because we're only doing basic state management on the underlying buffer. Factor the failed log item handling out of the inode and dquot ->iop_push() callbacks and open code the buffer resubmit helper into a single helper called from xfsaild_push_item(). This provides a generic mechanism for handling failed metadata buffer writeback with a bit less code. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-05-07 08:27:45 -07:00
Kaixu Xia	781c036b67	ext4: remove unnecessary test_opt for DIOREAD_NOLOCK The DIOREAD_NOLOCK flag has been cleared when doing the test_opt that is meaningless, so remove the unnecessary test_opt for DIOREAD_NOLOCK. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Link: https://lore.kernel.org/r/1586751862-19437-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-05-07 11:00:47 -04:00
Christoph Hellwig	156c757372	vboxsf: don't use the source name in the bdi name Simplify the bdi name to mirror what we are doing elsewhere, and drop them name in favor of just using a number. This avoids a potentially very long bdi name. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-05-07 08:45:47 -06:00
David S. Miller	3793faad7b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts were all overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-06 22:10:13 -07:00
Jens Axboe	12aceb89b0	eventfd: convert to f_op->read_iter() eventfd is using ->read() as it's file_operations read handler, but this prevents passing in information about whether a given IO operation is blocking or not. We can only use the file flags for that. To support async (-EAGAIN/poll based) retries for io_uring, we need ->read_iter() support. Convert eventfd to using ->read_iter(). With ->read_iter(), we can support IOCB_NOWAIT. Ensure the fd setup is done such that we set file->f_mode with FMODE_NOWAIT. [missing include added] Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-05-06 22:33:43 -04:00
Darrick J. Wong	8bc3b5e4b7	xfs: clean up the error handling in xfs_swap_extents Make sure we release resources properly if we cannot clean out the COW extents in preparation for an extent swap. Fixes: `96987eea53` ("xfs: cancel COW blocks before swapext") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-05-06 13:17:21 -07:00
J. Bruce Fields	c2d715a1af	nfsd: handle repeated BIND_CONN_TO_SESSION If the client attempts BIND_CONN_TO_SESSION on an already bound connection, it should be either a no-op or an error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2020-05-06 15:59:23 -04:00
Achilles Gaikwad	580da465a0	nfsd4: add filename to states output Add filename to states output for ease of debugging. Signed-off-by: Achilles Gaikwad <agaikwad@redhat.com> Signed-off-by: Kenneth Dsouza <kdsouza@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2020-05-06 15:59:23 -04:00

... 17 18 19 20 21 ...

65089 Commits