linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-18 19:40:46 +09:00

Author	SHA1	Message	Date
Trond Myklebust	d911c57a19	NFSv4/pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid() Make sure to test the stateid for validity so that we catch instances where the server may have been reusing stateids in nfs_layout_find_inode_by_stateid(). Fixes: `7b410d9ce4` ("pNFS: Delay getting the layout header in CB_LAYOUTRECALL handlers") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:29 -04:00
Trond Myklebust	194a0dc8e2	pNFS/flexfiles: Report DELAY and GRACE errors from the DS to the server Ensure that if the DS is returning too many DELAY and GRACE errors, we also report that to the MDS through the layouterror mechanism. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:29 -04:00
Trond Myklebust	a8b373eefc	NFS: Limit the size of the access cache by default Currently, we have no real limit on the access cache size (we set it to ULONG_MAX). That can lead to credentials getting pinned for a very long time on lots of files if you have a system with a lot of memory. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:29 -04:00
Trond Myklebust	49cd32543f	NFS: Avoid referencing the cred twice in async rename/unlink In both async rename and rename, we take a reference to the cred in the call arguments. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:29 -04:00
Trond Myklebust	63ec2b69e9	NFSv4: Avoid unnecessary credential references in layoutget Layoutget is just using the credential attached to the open context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:29 -04:00
Trond Myklebust	6129650720	NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/O Avoid unnecessary references to the cred when we have already referenced it through the open context or the open owner. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:29 -04:00
Trond Myklebust	542b994bdb	NFS: Assume cred is pinned by open context in I/O requests In read/write/commit, we should be able to assume that the cred is pinned by the open context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:29 -04:00
Trond Myklebust	1d179d6bd6	NFS: alloc_nfs_open_context() must use the file cred when available If we're creating a nfs_open_context() for a specific file pointer, we must use the cred assigned to that file. Fixes: `a52458b48a` ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:28 -04:00
Trond Myklebust	244fcd2f9a	NFS: Ensure we time out if a delegreturn does not complete We can't allow delegreturn to hold up nfs4_evict_inode() forever, since that can cause the memory shrinkers to block. This patch therefore ensures that we eventually time out, and complete the reclaim of the inode. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:28 -04:00
Trond Myklebust	59b5639490	NFSv4/pnfs: pnfs_set_layout_stateid() should update the layout cred If the cred assigned to the layout that we're updating differs from the one used to retrieve the new layout segment, then we need to update the layout plh_lc_cred field. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:28 -04:00
Trond Myklebust	57f188e047	NFSv4: nfs_update_inplace_delegation() should update delegation cred If the cred assigned to the delegation that we're updating differs from the one we're updating too, then we need to update that field too. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:28 -04:00
Trond Myklebust	59e356a967	NFS: Use the 64-bit server readdir cookies when possible When we're running as a 64-bit architecture and are not running in 32-bit compatibility mode, it is better to use the 64-bit readdir cookies that supplied by the server. Doing so improves the accuracy of telldir()/seekdir(), particularly when the directory is changing, for instance, when doing 'rm -rf'. We still fall back to using the 32-bit offsets on 32-bit architectures and when in compatibility mode. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-03-16 08:34:28 -04:00
Linus Torvalds	34d5a4b336	Merge tag 'locking-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex fix from Thomas Gleixner: "Fix for yet another subtle futex issue. The futex code used ihold() to prevent inodes from vanishing, but ihold() does not guarantee inode persistence. Replace the inode pointer with a per boot, machine wide, unique inode identifier. The second commit fixes the breakage of the hash mechanism which causes a 100% performance regression" * tag 'locking-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Unbreak futex hashing futex: Fix inode life-time issue	2020-03-15 12:55:52 -07:00
Darrick J. Wong	faf8ee8476	xfs: xfs_dabuf_map should return ENOMEM when map allocation fails If the xfs_buf_map array allocation in xfs_dabuf_map fails for whatever reason, we bail out with error code zero. This will confuse callers, so make sure that we return ENOMEM. Allocation failure should never happen with the small size of the array, but code defensively anyway. Fixes: `45feef8f50` ("xfs: refactor xfs_dabuf_map") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>	2020-03-15 09:22:35 -07:00
Pavel Begunkov	60cf46ae60	io-wq: hash dependent work Enable io-wq hashing stuff for dependent works simply by re-enqueueing such requests. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-03-14 17:02:30 -06:00
Pavel Begunkov	8766dd516c	io-wq: split hashing and enqueueing It's a preparation patch removing io_wq_enqueue_hashed(), which now should be done by io_wq_hash_work() + io_wq_enqueue(). Also, set hash value for dependant works, and do it as late as possible, because req->file can be unavailable before. This hash will be ignored by io-wq. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-03-14 17:02:28 -06:00
Pavel Begunkov	d78298e73a	io-wq: don't resched if there is no work This little tweak restores the behaviour that was before the recent io_worker_handle_work() optimisation patches. It makes the function do cond_resched() and flush_signals() only if there is an actual work to execute. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-03-14 17:02:26 -06:00
Pavel Begunkov	f1d96a8fcb	io_uring: NULL-deref for IOSQE_{ASYNC,DRAIN} Processing links, io_submit_sqe() prepares requests, drops sqes, and passes them with sqe=NULL to io_queue_sqe(). There IOSQE_DRAIN and/or IOSQE_ASYNC requests will go through the same prep, which doesn't expect sqe=NULL and fail with NULL pointer deference. Always do full prepare including io_alloc_async_ctx() for linked requests, and then it can skip the second preparation. Cc: stable@vger.kernel.org # 5.5 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-03-14 16:57:41 -06:00
Eric Whitney	2971148d0f	ext4: remove map_from_cluster from ext4_ext_map_blocks We can use the variable allocated_clusters rather than map_from_clusters to control reserved block/cluster accounting in ext4_ext_map_blocks. This eliminates a variable and associated code and improves readability a little. Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200311205125.25061-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:14 -04:00
Eric Whitney	3499046134	ext4: clean up ext4_ext_insert_extent() call in ext4_ext_map_blocks() Now that the eofblocks code has been removed, we don't need to assign 0 to err before calling ext4_ext_insert_extent() since it will assign a return value to ret anyway. The variable free_on_err can be eliminated and replaced by a reference to allocated_clusters which clearly conveys the idea that newly allocated blocks should be freed when recovering from an extent insertion failure. The error handling code itself should be restructured so that it errors out immediately on an insertion failure in the case where no new blocks have been allocated (bigalloc) rather than proceeding further into the mapping code. The initializer for fb_flags can also be rearranged for improved readability. Finally, insert a missing space in nearby code. No known bugs are addressed by this patch - it's simply a cleanup. Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200311205033.25013-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:13 -04:00
Dmitry Monakhov	eb5760863f	ext4: mark block bitmap corrupted when found instead of BUGON We already has similar code in ext4_mb_complex_scan_group(), but ext4_mb_simple_scan_group() still affected. Other reports: https://www.spinics.net/lists/linux-ext4/msg60231.html Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com> Link: https://lore.kernel.org/r/20200310150156.641-1-dmonakhov@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:13 -04:00
Gustavo A. R. Silva	47b1030612	ext4: use flexible-array member for xattr structs The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200309180813.GA3347@embeddedor Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:13 -04:00
Gustavo A. R. Silva	e32ac2459c	ext4: use flexible-array member in struct fname The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200309154838.GA31559@embeddedor Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:13 -04:00
Ritesh Harjani	d3b6f23f71	ext4: move ext4_fiemap to use iomap framework This patch moves ext4_fiemap to use iomap framework. For xattr a new 'ext4_iomap_xattr_ops' is added. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/b9f45c885814fcdd0631747ff0fe08886270828c.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:13 -04:00
Ritesh Harjani	b2c5764262	ext4: make ext4_ind_map_blocks work with fiemap For indirect block mapping if the i_block > max supported block in inode then ext4_ind_map_blocks() returns a -EIO error. But in case of fiemap this could be a valid query to ->iomap_begin call. So check if the offset >= s_bitmap_maxbytes in ext4_iomap_begin_report(), then simply skip calling ext4_map_blocks(). Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/87fa0ddc5967fa707656212a3b66a7233425325c.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:12 -04:00
Ritesh Harjani	ac58e4fb03	ext4: move ext4 bmap to use iomap infrastructure ext4_iomap_begin is already implemented which provides ext4_map_blocks, so just move the API from generic_block_bmap to iomap_bmap for iomap conversion. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/8bbd53bd719d5ccfecafcce93f2bf1d7955a44af.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:12 -04:00
Ritesh Harjani	2f424a5a09	ext4: optimize ext4_ext_precache for 0 depth This patch avoids the memory alloc & free path when depth is 0, since anyway there is no extra caching done in that case. So on checking depth 0, simply return early. Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/93da0d0f073c73358e85bb9849d8a5378d1da539.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:12 -04:00
Ritesh Harjani	6386722a32	ext4: add IOMAP_F_MERGED for non-extent based mapping IOMAP_F_MERGED needs to be set in case of non-extent based mapping. This is needed in later patches for conversion of ext4_fiemap to use iomap. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/a4764c91c08c16d4d4a4b36defb2a08625b0e9b3.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-03-14 14:43:12 -04:00
David S. Miller	44ef976ab3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2020-03-13 The following pull-request contains BPF updates for your net-next tree. We've added 86 non-merge commits during the last 12 day(s) which contain a total of 107 files changed, 5771 insertions(+), 1700 deletions(-). The main changes are: 1) Add modify_return attach type which allows to attach to a function via BPF trampoline and is run after the fentry and before the fexit programs and can pass a return code to the original caller, from KP Singh. 2) Generalize BPF's kallsyms handling and add BPF trampoline and dispatcher objects to be visible in /proc/kallsyms so they can be annotated in stack traces, from Jiri Olsa. 3) Extend BPF sockmap to allow for UDP next to existing TCP support in order in order to enable this for BPF based socket dispatch, from Lorenz Bauer. 4) Introduce a new bpftool 'prog profile' command which attaches to existing BPF programs via fentry and fexit hooks and reads out hardware counters during that period, from Song Liu. Example usage: bpftool prog profile id 337 duration 3 cycles instructions llc_misses 4228 run_cnt 3403698 cycles (84.08%) 3525294 instructions # 1.04 insn per cycle (84.05%) 13 llc_misses # 3.69 LLC misses per million isns (83.50%) 5) Batch of improvements to libbpf, bpftool and BPF selftests. Also addition of a new bpf_link abstraction to keep in particular BPF tracing programs attached even when the applicaion owning them exits, from Andrii Nakryiko. 6) New bpf_get_current_pid_tgid() helper for tracing to perform PID filtering and which returns the PID as seen by the init namespace, from Carlos Neira. 7) Refactor of RISC-V JIT code to move out common pieces and addition of a new RV32G BPF JIT compiler, from Luke Nelson. 8) Add gso_size context member to __sk_buff in order to be able to know whether a given skb is GSO or not, from Willem de Bruijn. 9) Add a new bpf_xdp_output() helper which reuses XDP's existing perf RB output implementation but can be called from tracepoint programs, from Eelco Chaudron. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-13 20:52:03 -07:00
Al Viro	6dfd9fe54d	follow_dotdot{,_rcu}(): switch to use of step_into() gets the regular mount crossing on result of .. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	7521f22b3c	handle_dots(), follow_dotdot{,_rcu}(): preparation to switch to step_into() Right now the tail ends of follow_dotdot{,_rcu}() are pretty much the open-coded analogues of step_into(). The differences: * the lack of proper LOOKUP_NO_XDEV handling in non-RCU case (arguably a bug) * the lack of ->d_manage() handling (again, arguably a bug) Adjust the calling conventions so that on the next step with could just switch those functions to returning step_into(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	957dd41d88	move handle_dots(), follow_dotdot() and follow_dotdot_rcu() past step_into() pure move; we are going to have step_into() called by that bunch. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	c9a0f75d81	follow_dotdot{,_rcu}(): lift LOOKUP_BENEATH checks out of loop Behaviour change: LOOKUP_BENEATH lookup of .. in absolute root yields an error even if it's not the process' root. That's possible only if you'd managed to escape chroot jail by way of procfs symlinks, but IMO the resulting behaviour is not worse - more consistent and easier to describe: ".." in root is "stay where you are", uness LOOKUP_BENEATH has been given, in which case it's "fail with EXDEV". Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	abc2c632e0	follow_dotdot{,_rcu}(): lift switching nd->path to parent out of loop Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	a6a7eb7628	expand path_parent_directory() in its callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	63b27720a4	path_parent_directory(): leave changing path->dentry to callers Instead of returning 0, return new dentry; instead of returning -ENOENT, return NULL. Adjust the callers accordingly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	6b03f7edf4	path_connected(): pass mount and dentry separately eventually we'll want to do that check before mangling nd->path.dentry... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	c981a48281	split the lookup-related parts of do_last() into a separate helper Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	973d4b73fb	do_last(): rejoin the common path even earlier in FMODE_{OPENED,CREATED} case ... getting may_create_in_sticky() checks in FMODE_OPENED case as well. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	8795e7d482	do_last(): simplify the liveness analysis past finish_open_created Don't mess with got_write there - it is guaranteed to be false on entry and it will be set true if and only if we decide to go for truncation and manage to get write access for that. Don't carry acc_mode through the entire thing - it's only used in that part. And don't bother with gotos in there - compiler is quite capable of optimizing that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:13 -04:00
Al Viro	5a2d3edd8d	do_last(): rejoing the common path earlier in FMODE_{OPENED,CREATED} case Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	59e96e6583	do_last(): don't bother with keeping got_write in FMODE_OPENED case it's easier to drop it right after lookup_open() and regain if needed (i.e. if we will need to truncate). On the non-FMODE_OPENED path we do that anyway. In case of FMODE_CREATED we won't be needing it. And it's easier to prove correctness that way, especially since the initial failure to get write access is not always fatal; proving that we'll never end up truncating in that case is rather convoluted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	3ad5615a07	do_last(): merge the may_open() calls have FMODE_OPENED case rejoin the main path at earlier point Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	7be219b4dc	atomic_open(): lift the call of may_open() into do_last() there we'll be able to merge it with its counterparts in other cases, and there's no reason to do it before the parent has been unlocked Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	6fb968cdf9	atomic_open(): return the right dentry in FMODE_OPENED case ->atomic_open() might have used a different alias than the one we'd passed to it; in "not opened" case we take care of that, in "opened" one we don't. Currently we don't care downstream of "opened" case which alias to return; however, that will change shortly when we get to unifying may_open() calls. It's not hard to get right in all cases, anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	9deed3ebca	new helper: traverse_mounts() common guts of follow_down() and follow_managed() taken to a new helper - traverse_mounts(). The remnants of follow_managed() are folded into its sole remaining caller (handle_mounts()). Calling conventions of handle_mounts() slightly sanitized - instead of the weird "1 for success, -E... for failure" that used to be imposed by the calling conventions of walk_component() et.al. we can use the normal "0 for success, -E... for failure". Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	ea936aeb3e	massage __follow_mount_rcu() a bit make the loop more similar to that in follow_managed(), with explicit tracking of flags, etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	c108837e06	namei: have link_path_walk() maintain LOOKUP_PARENT set on entry, clear when we get to the last component. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	d8d4611a4f	link_path_walk(): simplify stack handling We use nd->stack to store two things: pinning down the symlinks we are resolving and resuming the name traversal when a nested symlink is finished. Currently, nd->depth is used to keep track of both. It's 0 when we call link_path_walk() for the first time (for the pathname itself) and 1 on all subsequent calls (for trailing symlinks, if any). That's fine, as far as pinning symlinks goes - when handling a trailing symlink, the string we are interpreting is the body of symlink pinned down in nd->stack[0]. It's rather inconvenient with respect to handling nested symlinks, though - when we run out of a string we are currently interpreting, we need to decide whether it's a nested symlink (in which case we need to pick the string saved back when we started to interpret that nested symlink and resume its traversal) or not (in which case we are done with link_path_walk()). Current solution is a bit of a kludge - in handling of trailing symlink (in lookup_last() and open_last_lookups() we clear nd->stack[0].name. That allows link_path_walk() to use the following rules when running out of a string to interpret: * if nd->depth is zero, we are at the end of pathname itself. * if nd->depth is positive, check the saved string; for nested symlink it will be non-NULL, for trailing symlink - NULL. It works, but it's rather non-obvious. Note that we have two sets: the set of symlinks currently being traversed and the set of postponed pathname tails. The former is stored in nd->stack[0..nd->depth-1].link and it's valid throught the pathname resolution; the latter is valid only during an individual call of link_path_walk() and it occupies nd->stack[0..nd->depth-1].name for the first call of link_path_walk() and nd->stack[1..nd->depth-1].name for subsequent ones. The kludge is basically a way to recognize the second set becoming empty. The things get simpler if we keep track of the second set's size explicitly and always store it in nd->stack[0..depth-1].name. We access the second set only inside link_path_walk(), so its size can live in a local variable; that way the check becomes trivial without the need of that kludge. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00
Al Viro	b1a8197240	pick_link(): check for WALK_TRAILING, not LOOKUP_PARENT Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-13 21:09:12 -04:00

... 37 38 39 40 41 ...

65089 Commits