linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-10 21:07:02 +09:00

Author	SHA1	Message	Date
Sasha Levin	7a104fcedf	SUNRPC: Prevent kernel stack corruption on long values of flush commit `212ba90696` upstream. The buffer size in read_flush() is too small for the longest possible values for it. This can lead to a kernel stack corruption: [ 43.047329] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff833e64b4 [ 43.047329] [ 43.049030] Pid: 6015, comm: trinity-child18 Tainted: G W 3.5.0-rc7-next-20120716-sasha #221 [ 43.050038] Call Trace: [ 43.050435] [<ffffffff836c60c2>] panic+0xcd/0x1f4 [ 43.050931] [<ffffffff833e64b4>] ? read_flush.isra.7+0xe4/0x100 [ 43.051602] [<ffffffff810e94e6>] __stack_chk_fail+0x16/0x20 [ 43.052206] [<ffffffff833e64b4>] read_flush.isra.7+0xe4/0x100 [ 43.052951] [<ffffffff833e6500>] ? read_flush_pipefs+0x30/0x30 [ 43.053594] [<ffffffff833e652c>] read_flush_procfs+0x2c/0x30 [ 43.053596] [<ffffffff812b9a8c>] proc_reg_read+0x9c/0xd0 [ 43.053596] [<ffffffff812b99f0>] ? proc_reg_write+0xd0/0xd0 [ 43.053596] [<ffffffff81250d5b>] do_loop_readv_writev+0x4b/0x90 [ 43.053596] [<ffffffff81250fd6>] do_readv_writev+0xf6/0x1d0 [ 43.053596] [<ffffffff812510ee>] vfs_readv+0x3e/0x60 [ 43.053596] [<ffffffff812511b8>] sys_readv+0x48/0xb0 [ 43.053596] [<ffffffff8378167d>] system_call_fastpath+0x1a/0x1f Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-28 10:02:11 -07:00
Trond Myklebust	ef9fd53c07	SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT commit `a519fc7a70` upstream. Instead of doing a shutdown() call, we need to do an actual close(). Ditto if/when the server is sending us junk RPC headers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Simon Kirby <sim@hostway.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-21 09:17:10 -07:00
J. Bruce Fields	b8e52a4288	svcrpc: sends on closed socket should stop immediately commit `f06f00a24d` upstream. svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply. However, the XPT_CLOSE won't be acted on immediately. Meanwhile other threads could send further replies before the socket is really shut down. This can manifest as data corruption: for example, if a truncated read reply is followed by another rpc reply, that second reply will look to the client like further read data. Symptoms were data corruption preceded by svc_tcp_sendto logging something like kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket Reported-by: Malahal Naineni <malahal@us.ibm.com> Tested-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-14 10:00:39 -07:00
J. Bruce Fields	299ee06757	svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping commit `d10f27a750` upstream. The rpc server tries to ensure that there will be room to send a reply before it receives a request. It does this by tracking, in xpt_reserved, an upper bound on the total size of the replies that is has already committed to for the socket. Currently it is adding in the estimate for a new reply before it checks whether there is space available. If it finds that there is not space, it then subtracts the estimate back out. This may lead the subsequent svc_xprt_enqueue to decide that there is space after all. The results is a svc_recv() that will repeatedly return -EAGAIN, causing server threads to loop without doing any actual work. Reported-by: Michael Tokarev <mjt@tls.msk.ru> Tested-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-14 10:00:39 -07:00
J. Bruce Fields	e684493e06	svcrpc: fix BUG() in svc_tcp_clear_pages commit `be1e44441a` upstream. Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if sk_tcplen would lead us to expect n pages of data). svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen. This is OK, since both functions are serialized by XPT_BUSY. However, that means the inconsistency must be repaired before dropping XPT_BUSY. Therefore we should be ensuring that svc_tcp_save_pages repairs the problem before exiting svc_tcp_recv_record on error. Symptoms were a BUG() in svc_tcp_clear_pages. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-14 10:00:38 -07:00
Stanislav Kinsbursky	d90c97ba98	SUNRPC: return negative value in case rpcbind client creation error commit `caea33da89` upstream. Without this patch kernel will panic on LockD start, because lockd_up() checks lockd_up_net() result for negative value. From my pow it's better to return negative value from rpcbind routines instead of replacing all such checks like in lockd_up(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-15 12:04:09 -07:00
Jeff Layton	eb65b85e1b	nfs: skip commit in releasepage if we're freeing memory for fs-related reasons commit `5cf02d09b5` upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 #24 [ffff8810343bfee8] kthread at ffffffff8108dd96 #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-09 08:27:51 -07:00
Trond Myklebust	00c4792f75	NFSv4.1: Fix a request leak on the back channel commit `b3b02ae586` upstream. If the call to svc_process_common() fails, then the request needs to be freed before we can exit bc_svc_process. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-22 11:34:13 -07:00
Dan Carpenter	8bb8ebe7b7	nfsd: don't allow zero length strings in cache_parse() commit `6d8d174998` upstream. There is no point in passing a zero length string here and quite a few of that cache_parse() implementations will Oops if count is zero. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-04-02 09:27:21 -07:00
Trond Myklebust	77d77ab09b	SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up() commit `540a0f7584` upstream. The problem is that for the case of priority queues, we have to assume that __rpc_remove_wait_queue_priority will move new elements from the tk_wait.links lists into the queue->tasks[] list. We therefore cannot use list_for_each_entry_safe() on queue->tasks[], since that will skip these new tasks that __rpc_remove_wait_queue_priority is adding. Without this fix, rpc_wake_up and rpc_wake_up_status will both fail to wake up all functions on priority wait queues, which can result in some nasty hangs. Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-04-02 09:27:17 -07:00
J. Bruce Fields	a141a5eb3a	svcrpc: avoid memory-corruption on pool shutdown commit `b4f36f88b3` upstream. Socket callbacks use svc_xprt_enqueue() to add an xprt to a pool->sp_sockets list. In normal operation a server thread will later come along and take the xprt off that list. On shutdown, after all the threads have exited, we instead manually walk the sv_tempsocks and sv_permsocks lists to find all the xprt's and delete them. So the sp_sockets lists don't really matter any more. As a result, we've mostly just ignored them and hoped they would go away. Which has gotten us into trouble; witness for example `ebc63e531c` "svcrpc: fix list-corrupting race on nfsd shutdown", the result of Ben Greear noticing that a still-running svc_xprt_enqueue() could re-add an xprt to an sp_sockets list just before it was deleted. The fix was to remove it from the list at the end of svc_delete_xprt(). But that only made corruption less likely--I can see nothing that prevents a svc_xprt_enqueue() from adding another xprt to the list at the same moment that we're removing this xprt from the list. In fact, despite the earlier xpo_detach(), I don't even see what guarantees that svc_xprt_enqueue() couldn't still be running on this xprt. So, instead, note that svc_xprt_enqueue() essentially does: lock sp_lock if XPT_BUSY unset add to sp_sockets unlock sp_lock So, if we do: set XPT_BUSY on every xprt. Empty every sp_sockets list, under the sp_socks locks. Then we're left knowing that the sp_sockets lists are all empty and will stay that way, since any svc_xprt_enqueue() will check XPT_BUSY under the sp_lock and see it set. And then we can continue deleting the xprt's. (Thanks to Jeff Layton for being correctly suspicious of this code....) Cc: Ben Greear <greearb@candelatech.com> Cc: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-25 17:24:48 -08:00
J. Bruce Fields	7df22768c0	svcrpc: destroy server sockets all at once commit `2fefb8a09e` upstream. There's no reason I can see that we need to call sv_shutdown between closing the two lists of sockets. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-25 17:24:48 -08:00
J. Bruce Fields	b09577ca66	svcrpc: fix double-free on shutdown of nfsd after changing pool mode commit `61c8504c42` upstream. The pool_to and to_pool fields of the global svc_pool_map are freed on shutdown, but are initialized in nfsd startup only in the SVC_POOL_PERCPU and SVC_POOL_PERNODE cases. They are initialized to zero on kernel startup. So as long as you use only SVC_POOL_GLOBAL (the default), this will never be a problem. You're also OK if you only ever use SVC_POOL_PERCPU or SVC_POOL_PERNODE. However, the following sequence events leads to a double-free: 1. set SVC_POOL_PERCPU or SVC_POOL_PERNODE 2. start nfsd: both fields are initialized. 3. shutdown nfsd: both fields are freed. 4. set SVC_POOL_GLOBAL 5. start nfsd: the fields are left untouched. 6. shutdown nfsd: now we try to free them again. Step 4 is actually unnecessary, since (for some bizarre reason), nfsd automatically resets the pool mode to SVC_POOL_GLOBAL on shutdown. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-25 17:24:47 -08:00
Trond Myklebust	421f55945f	SUNRPC: Ensure we return EAGAIN in xs_nospace if congestion is cleared commit `24ca9a8477` upstream. By returning '0' instead of 'EAGAIN' when the tests in xs_nospace() fail to find evidence of socket congestion, we are making the RPC engine believe that the message was incorrectly sent and so it disconnects the socket instead of just retrying. The bug appears to have been introduced by commit `5e3771ce2d` (SUNRPC: Ensure that xs_nospace return values are propagated). Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-09 08:52:27 -08:00
NeilBrown	6fa9e3e3e0	NFS/sunrpc: don't use a credential with extra groups. commit `dc6f55e9f8` upstream. The sunrpc layer keeps a cache of recently used credentials and 'unx_match' is used to find the credential which matches the current process. However unx_match allows a match when the cached credential has extra groups at the end of uc_gids list which are not in the process group list. So if a process with a list of (say) 4 group accesses a file and gains access because of the last group in the list, then another process with the same uid and gid, and a gid list being the first tree of the gids of the original process tries to access the file, it will be granted access even though it shouldn't as the wrong rpc credential will be used. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-11-11 09:37:07 -08:00
J. Bruce Fields	83d20a07d3	svcrpc: fix list-corrupting race on nfsd shutdown commit `ebc63e531c` upstream. After commit `3262c816a3` "[PATCH] knfsd: split svc_serv into pools", svc_delete_xprt (then svc_delete_socket) no longer removed its xpt_ready (then sk_ready) field from whatever list it was on, noting that there was no point since the whole list was about to be destroyed anyway. That was mostly true, but forgot that a few svc_xprt_enqueue()'s might still be hanging around playing with the about-to-be-destroyed list, and could get themselves into trouble writing to freed memory if we left this xprt on the list after freeing it. (This is actually functionally identical to a patch made first by Ben Greear, but with more comments.) Cc: gnb@fmeh.org Reported-by: Ben Greear <greearb@candelatech.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-08-04 21:58:40 -07:00
Ben Greear	ec0dd267bf	SUNRPC: Fix use of static variable in rpcb_getport_async Because struct rpcbind_args *map was declared static, if two threads entered this method at the same time, the values assigned to map could be sent two two differen tasks. This could cause all sorts of problems, include use-after-free and double-free of memory. Fix this by removing the static declaration so that the map pointer is on the stack. Signed-off-by: Ben Greear <greearb@candelatech.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-07-12 13:40:13 -04:00
Trond Myklebust	b55c59892e	SUNRPC: Fix a race between work-queue and rpc_killall_tasks Since rpc_killall_tasks may modify the rpc_task's tk_action field without any locking, we need to be careful when dereferencing it. Reported-by: Ben Greear <greearb@candelatech.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2011-07-07 20:45:37 -04:00
Linus Torvalds	2992c4bd57	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix decode_secinfo_maxsz NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test NFSv4.1: Fix some issues with pnfs_generic_pg_test NFSv4.1: file layout must consider pg_bsize for coalescing pnfs-obj: No longer needed to take an extra ref at add_device SUNRPC: Ensure the RPC client only quits on fatal signals NFSv4: Fix a readdir regression nfs4.1: mark layout as bad on error path in _pnfs_return_layout nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path NFS: (d)printks should use %zd for ssize_t arguments NFSv4.1: fix break condition in pnfs_find_lseg nfs4.1: fix several problems with _pnfs_return_layout NFSv4.1: allow zero fh array in filelayout decode layout NFSv4.1: allow nfs_fhget to succeed with mounted on fileid NFSv4.1: Fix a refcounting issue in the pNFS device id cache NFSv4.1: deprecate headerpadsz in CREATE_SESSION NFS41: do not update isize if inode needs layoutcommit NLM: Don't hang forever on NLM unlock requests NFS: fix umount of pnfs filesystems	2011-06-21 18:20:55 -07:00
Trond Myklebust	5afa9133cf	SUNRPC: Ensure the RPC client only quits on fatal signals Fix a couple of instances where we were exiting the RPC client on arbitrary signals. We should only do so on fatal signals. Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-06-17 10:17:19 -04:00
Trond Myklebust	0b760113a3	NLM: Don't hang forever on NLM unlock requests If the NLM daemon is killed on the NFS server, we can currently end up hanging forever on an 'unlock' request, instead of aborting. Basically, if the rpcbind request fails, or the server keeps returning garbage, we really want to quit instead of retrying. Tested-by: Vasily Averin <vvs@sw.ru> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2011-06-15 11:24:27 -04:00
J. Bruce Fields	b084f598df	nfsd: fix dependency of nfsd on auth_rpcgss Commit `b0b0c0a26e` "nfsd: add proc file listing kernel's gss_krb5 enctypes" added an nunnecessary dependency of nfsd on the auth_rpcgss module. It's a little ad hoc, but since the only piece of information nfsd needs from rpcsec_gss_krb5 is a single static string, one solution is just to share it with an include file. Cc: stable@kernel.org Reported-by: Michael Guntsche <mike@it-loops.com> Cc: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-06-06 15:07:15 -04:00
Linus Torvalds	cd1acdf172	Merge branch 'pnfs-submit' of git://git.open-osd.org/linux-open-osd * 'pnfs-submit' of git://git.open-osd.org/linux-open-osd: (32 commits) pnfs-obj: pg_test check for max_io_size NFSv4.1: define nfs_generic_pg_test NFSv4.1: use pnfs_generic_pg_test directly by layout driver NFSv4.1: change pg_test return type to bool NFSv4.1: unify pnfs_pageio_init functions pnfs-obj: objlayout_encode_layoutcommit implementation pnfs: encode_layoutcommit pnfs-obj: report errors and .encode_layoutreturn Implementation. pnfs: encode_layoutreturn pnfs: layoutret_on_setattr pnfs: layoutreturn pnfs-obj: osd raid engine read/write implementation pnfs: support for non-rpc layout drivers pnfs-obj: define per-inode private structure pnfs: alloc and free layout_hdr layoutdriver methods pnfs-obj: objio_osd device information retrieval and caching pnfs-obj: decode layout, alloc/free lseg pnfs-obj: pnfs_osd XDR client implementation pnfs-obj: pnfs_osd XDR definitions pnfs-obj: objlayoutdriver module skeleton ...	2011-05-29 14:10:13 -07:00
Linus Torvalds	a74d70b63f	Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux * 'for-2.6.40' of git://linux-nfs.org/~bfields/linux: (22 commits) nfsd: make local functions static NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session() NFSD: Check status from nfsd4_map_bcts_dir() NFSD: Remove setting unused variable in nfsd_vfs_read() nfsd41: error out on repeated RECLAIM_COMPLETE nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence nfsd v4.1 lOCKT clientid field must be ignored nfsd41: add flag checking for create_session nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly nfsd4: fix wrongsec handling for PUTFH + op cases nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller nfsd4: introduce OPDESC helper nfsd4: allow fh_verify caller to skip pseudoflavor checks nfsd: distinguish functions of NFSD_MAY_* flags svcrpc: complete svsk processing on cb receive failure svcrpc: take advantage of tcp autotuning SUNRPC: Don't wait for full record to receive tcp data svcrpc: copy cb reply instead of pages svcrpc: close connection if client sends short packet svcrpc: note network-order types in svc_process_calldir ...	2011-05-29 11:21:12 -07:00
Linus Torvalds	f1d1c9fa8f	Merge branch 'nfs-for-2.6.40' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.40' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: SUNRPC: Support for RPC over AF_LOCAL transports SUNRPC: Remove obsolete comment SUNRPC: Use AF_LOCAL for rpcbind upcalls SUNRPC: Clean up use of curly braces in switch cases NFS: Revert NFSROOT default mount options SUNRPC: Rename xs_encode_tcp_fragment_header() nfs,rcu: convert call_rcu(nfs_free_delegation_callback) to kfree_rcu() nfs41: Correct offset for LAYOUTCOMMIT NFS: nfs_update_inode: print current and new inode size in debug output NFSv4.1: Fix the handling of NFS4ERR_SEQ_MISORDERED errors NFSv4: Handle expired stateids when the lease is still valid SUNRPC: Deal with the lack of a SYN_SENT sk->sk_state_change callback...	2011-05-29 11:20:02 -07:00
Benny Halevy	f7da7a129d	SUNRPC: introduce xdr_init_decode_pages Initialize xdr_stream and xdr_buf using an array of page pointers and length of buffer. Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2011-05-29 20:52:32 +03:00
Chuck Lever	176e21ee2e	SUNRPC: Support for RPC over AF_LOCAL transports TI-RPC introduces the capability of performing RPC over AF_LOCAL sockets. It uses this mainly for registering and unregistering local RPC services securely with the local rpcbind, but we could also conceivably use it as a generic upcall mechanism. This patch provides a client-side only implementation for the moment. We might also consider a server-side implementation to provide AF_LOCAL access to NLM (for statd downcalls, and such like). Autobinding is not supported on kernel AF_LOCAL transports at this time. Kernel ULPs must specify the pathname of the remote endpoint when an AF_LOCAL transport is created. rpcbind supports registering services available via AF_LOCAL, so the kernel could handle it with some adjustment to ->rpcbind and ->set_port. But we don't need this feature for doing upcalls via well-known named sockets. This has not been tested with ULPs that move a substantial amount of data. Thus, I can't attest to how robust the write_space and congestion management logic is. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-05-27 17:42:47 -04:00
Chuck Lever	559649efb9	SUNRPC: Remove obsolete comment Clean up. The documenting comment at the top of net/sunrpc/clnt.c is out of date. We adopted BSD's RTO estimation mechanism years ago. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-05-27 17:42:47 -04:00
Chuck Lever	7402ab19cd	SUNRPC: Use AF_LOCAL for rpcbind upcalls As libtirpc does in user space, have our registration API try using an AF_LOCAL transport first when registering and unregistering. This means we don't chew up privileged ports, and our registration is bound to an "owner" (the effective uid of the process on the sending end of the transport). Only that "owner" may unregister the service. The kernel could probe rpcbind via an rpcbind query to determine whether rpcbind has an AF_LOCAL service. For simplicity, we use the same technique that libtirpc uses: simply fail over to network loopback if creating an AF_LOCAL transport to the well-known rpcbind service socket fails. This means we open-code the pathname of the rpcbind socket in the kernel. For now we have to do that anyway because the kernel's RPC over AF_LOCAL implementation does not support autobind. That may be undesirable in the long term. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-05-27 17:42:47 -04:00
Chuck Lever	da09eb9303	SUNRPC: Clean up use of curly braces in switch cases Clean up. Preferred style is not to use curly braces around switch cases. I'm about to add another case that needs a third type cast. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-05-27 17:42:47 -04:00
Chuck Lever	61677eeec2	SUNRPC: Rename xs_encode_tcp_fragment_header() Clean up: Use a more generic name for xs_encode_tcp_fragment_header(); it's appropriate to use for all stream transport types. We're about to add new stream transport. Also, move it to a place where it is more easily shared amongst the various send_request methods. And finally, replace the "htonl" macro invocation with its modern equivalent. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-05-27 17:42:47 -04:00
Trond Myklebust	fe19a96b10	SUNRPC: Deal with the lack of a SYN_SENT sk->sk_state_change callback... The TCP connection state code depends on the state_change() callback being called when the SYN_SENT state is set. However the networking layer doesn't actually call us back in that case. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2011-05-27 17:42:00 -04:00
Linus Torvalds	4c171acc20	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cma: Save PID of ID's owner RDMA/cma: Add support for netlink statistics export RDMA/cma: Pass QP type into rdma_create_id() RDMA: Update exported headers list RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h> RDMA/nes: Add a check for strict_strtoul() RDMA/cxgb3: Don't post zero-byte read if endpoint is going away RDMA/cxgb4: Use completion objects for event blocking IB/srp: Fix integer -> pointer cast warnings IB: Add devnode methods to cm_class and umad_class IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required IB/uverbs: Add devnode method to set path/mode RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node RDMA: Add netlink infrastructure RDMA: Add error handling to ib_core_init()	2011-05-26 12:13:57 -07:00
Sean Hefty	b26f9b9949	RDMA/cma: Pass QP type into rdma_create_id() The RDMA CM currently infers the QP type from the port space selected by the user. In the future (eg with RDMA_PS_IB or XRC), there may not be a 1-1 correspondence between port space and QP type. For netlink export of RDMA CM state, we want to export the QP type to userspace, so it is cleaner to explicitly associate a QP type to an ID. Modify rdma_create_id() to allow the user to specify the QP type, and use it to make our selections of datagram versus connected mode. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-25 13:46:23 -07:00
Ying Han	1495f230fa	vmscan: change shrinker API by passing shrink_control struct Change each shrinker's API by consolidating the existing parameters into shrink_control struct. This will simplify any further features added w/o touching each file of shrinker. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: fix warning] [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API] [akpm@linux-foundation.org: fix xfs warning] [akpm@linux-foundation.org: update gfs2] Signed-off-by: Ying Han <yinghan@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-25 08:39:26 -07:00
Linus Torvalds	57d19e80f4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) b43: fix comment typo reqest -> request Haavard Skinnemoen has left Atmel cris: typo in mach-fs Makefile Kconfig: fix copy/paste-ism for dell-wmi-aio driver doc: timers-howto: fix a typo ("unsgined") perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course'). treewide: fix a few typos in comments regulator: change debug statement be consistent with the style of the rest Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations" audit: acquire creds selectively to reduce atomic op overhead rtlwifi: don't touch with treewide double semicolon removal treewide: cleanup continuations and remove logging message whitespace ath9k_hw: don't touch with treewide double semicolon removal include/linux/leds-regulator.h: fix syntax in example code tty: fix typo in descripton of tty_termios_encode_baud_rate xtensa: remove obsolete BKL kernel option from defconfig m68k: fix comment typo 'occcured' arch:Kconfig.locks Remove unused config option. treewide: remove extra semicolons ...	2011-05-23 09:12:26 -07:00
Justin P. Mattock	70f23fd66b	treewide: fix a few typos in comments - kenrel -> kernel - whetehr -> whether - ttt -> tt - sss -> ss Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-05-10 10:16:21 +02:00
Jiri Kosina	07f9479a40	Merge branch 'master' into for-next Fast-forwarded to current state of Linus' tree as there are patches to be applied for files that didn't exist on the old branch.	2011-04-26 10:22:59 +02:00
Trond Myklebust	7494d00c7b	SUNRPC: Allow RPC calls to return ETIMEDOUT instead of EIO On occasion, it is useful for the NFS layer to distinguish between soft timeouts and other EIO errors due to (say) encoding errors, or authentication errors. The following patch ensures that the default behaviour of the RPC layer remains to return EIO on soft timeouts (until we have audited all the callers). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-04-24 14:28:45 -04:00
Bryan Schumaker	468f86134e	NFSv4.1: Don't update sequence number if rpc_task is not sent If we fail to contact the gss upcall program, then no message will be sent to the server. The client still updated the sequence number, however, and this lead to NFS4ERR_SEQ_MISMATCH for the next several RPC calls. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-04-18 17:05:48 -04:00
Trond Myklebust	e3b2854faa	SUNRPC: Fix the SUNRPC Kerberos V RPCSEC_GSS module dependencies Since kernel 2.6.35, the SUNRPC Kerberos support has had an implicit dependency on a number of additional crypto modules. The following patch makes that dependency explicit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-04-15 18:28:18 -04:00
Bryan Schumaker	d1a8016a2d	NFS: Fix infinite loop in gss_create_upcall() There can be an infinite loop if gss_create_upcall() is called without the userspace program running. To prevent this, we return -EACCES if we notice that pipe_version hasn't changed (indicating that the pipe has not been opened). Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-04-13 15:12:22 -04:00
Justin P. Mattock	6eab04a876	treewide: remove extra semicolons Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-04-10 17:01:05 +02:00
J. Bruce Fields	8985ef0b8a	svcrpc: complete svsk processing on cb receive failure Currently when there's some failure to receive a callback (because we couldn't find a matching xid, for example), we exit svc_recv with sk_tcplen still set but without any pages saved with the socket. This will cause a crash later in svc_tcp_restore_pages. Instead, make sure we reset that tcp information whether the callback received failed or succeeded. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-10 10:47:46 -04:00
Linus Torvalds	94c8a984ae	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC NFS: Fix a signed vs. unsigned secinfo bug Revert "net/sunrpc: Use static const char arrays"	2011-04-08 11:47:35 -07:00
Olga Kornievskaia	9660439861	svcrpc: take advantage of tcp autotuning Allow the NFSv4 server to make use of TCP autotuning behaviour, which was previously disabled by setting the sk_userlocks variable. Set the receive buffers to be big enough to receive the whole RPC request, and set this for the listening socket, not the accept socket. Remove the code that readjusts the receive/send buffer sizes for the accepted socket. Previously this code was used to influence the TCP window management behaviour, which is no longer needed when autotuning is enabled. This can improve IO bandwidth on networks with high bandwidth-delay products, where a large tcp window is required. It also simplifies performance tuning, since getting adequate tcp buffers previously required increasing the number of nfsd threads. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-04-07 14:36:40 -04:00
J. Bruce Fields	31d68ef65c	SUNRPC: Don't wait for full record to receive tcp data Ensure that we immediately read and buffer data from the incoming TCP stream so that we grow the receive window quickly, and don't deadlock on large READ or WRITE requests. Also do some minor exit cleanup. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00
Trond Myklebust	586c52cc61	svcrpc: copy cb reply instead of pages It's much simpler just to copy the cb reply data than to play tricks with pages. Callback replies will typically be very small (at least until we implement cb_getattr, in which case files with very long ACLs could pose a problem), so there's no loss in efficiency. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00
J. Bruce Fields	cc6c2127f2	svcrpc: close connection if client sends short packet If the client sents a record too short to contain even the beginning of the rpc header, then just close the connection. The current code drops the record data and continues. I don't see the point. It's a hopeless situation and simpler just to cut off the connection completely. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00
J. Bruce Fields	48e6555c7b	svcrpc: note network-order types in svc_process_calldir Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00

1 2 3 4 5 ...

1533 Commits