[ Upstream commit 0ce772fe79 ]
The p9_tag_alloc() does not initialize the transport error t_err field.
The struct p9_req_t *req is allocated and stored in a struct p9_client
variable. The field t_err is never initialized before p9_conn_cancel()
checks its value.
KUMSAN(KernelUninitializedMemorySantizer, a new error detection tool)
reports this bug.
==================================================================
BUG: KUMSAN: use of uninitialized memory in p9_conn_cancel+0x2d9/0x3b0
Read of size 4 at addr ffff88805f9b600c by task kworker/1:2/1216
CPU: 1 PID: 1216 Comm: kworker/1:2 Not tainted 5.2.0-rc4+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Workqueue: events p9_write_work
Call Trace:
dump_stack+0x75/0xae
__kumsan_report+0x17c/0x3e6
kumsan_report+0xe/0x20
p9_conn_cancel+0x2d9/0x3b0
p9_write_work+0x183/0x4a0
process_one_work+0x4d1/0x8c0
worker_thread+0x6e/0x780
kthread+0x1ca/0x1f0
ret_from_fork+0x35/0x40
Allocated by task 1979:
save_stack+0x19/0x80
__kumsan_kmalloc.constprop.3+0xbc/0x120
kmem_cache_alloc+0xa7/0x170
p9_client_prepare_req.part.9+0x3b/0x380
p9_client_rpc+0x15e/0x880
p9_client_create+0x3d0/0xac0
v9fs_session_init+0x192/0xc80
v9fs_mount+0x67/0x470
legacy_get_tree+0x70/0xd0
vfs_get_tree+0x4a/0x1c0
do_mount+0xba9/0xf90
ksys_mount+0xa8/0x120
__x64_sys_mount+0x62/0x70
do_syscall_64+0x6d/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 0:
(stack is not available)
The buggy address belongs to the object at ffff88805f9b6008
which belongs to the cache p9_req_t of size 144
The buggy address is located 4 bytes inside of
144-byte region [ffff88805f9b6008, ffff88805f9b6098)
The buggy address belongs to the page:
page:ffffea00017e6d80 refcount:1 mapcount:0 mapping:ffff888068b63740 index:0xffff88805f9b7d90 compound_mapcount: 0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 ffff888068b66450 ffff888068b66450 ffff888068b63740
raw: ffff88805f9b7d90 0000000000100001 00000001ffffffff 0000000000000000
page dumped because: kumsan: bad access detected
==================================================================
Link: http://lkml.kernel.org/r/20190613070854.10434-1-shuaibinglu@126.com
Signed-off-by: Lu Shuaibing <shuaibinglu@126.com>
[dominique.martinet@cea.fr: grouped the added init with the others]
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef5305f1f7 ]
strcpy to dirent->d_name could overflow the buffer, use strscpy to check
the provided string length and error out if the size was too big.
While we are here, make the function return an error when the pdu
parsing failed, instead of returning the pdu offset as if it had been a
success...
Link: http://lkml.kernel.org/r/1536339057-21974-4-git-send-email-asmadeus@codewreck.org
Addresses-Coverity-ID: 139133 ("Copy into fixed size buffer")
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 728356dede ]
To avoid use-after-free(s), use a refcount to keep track of the
usable references to any instantiated struct p9_req_t.
This commit adds p9_req_put(), p9_req_get() and p9_req_try_get() as
wrappers to kref_put(), kref_get() and kref_get_unless_zero().
These are used by the client and the transports to keep track of
valid requests' references.
p9_free_req() is added back and used as callback by kref_put().
Add SLAB_TYPESAFE_BY_RCU as it ensures that the memory freed by
kmem_cache_free() will not be reused for another type until the rcu
synchronisation period is over, so an address gotten under rcu read
lock is safe to inc_ref() without corrupting random memory while
the lock is held.
Link: http://lkml.kernel.org/r/1535626341-20693-1-git-send-email-asmadeus@codewreck.org
Co-developed-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+467050c1ce275af2a5b8@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 91a76be37f ]
Having a specific cache for the fcall allocations helps speed up
end-to-end latency.
The caches will automatically be merged if there are multiple caches
of items with the same size so we do not need to try to share a cache
between different clients of the same size.
Since the msize is negotiated with the server, only allocate the cache
after that negotiation has happened - previous allocations or
allocations of different sizes (e.g. zero-copy fcall) are made with
kmalloc directly.
Some figures on two beefy VMs with Connect-IB (sriov) / trans=rdma,
with ior running 32 processes in parallel doing small 32 bytes IOs:
- no alloc (4.18-rc7 request cache): 65.4k req/s
- non-power of two alloc, no patch: 61.6k req/s
- power of two alloc, no patch: 62.2k req/s
- non-power of two alloc, with patch: 64.7k req/s
- power of two alloc, with patch: 65.1k req/s
Link: http://lkml.kernel.org/r/1532943263-24378-2-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Greg Kurz <groug@kaod.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Pull 9p updates from Dominique Martinet:
"This contains mostly fixes (6 to be backported to stable) and a few
changes, here is the breakdown:
- rework how fids are attributed by replacing some custom tracking in
a list by an idr
- for packet-based transports (virtio/rdma) validate that the packet
length matches what the header says
- a few race condition fixes found by syzkaller
- missing argument check when NULL device is passed in sys_mount
- a few virtio fixes
- some spelling and style fixes"
* tag '9p-for-4.19-2' of git://github.com/martinetd/linux: (21 commits)
net/9p/trans_virtio.c: add null terminal for mount tag
9p/virtio: fix off-by-one error in sg list bounds check
9p: fix whitespace issues
9p: fix multiple NULL-pointer-dereferences
fs/9p/xattr.c: catch the error of p9_client_clunk when setting xattr failed
9p: validate PDU length
net/9p/trans_fd.c: fix race by holding the lock
net/9p/trans_fd.c: fix race-condition by flushing workqueue before the kfree()
net/9p/virtio: Fix hard lockup in req_done
net/9p/trans_virtio.c: fix some spell mistakes in comments
9p/net: Fix zero-copy path in the 9p virtio transport
9p: Embed wait_queue_head into p9_req_t
9p: Replace the fidlist with an IDR
9p: Change p9_fid_create calling convention
9p: Fix comment on smp_wmb
net/9p/client.c: version pointer uninitialized
fs/9p/v9fs.c: fix spelling mistake "Uknown" -> "Unknown"
net/9p: fix error path of p9_virtio_probe
9p/net/protocol.c: return -ENOMEM when kmalloc() failed
net/9p/client.c: add missing '\n' at the end of p9_debug()
...
rdma.git merge resolution for the 4.19 merge window
Conflicts:
drivers/infiniband/core/rdma_core.c
- Use the rdma code and revise with the new spelling for
atomic_fetch_add_unless
drivers/nvme/host/rdma.c
- Replace max_sge with max_send_sge in new blk code
drivers/nvme/target/rdma.c
- Use the blk code and revise to use NULL for ib_post_recv when
appropriate
- Replace max_sge with max_recv_sge in new blk code
net/rds/ib_send.c
- Use the net code and revise to use NULL for ib_post_recv when
appropriate
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Resolve merge conflicts from the -rc cycle against the rdma.git tree:
Conflicts:
drivers/infiniband/core/uverbs_cmd.c
- New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
- Merge removal of file->ucontext in for-next with new code in -rc
drivers/infiniband/core/uverbs_main.c
- for-next removed code from ib_uverbs_write() that was modified
in for-rc
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
chan->tag is Non-null terminated which will result in printing messy code
when debugging code. So we should add '\0' for tag to make the code more
convenient and robust. In addition, I drop char->tag_len to simplify the
code.
Link: http://lkml.kernel.org/r/5B641ECC.5030401@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
The zero-copy optimization when reading or writing large chunks of data
is quite useful. However, the 9p messages created through the zero-copy
write path have an incorrect message size: it should be the size of the
header + size of the data being written but instead it's just the size
of the header.
This only works if the server ignores the size field of the message and
otherwise breaks the framing of the protocol. Fix this by re-writing the
message size field with the correct value.
Tested by running `dd if=/dev/zero of=out bs=4k count=1` inside a
virtio-9p mount.
Link: http://lkml.kernel.org/r/20180717003529.114368-1-chirantan@chromium.org
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Cc: Dylan Reid <dgreid@chromium.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Return NULL instead of ERR_PTR when we can't allocate a FID. The ENOSPC
return value was getting all the way back to userspace, and that's
confusing for a userspace program which isn't expecting read() to tell it
there's no space left on the filesystem. The best error we can return to
indicate a temporary failure caused by lack of client resources is ENOMEM.
Maybe it would be better to sleep until a FID is available, but that's
not a change I'm comfortable making.
Link: http://lkml.kernel.org/r/20180711210225.19730-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Yiwen Jiang <jiangyiwen@huwei.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
The ret is modified after initalization, so just remove it and
return 0.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of declaring and passing a dummy 'bad_wr' pointer, pass NULL
as third argument to ib_post_(send|recv|srq_recv)().
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In my testing, the second mount will fail after umounting successfully.
The reason is that we put refcount of trans_mod in the correct case
rather than the error case in parse_opts() at last. That will cause the
refcount decrease to -1, and when we try to get trans_mod again in
try_module_get(), we could only increase refcount to 0 which will cause
failure as follows:
parse_opts
v9fs_get_trans_by_name
try_module_get : return NULL to caller which cause error
So we should put refcount of trans_mod in error case.
Link: http://lkml.kernel.org/r/5B3F39A0.2030509@huawei.com
Fixes: 9421c3e641 ("net/9p/client.c: fix potential refcnt problem of trans module")
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
Tested-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
1) Add Maglev hashing scheduler to IPVS, from Inju Song.
2) Lots of new TC subsystem tests from Roman Mashak.
3) Add TCP zero copy receive and fix delayed acks and autotuning with
SO_RCVLOWAT, from Eric Dumazet.
4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
Brouer.
5) Add ttl inherit support to vxlan, from Hangbin Liu.
6) Properly separate ipv6 routes into their logically independant
components. fib6_info for the routing table, and fib6_nh for sets of
nexthops, which thus can be shared. From David Ahern.
7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
messages from XDP programs. From Nikita V. Shirokov.
8) Lots of long overdue cleanups to the r8169 driver, from Heiner
Kallweit.
9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.
10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.
11) Plumb extack down into fib_rules, from Roopa Prabhu.
12) Add Flower classifier offload support to igb, from Vinicius Costa
Gomes.
13) Add UDP GSO support, from Willem de Bruijn.
14) Add documentation for eBPF helpers, from Quentin Monnet.
15) Add TLS tx offload to mlx5, from Ilya Lesokhin.
16) Allow applications to be given the number of bytes available to read
on a socket via a control message returned from recvmsg(), from
Soheil Hassas Yeganeh.
17) Add x86_32 eBPF JIT compiler, from Wang YanQing.
18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
From Björn Töpel.
19) Remove indirect load support from all of the BPF JITs and handle
these operations in the verifier by translating them into native BPF
instead. From Daniel Borkmann.
20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.
21) Allow XDP programs to do lookups in the main kernel routing tables
for forwarding. From David Ahern.
22) Allow drivers to store hardware state into an ELF section of kernel
dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.
23) Various RACK and loss detection improvements in TCP, from Yuchung
Cheng.
24) Add TCP SACK compression, from Eric Dumazet.
25) Add User Mode Helper support and basic bpfilter infrastructure, from
Alexei Starovoitov.
26) Support ports and protocol values in RTM_GETROUTE, from Roopa
Prabhu.
27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
Brouer.
28) Add lots of forwarding selftests, from Petr Machata.
29) Add generic network device failover driver, from Sridhar Samudrala.
* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
strparser: Add __strp_unpause and use it in ktls.
rxrpc: Fix terminal retransmission connection ID to include the channel
net: hns3: Optimize PF CMDQ interrupt switching process
net: hns3: Fix for VF mailbox receiving unknown message
net: hns3: Fix for VF mailbox cannot receiving PF response
bnx2x: use the right constant
Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
enic: fix UDP rss bits
netdev-FAQ: clarify DaveM's position for stable backports
rtnetlink: validate attributes in do_setlink()
mlxsw: Add extack messages for port_{un, }split failures
netdevsim: Add extack error message for devlink reload
devlink: Add extack to reload and port_{un, }split operations
net: metrics: add proper netlink validation
ipmr: fix error path when ipmr_new_table fails
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
net: hns3: remove unused hclgevf_cfg_func_mta_filter
netfilter: provide udp*_lib_lookup for nf_tproxy
qed*: Utilize FW 8.37.2.0
...
Pull aio updates from Al Viro:
"Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.
The only thing I'm holding back for a day or so is Adam's aio ioprio -
his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
but let it sit in -next for decency sake..."
* 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
aio: sanitize the limit checking in io_submit(2)
aio: fold do_io_submit() into callers
aio: shift copyin of iocb into io_submit_one()
aio_read_events_ring(): make a bit more readable
aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
aio: take list removal to (some) callers of aio_complete()
aio: add missing break for the IOCB_CMD_FDSYNC case
random: convert to ->poll_mask
timerfd: convert to ->poll_mask
eventfd: switch to ->poll_mask
pipe: convert to ->poll_mask
crypto: af_alg: convert to ->poll_mask
net/rxrpc: convert to ->poll_mask
net/iucv: convert to ->poll_mask
net/phonet: convert to ->poll_mask
net/nfc: convert to ->poll_mask
net/caif: convert to ->poll_mask
net/bluetooth: convert to ->poll_mask
net/sctp: convert to ->poll_mask
net/tipc: convert to ->poll_mask
...