commit d58431eacb upstream.
A recent commit added a call to cache_fresh_locked()
when an expired item was found.
The call sets the CACHE_VALID flag, so it is important
that the item actually is valid.
There are two ways it could be valid:
1/ If ->update has been called to fill in relevant content
2/ if CACHE_NEGATIVE is set, to say that content doesn't exist.
An expired item that is waiting for an update will be neither.
Setting CACHE_VALID will mean that a subsequent call to cache_put()
will be likely to dereference uninitialised pointers.
So we must make sure the item is valid, and we already have code to do
that in try_to_negate_entry(). This takes the hash lock and so cannot
be used directly, so take out the two lines that we need and use them.
Now cache_fresh_locked() is certain to be called only on
a valid item.
Cc: stable@kernel.org # 2.6.35
Fixes: 4ecd55ea07 ("sunrpc: fix cache_head leak due to queued request")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6abc8caa6 upstream.
If there are multiple callbacks queued, waiting for the callback
slot when the callback gets shut down, then they all currently
end up acting as if they hold the slot, and call
nfsd4_cb_sequence_done() resulting in interesting side-effects.
In addition, the 'retry_nowait' path in nfsd4_cb_sequence_done()
causes a loop back to nfsd4_cb_prepare() without first freeing the
slot, which causes a deadlock when nfsd41_cb_get_slot() gets called
a second time.
This patch therefore adds a boolean to track whether or not the
callback did pick up the slot, so that it can do the right thing
in these 2 cases.
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bcb344086 upstream.
Ben reported tripping the BUG_ON in create_request_message during some
performance testing. Analysis of the vmcore showed that the length of
the r_dentry->d_name string changed after we allocated the buffer, but
before we encoded it.
build_dentry_path returns pointers to d_name in the common case of
non-snapped dentries, but this optimization isn't safe unless the parent
directory is locked. When it isn't, have the code make a copy of the
d_name while holding the d_lock.
Cc: stable@vger.kernel.org
Reported-by: Ben England <bengland@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c39f7f671 upstream.
Current implementation was not properly handling frwr memory
registrations. This was uncovered by commit 27f26cec761das ("xprtrdma:
Plant XID in on-the-wire RDMA offset (FRWR)") in which xprtrdma, which is
used for NFS over RDMA, started failing as it was the first ULP to modify
the ib_mr iova resulting in the NFS server getting REMOTE ACCESS ERROR
when attempting to perform RDMA Writes to the client.
The fix is to properly capture the true iova, offset, and length in the
call to ib_map_mr_sg, and then update the iova when processing the
IB_WR_REG_MEM on the send queue.
Fixes: a41081aa59 ("IB/rdmavt: Add support for ib_map_mr_sg")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Josh Collier <josh.d.collier@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79b4a9cf0e upstream.
Commit 4c21b8fd8f (MIPS: seccomp: Handle indirect system calls (o32))
added indirect syscall detection for O32 processes running on MIPS64,
but it did not work correctly for big endian kernel/processes. The
reason is that the syscall number is loaded from ARG1 using the lw
instruction while this is a 64-bit value, so zero is loaded instead of
the syscall number.
Fix the code by using the ld instruction instead. When running a 32-bit
processes on a 64 bit CPU, the values are properly sign-extended, so it
ensures the value passed to syscall_trace_enter is correct.
Recent systemd versions with seccomp enabled whitelist the getpid
syscall for their internal processes (e.g. systemd-journald), but call
it through syscall(SYS_getpid). This fix therefore allows O32 big endian
systems with a 64-bit kernel to run recent systemd versions.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: <stable@vger.kernel.org> # v3.15+
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 102bbe34b3 upstream.
When setting sync EIC as IRQ_TYPE_EDGE_BOTH type, we missed to set the
SPRD_EIC_SYNC_INTMODE register to 0, which means detecting edge signals.
Thus this patch fixes the issue.
Fixes: 25518e024e ("gpio: Add Spreadtrum EIC driver support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b987222654 upstream.
This fixes multiple issues in buffer_pipe_buf_ops:
- The ->steal() handler must not return zero unless the pipe buffer has
the only reference to the page. But generic_pipe_buf_steal() assumes
that every reference to the pipe is tracked by the page's refcount,
which isn't true for these buffers - buffer_pipe_buf_get(), which
duplicates a buffer, doesn't touch the page's refcount.
Fix it by using generic_pipe_buf_nosteal(), which refuses every
attempted theft. It should be easy to actually support ->steal, but the
only current users of pipe_buf_steal() are the virtio console and FUSE,
and they also only use it as an optimization. So it's probably not worth
the effort.
- The ->get() and ->release() handlers can be invoked concurrently on pipe
buffers backed by the same struct buffer_ref. Make them safe against
concurrency by using refcount_t.
- The pointers stored in ->private were only zeroed out when the last
reference to the buffer_ref was dropped. As far as I know, this
shouldn't be necessary anyway, but if we do it, let's always do it.
Link: http://lkml.kernel.org/r/20190404215925.253531-1-jannh@google.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Fixes: 73a757e631 ("ring-buffer: Return reader page back into existing ring buffer")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91862cc786 upstream.
In trace_pid_write(), the buffer for trace parser is allocated through
kmalloc() in trace_parser_get_init(). Later on, after the buffer is used,
it is then freed through kfree() in trace_parser_put(). However, it is
possible that trace_pid_write() is terminated due to unexpected errors,
e.g., ENOMEM. In that case, the allocated buffer will not be freed, which
is a memory leak bug.
To fix this issue, free the allocated buffer when an error is encountered.
Link: http://lkml.kernel.org/r/1555726979-15633-1-git-send-email-wang6495@umn.edu
Fixes: f4d34a87e9 ("tracing: Use pid bitmap instead of a pid array for set_event_pid")
Cc: stable@vger.kernel.org
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 652727bbe1 upstream.
A path-based rename returning EBUSY will incorrectly try opening
the file with a cifs (NT Create AndX) operation on an smb2+ mount,
which causes the server to force a session close.
If the mount is smb2+, skip the fallback.
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3acca1dd17 ]
In the original patch I missed to add mv88e6xxx_ports_cmode_init()
to the second probe function, the one for the new DSA framework.
Fixes: ed8fe20205 ("net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode")
Reported-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c97617a81a ]
A call of pci_iounmap() call without CONFIG_PCI leads to a build error
on some architectures. We tried to address this and add a check of
IS_ENABLED(CONFIG_PCI), but this still doesn't seem enough for sh.
Ideally we should fix it globally, it's really a corner case, so let's
paper over it with a simpler ifdef.
Fixes: 1e73359a24 ("ALSA: hda/ca0132 - make pci_iounmap() call conditional")
Reported-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dd9a994fc6 ]
Commit b5b4453e79 ("powerpc/vdso64: Fix CLOCK_MONOTONIC
inconsistencies across Y2038") changed the type of wtom_clock_sec
to s64 on PPC64. Therefore, VDSO32 needs to read it with a 4 bytes
shift in order to retrieve the lower part of it.
Fixes: b5b4453e79 ("powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038")
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c93a49b976 ]
When CONFIG_IP_VS_IPV6 is not defined, build produced this warning:
net/netfilter/ipvs/ip_vs_ctl.c:899:6: warning: unused variable ‘ret’ [-Wunused-variable]
int ret = 0;
^~~
Fix this by moving the declaration of 'ret' in the CONFIG_IP_VS_IPV6
section in the same function.
While at it, drop its unneeded initialisation.
Fixes: 098e13f5b2 ("ipvs: fix dependency on nf_defrag_ipv6")
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2ac695d1d6 ]
Syzbot found a crash:
BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x54f/0xcd0 net/tipc/netlink_compat.c:872
Call Trace:
tipc_nl_compat_name_table_dump+0x54f/0xcd0 net/tipc/netlink_compat.c:872
__tipc_nl_compat_dumpit+0x59e/0xda0 net/tipc/netlink_compat.c:215
tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:280
tipc_nl_compat_handle net/tipc/netlink_compat.c:1226 [inline]
tipc_nl_compat_recv+0x1b5f/0x2750 net/tipc/netlink_compat.c:1265
genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:622 [inline]
sock_sendmsg net/socket.c:632 [inline]
Uninit was created at:
__alloc_skb+0x309/0xa20 net/core/skbuff.c:208
alloc_skb include/linux/skbuff.h:1012 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
sock_sendmsg_nosec net/socket.c:622 [inline]
sock_sendmsg net/socket.c:632 [inline]
It was supposed to be fixed on commit 974cb0e3e7 ("tipc: fix uninit-value
in tipc_nl_compat_name_table_dump") by checking TLV_GET_DATA_LEN(msg->req)
in cmd->header()/tipc_nl_compat_name_table_dump_header(), which is called
ahead of tipc_nl_compat_name_table_dump().
However, tipc_nl_compat_dumpit() doesn't handle the error returned from cmd
header function. It means even when the check added in that fix fails, it
won't stop calling tipc_nl_compat_name_table_dump(), and the issue will be
triggered again.
So this patch is to add the process for the err returned from cmd header
function in tipc_nl_compat_dumpit().
Reported-by: syzbot+3ce8520484b0d4e260a5@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b318f30ab ]
The Phylink validate function is the Marvell PPv2 driver makes a check
on the GoP id. This is valid an has to be done when using PPv2.2 engines
but makes no sense when using PPv2.1. The check done when using an RGMII
interface makes sure the GoP id is not 0, but this breaks PPv2.1. Fixes
it.
Fixes: 0fb628f0f2 ("net: mvpp2: fix phylink handling of invalid PHY modes")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 986103e792 ]
Commit a5681e20b5 ("net/ibmnvic: Fix deadlock problem
in reset") made the change to hold the RTNL lock during
driver reset but still calls netdev_notify_peers, which
results in a deadlock. Instead, use call_netdevice_notifiers,
which is functionally the same except that it does not
take the RTNL lock again.
Fixes: a5681e20b5 ("net/ibmnvic: Fix deadlock problem in reset")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ffcd32f64 ]
Proper use counter updates when activating and deactivating the object,
otherwise, this hits bogus EBUSY error.
Fixes: cd5125d8f5 ("netfilter: nf_tables: split set destruction in deactivate and destroy phase")
Reported-by: Laura Garcia <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 273fe3f100 ]
Set deletion after flush coming in the same batch results in EBUSY. Add
set use counter to track the number of references to this set from
rules. We cannot rely on the list of bindings for this since such list
is still populated from the preparation phase.
Reported-by: Václav Zindulka <vaclav.zindulka@tlapnet.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 40ba1d9b4d ]
The abort path can cause a double-free of an anonymous set.
Added-and-to-be-aborted rule looks like this:
udp dport { 137, 138 } drop
The to-be-aborted transaction list looks like this:
newset
newsetelem
newsetelem
rule
This gets walked in reverse order, so first pass disables the rule, the
set elements, then the set.
After synchronize_rcu(), we then destroy those in same order: rule, set
element, set element, newset.
Problem is that the anonymous set has already been bound to the rule, so
the rule (lookup expression destructor) already frees the set, when then
cause use-after-free when trying to delete the elements from this set,
then try to free the set again when handling the newset expression.
Rule releases the bound set in first place from the abort path, this
causes the use-after-free on set element removal when undoing the new
element transactions. To handle this, skip new element transaction if
set is bound from the abort path.
This is still causes the use-after-free on set element removal. To
handle this, remove transaction from the list when the set is already
bound.
Joint work with Florian Westphal.
Fixes: f6ac858589 ("netfilter: nf_tables: unbind set in rule from commit path")
Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1325
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8e2040063 ]
Add .release_ops, that is called in case of error at a later stage in
the expression initialization path, ie. .select_ops() has been already
set up operations and that needs to be undone. This allows us to unwind
.select_ops from the error path, ie. release the dynamic operations for
this extension.
Moreover, allocate one single operation instead of recycling them, this
comes at the cost of consuming a bit more memory per rule, but it
simplifies the infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 947e492c0f ]
When I moved the refcount to refcount_t type I missed the fact that
refcount_inc() will result in use-after-free warning with
CONFIG_REFCOUNT_FULL=y builds.
The correct fix would be to init the reference count to 1 at allocation
time, but, unfortunately we cannot do this, as we can't undo that
in case something else fails later in the batch.
So only solution I see is to special-case the 'new entry' condition
and replace refcount_inc() with a "delayed" refcount_set(1) in this case,
as done here.
The .activate callback can be removed to simplify things, we only
need to make sure that deactivate() decrements/unlinks the entry
from the list at end of transaction phase (commit or abort).
Fixes: 12c44aba66 ("netfilter: nft_compat: use refcnt_t type for nft_xt reference count")
Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Anonymous sets that are bound to rules from the same transaction trigger
a kernel splat from the abort path due to double set list removal and
double free.
This patch updates the logic to search for the transaction that is
responsible for creating the set and disable the set list removal and
release, given the rule is now responsible for this. Lookup is reverse
since the transaction that adds the set is likely to be at the tail of
the list.
Moreover, this patch adds the unbind step to deliver the event from the
commit path. This should not be done from the worker thread, since we
have no guarantees of in-order delivery to the listener.
This patch removes the assumption that both activate and deactivate
callbacks need to be provided.
Fixes: cd5125d8f5 ("netfilter: nf_tables: split set destruction in deactivate and destroy phase")
Reported-by: Mikhail Morfikov <mmorfikov@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
->destroy is only allowed to free data, or do other cleanups that do not
have side effects on other state, such as visibility to other netlink
requests.
Such things need to be done in ->deactivate.
As a transaction can fail, we need to make sure we can undo such
operations, therefore ->activate() has to be provided too.
So print a warning and refuse registration if expr->ops provides
only one of the two operations.
v2: fix nft_expr_check_ops to not repeat same check twice (Jones Desougi)
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The nft_compat destroy function deletes the nft_xt object from a list.
This isn't allowed anymore. Destroy functions are called asynchronously,
i.e. next batch can find the object that has a pending ->destroy()
invocation:
cpu0 cpu1
worker
->destroy for_each_entry()
if (x == ...
return x->ops;
list_del(x)
kfree_rcu(x)
expr->ops->... // ops was free'd
To resolve this, the list_del needs to occur before the transaction
mutex gets released. nf_tables has a 'deactivate' hook for this
purpose, so use that to unlink the object from the list.
Fixes: 0935d55884 ("netfilter: nf_tables: asynchronous release")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[ Upstream commit cd5125d8f5 ]
Splits unbind_set into destroy_set and unbinding operation.
Unbinding removes set from lists (so new transaction would not
find it anymore) but keeps memory allocated (so packet path continues
to work).
Rebind function is added to allow unrolling in case transaction
that wants to remove set is aborted.
Destroy function is added to free the memory, but this could occur
outside of transaction in the future.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf52572ebb ]
There are two problems with nft_compat since the netlink config
plane uses a per-netns mutex:
1. Concurrent add/del accesses to the same list
2. accesses to a list element after it has been free'd already.
This patch fixes the first problem.
Freeing occurs from a work queue, after transaction mutexes have been
released, i.e., it still possible for a new transaction (even from
same net ns) to find the to-be-deleted expression in the list.
The ->destroy functions are not allowed to have any such side effects,
i.e. the list_del() in the destroy function is not allowed.
This part of the problem is solved in the next patch.
I tried to make this work by serializing list access via mutex
and by moving list_del() to a deactivate callback, but
Taehee spotted following race on this approach:
NET #0 NET #1
>select_ops()
->init()
->select_ops()
->deactivate()
->destroy()
nft_xt_put()
kfree_rcu(xt, rcu_head);
->init() <-- use-after-free occurred.
Unfortunately, we can't increment reference count in
select_ops(), because we can't undo the refcount increase in
case a different expression fails in the same batch.
(The destroy hook will only be called in case the expression
was initialized successfully).
Fixes: f102d66b33 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 12c44aba66 ]
Using standard integer type was fine while all operations on it were
guarded by the nftnl subsys mutex.
This isn't true anymore:
1. transactions are guarded only by a pernet mutex, so concurrent
rule manipulation in different netns is racy
2. the ->destroy hook runs from a work queue after the transaction
mutex has been released already.
cpu0 cpu1 (net 1) cpu2 (net 2)
kworker
nft_compat->destroy nft_compat->init nft_compat->init
if (--nft_xt->ref == 0) nft_xt->ref++ nft_xt->ref++
Switch to refcount_t. Doing this however only fixes a minor aspect,
nft_compat also performs linked-list operations in an unsafe way.
This is addressed in the next two patches.
Fixes: f102d66b33 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Fixes: 0935d55884 ("netfilter: nf_tables: asynchronous release")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 9002b21465 upstream.
Commit 32a5ad9c22 ("sysctl: handle overflow for file-max") hooked up
min/max values for the file-max sysctl parameter via the .extra1 and
.extra2 fields in the corresponding struct ctl_table entry.
Unfortunately, the minimum value points at the global 'zero' variable,
which is an int. This results in a KASAN splat when accessed as a long
by proc_doulongvec_minmax on 64-bit architectures:
| BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x5d8/0x6a0
| Read of size 8 at addr ffff2000133d1c20 by task systemd/1
|
| CPU: 0 PID: 1 Comm: systemd Not tainted 5.1.0-rc3-00012-g40b114779944 #2
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace+0x0/0x228
| show_stack+0x14/0x20
| dump_stack+0xe8/0x124
| print_address_description+0x60/0x258
| kasan_report+0x140/0x1a0
| __asan_report_load8_noabort+0x18/0x20
| __do_proc_doulongvec_minmax+0x5d8/0x6a0
| proc_doulongvec_minmax+0x4c/0x78
| proc_sys_call_handler.isra.19+0x144/0x1d8
| proc_sys_write+0x34/0x58
| __vfs_write+0x54/0xe8
| vfs_write+0x124/0x3c0
| ksys_write+0xbc/0x168
| __arm64_sys_write+0x68/0x98
| el0_svc_common+0x100/0x258
| el0_svc_handler+0x48/0xc0
| el0_svc+0x8/0xc
|
| The buggy address belongs to the variable:
| zero+0x0/0x40
|
| Memory state around the buggy address:
| ffff2000133d1b00: 00 00 00 00 00 00 00 00 fa fa fa fa 04 fa fa fa
| ffff2000133d1b80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
| >ffff2000133d1c00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
| ^
| ffff2000133d1c80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00
| ffff2000133d1d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Fix the splat by introducing a unsigned long 'zero_ul' and using that
instead.
Link: http://lkml.kernel.org/r/20190403153409.17307-1-will.deacon@arm.com
Fixes: 32a5ad9c22 ("sysctl: handle overflow for file-max")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Christian Brauner <christian@brauner.io>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b59dfdaef1 upstream.
Commit 9ee3e06610 ("HID: i2c-hid: override HID descriptors for certain
devices") added a new dmi_system_id quirk table to override certain HID
report descriptors for some systems that lack them.
But the table wasn't properly terminated, causing the dmi matching to
walk off into la-la-land, and starting to treat random data as dmi
descriptor pointers, causing boot-time oopses if you were at all
unlucky.
Terminate the array.
We really should have some way to just statically check that arrays that
should be terminated by an empty entry actually are so. But the HID
people really should have caught this themselves, rather than have me
deal with an oops during the merge window. Tssk, tssk.
Cc: Julian Sax <jsbc@gmx.de>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ambrož Bizjak <abizjak.pro@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba4aa02b41 upstream.
So that we reduce the difference of tools/include/linux/bitops.h to the
original kernel file, include/linux/bitops.h, trying to remove the need
to define BITS_PER_LONG, to avoid clashes with asm/bitsperlong.h.
And the things removed from tools/include/linux/bitops.h are really in
linux/bits.h, so that we can have a copy and then
tools/perf/check_headers.sh will tell us when new stuff gets added to
linux/bits.h so that we can check if it is useful and if any adjustment
needs to be done to the tools/{include,arch}/ copies.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-y1sqyydvfzo0bjjoj4zsl562@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00206a69ee upstream.
Since commit ad67b74d24 ("printk: hash addresses printed with %p"),
at boot "____ptrval____" is printed instead of actual addresses:
percpu: Embedded 38 pages/cpu @(____ptrval____) s124376 r0 d31272 u524288
Instead of changing the print to "%px", and leaking kernel addresses,
just remove the print completely, cfr. e.g. commit 071929dbdd
("arm64: Stop printing the virtual memory layout").
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c2f870890 upstream.
The ALSA proc helper manages the child nodes in a linked list, but its
addition and deletion is done without any lock. This leads to a
corruption if they are operated concurrently. Usually this isn't a
problem because the proc entries are added sequentially in the driver
probe procedure itself. But the card registrations are done often
asynchronously, and the crash could be actually reproduced with
syzkaller.
This patch papers over it by protecting the link addition and deletion
with the parent's mutex. There is "access" mutex that is used for the
file access, and this can be reused for this purpose as well.
Reported-by: syzbot+48df349490c36f9f54ab@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>