[ Upstream commit ffff71328a ]
The location of the RBUF overflow and error counters has moved between
different version of the GENET MAC. This commit corrects the driver to
read from the correct locations depending on the version of the GENET
MAC.
Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f138fa609 ]
KMSAN reports a use of uninitialized memory in put_cmsg() because
msg.msg_flags in recvfrom haven't been initialized properly.
The flag values don't affect the result on this path, but it's still a
good idea to initialize them explicitly.
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6bbc4a4144 ]
__do_fault assumes vmf->page has been initialized and is valid if
VM_FAULT_NOPAGE is not returned by vma->vm_ops->fault(vma, vmf).
handle_userfault() in turn should return VM_FAULT_NOPAGE if it doesn't
return VM_FAULT_SIGBUS or VM_FAULT_RETRY (the other two possibilities).
This VM_FAULT_NOPAGE case is only invoked when signal are pending and it
didn't matter for anonymous memory before. It only started to matter
since shmem was introduced. hugetlbfs also takes a different path and
doesn't exercise __do_fault.
Link: http://lkml.kernel.org/r/20170228154201.GH5816@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fd567653bd ]
The driver doesn't have a struct of_device_id table but supported devices
are registered via Device Trees. This is working on the assumption that a
I2C device registered via OF will always match a legacy I2C device ID and
that the MODALIAS reported will always be of the form i2c:<device>.
But this could change in the future so the correct approach is to have an
OF device ID table if the devices are registered via OF.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4dca6ea1d9 upstream.
When the request_key() syscall is not passed a destination keyring, it
links the requested key (if constructed) into the "default" request-key
keyring. This should require Write permission to the keyring. However,
there is actually no permission check.
This can be abused to add keys to any keyring to which only Search
permission is granted. This is because Search permission allows joining
the keyring. keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_SESSION_KEYRING)
then will set the default request-key keyring to the session keyring.
Then, request_key() can be used to add keys to the keyring.
Both negatively and positively instantiated keys can be added using this
method. Adding negative keys is trivial. Adding a positive key is a
bit trickier. It requires that either /sbin/request-key positively
instantiates the key, or that another thread adds the key to the process
keyring at just the right time, such that request_key() misses it
initially but then finds it in construct_alloc_key().
Fix this bug by checking for Write permission to the keyring in
construct_get_dest_keyring() when the default keyring is being used.
We don't do the permission check for non-default keyrings because that
was already done by the earlier call to lookup_user_key(). Also,
request_key_and_link() is currently passed a 'struct key *' rather than
a key_ref_t, so the "possessed" bit is unavailable.
We also don't do the permission check for the "requestor keyring", to
continue to support the use case described by commit 8bbf4976b5
("KEYS: Alter use of key instantiation link-to-keyring argument") where
/sbin/request-key recursively calls request_key() to add keys to the
original requestor's destination keyring. (I don't know of any users
who actually do that, though...)
Fixes: 3e30148c3d ("[PATCH] Keys: Make request-key create an authorisation key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d5afec6b8 upstream.
On a ppc64 machine, when mounting a fuzzed ext2 image (generated by
fsfuzzer) the following call trace is seen,
VFS: brelse: Trying to free free buffer
WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40
.__brelse.part.6+0x20/0x40 (unreliable)
.ext4_find_entry+0x384/0x4f0
.ext4_lookup+0x84/0x250
.lookup_slow+0xdc/0x230
.walk_component+0x268/0x400
.path_lookupat+0xec/0x2d0
.filename_lookup+0x9c/0x1d0
.vfs_statx+0x98/0x140
.SyS_newfstatat+0x48/0x80
system_call+0x58/0x6c
This happens because the directory that ext4_find_entry() looks up has
inode->i_size that is less than the block size of the filesystem. This
causes 'nblocks' to have a value of zero. ext4_bread_batch() ends up not
reading any of the directory file's blocks. This renders the entries in
bh_use[] array to continue to have garbage data. buffer_uptodate() on
bh_use[0] can then return a zero value upon which brelse() function is
invoked.
This commit fixes the bug by returning -ENOENT when the directory file
has no associated blocks.
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c894aa9757 upstream.
Currently, fallocate(2) with KEEP_SIZE followed by a fdatasync(2)
then crash, we'll see wrong allocated block number (stat -c %b), the
blocks allocated beyond EOF are all lost. fstests generic/468
exposes this bug.
Commit 67a7d5f561 ("ext4: fix fdatasync(2) after extent
manipulation operations") fixed all the other extent manipulation
operation paths such as hole punch, zero range, collapse range etc.,
but forgot the fallocate case.
So similarly, fix it by recording the correct journal tid in ext4
inode in fallocate(2) path, so that ext4_sync_file() will wait for
the right tid to be committed on fdatasync(2).
This addresses the test failure in xfstests test generic/468.
Signed-off-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f6a23a213 upstream.
Commit adfa543e73 ("dmatest: don't use set_freezable_with_signal()")
introduced a bug (that is in fact documented by the patch commit text)
that leaves behind a dangling pointer. Since the done_wait structure is
allocated on the stack, future invocations to the DMATEST can produce
undesirable results (e.g., corrupted spinlocks).
Commit a9df21e34b ("dmaengine: dmatest: warn user when dma test times
out") attempted to WARN the user that the stack was likely corrupted but
did not fix the actual issue.
This patch fixes the issue by pushing the wait queue and callback
structs into the the thread structure. If a failure occurs due to time,
dmaengine_terminate_all will force the callback to safely call
wake_up_all() without possibility of using a freed pointer.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197605
Fixes: adfa543e73 ("dmatest: don't use set_freezable_with_signal()")
Reviewed-by: Sinan Kaya <okaya@codeaurora.org>
Suggested-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
Signed-off-by: Adam Wallis <awallis@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f73c52a5bc upstream.
Daniel Wagner reported a crash on the BeagleBone Black SoC.
This is a single CPU architecture, and does not have a functional
arch_send_call_function_single_ipi() implementation which can crash
the kernel if that is called.
As it only has one CPU, it shouldn't be called, but if the kernel is
compiled for SMP, the push/pull RT scheduling logic now calls it for
irq_work if the one CPU is overloaded, it can use that function to call
itself and crash the kernel.
Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
only has a single CPU. But SCHED_FEAT is a constant if sched debugging
is turned off. Another fix can also be used, and this should also help
with normal SMP machines. That is, do not initiate the pull code if
there's only one RT overloaded CPU, and that CPU happens to be the
current CPU that is scheduling in a lower priority task.
Even on a system with many CPUs, if there's many RT tasks waiting to
run on a single CPU, and that CPU schedules in another RT task of lower
priority, it will initiate the PULL logic in case there's a higher
priority RT task on another CPU that is waiting to run. But if there is
no other CPU with waiting RT tasks, it will initiate the RT pull logic
on itself (as it still has RT tasks waiting to run). This is a wasted
effort.
Not only does this help with SMP code where the current CPU is the only
one with RT overloaded tasks, it should also solve the issue that
Daniel encountered, because it will prevent the PULL logic from
executing, as there's only one CPU on the system, and the check added
here will cause it to exit the RT pull code.
Reported-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d9b70f7d5 upstream.
Avoid null pointer dereference if some function is walking through the
devs array accessing members of a new virt_dev that is mid allocation.
Add the virt_dev to xhci->devs[i] _after_ the virt_device and all its
members are properly allocated.
issue found by KASAN: null-ptr-deref in xhci_find_slot_id_by_port
"Quick analysis suggests that xhci_alloc_virt_device() is not mutex
protected. If so, there is a time frame where xhci->devs[slot_id] is set
but not fully initialized. Specifically, xhci->devs[i]->udev can be NULL."
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0085f2510 upstream.
BT-Controller connected as platform non-root-hub device and
usb-driver initialize such device with wakeup disabled,
Ref. usb_new_device().
At present wakeup-capability get enabled by hid-input device from usb
function driver(e.g. BT HID device) at runtime. Again some functional
driver does not set usb-wakeup capability(e.g LE HID device implement
as HID-over-GATT), and can't wakeup the host on USB.
Most of the device operation (such as mass storage) initiated from host
(except HID) and USB wakeup aligned with host resume procedure. For BT
device, usb-wakeup capability need to enable form btusc driver as a
generic solution for multiple profile use case and required for USB remote
wakeup (in-bus wakeup) while host is suspended. Also usb-wakeup feature
need to enable/disable with HCI interface up and down.
Signed-off-by: Sukumar Ghorai <sukumar.ghorai@intel.com>
Signed-off-by: Amit K Bag <amit.k.bag@intel.com>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be6123df1e upstream.
stub_send_ret_submit() handles urb with a potential null transfer_buffer,
when it replays a packet with potential malicious data that could contain
a null buffer. Add a check for the condition when actual_length > 0 and
transfer_buffer is null.
Reported-by: Secunia Research <vuln@secunia.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48a4ff1c7b upstream.
A malicious USB device with crafted descriptors can cause the kernel
to access unallocated memory by setting the bNumInterfaces value too
high in a configuration descriptor. Although the value is adjusted
during parsing, this adjustment is skipped in one of the error return
paths.
This patch prevents the problem by setting bNumInterfaces to 0
initially. The existing code already sets it to the proper value
after parsing is complete.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6235445462 upstream.
There is another JMS567-based USB3 UAS enclosure (152d:0578) that fails
with the following error:
[sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[sda] tag#0 Sense Key : Illegal Request [current]
[sda] tag#0 Add. Sense: Invalid field in cdb
The issue occurs both with UAS (occasionally) and mass storage
(immediately after mounting a FS on a disk in the enclosure).
Enabling US_FL_BROKEN_FUA quirk solves this issue.
This patch adds an UNUSUAL_DEV with US_FL_BROKEN_FUA for the enclosure
for both UAS and mass storage.
Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90e406f96f upstream.
The default NR_CPUS can be very large, but actual possible nr_cpu_ids
usually is very small. For my x86 distribution, the NR_CPUS is 8192 and
nr_cpu_ids is 4. About 2 pages are wasted.
Most machines don't have so many CPUs, so define a array with NR_CPUS
just wastes memory. So let's allocate the buffer dynamically when need.
With this change, the mutext tracing_cpumask_update_lock also can be
removed now, which was used to protect mask_str.
Link: http://lkml.kernel.org/r/1512013183-19107-1-git-send-email-changbin.du@intel.com
Fixes: 36dfe9252b ("ftrace: make use of tracing_cpumask")
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ecaaab5649 upstream.
When asked to encrypt or decrypt 0 bytes, both the generic and x86
implementations of Salsa20 crash in blkcipher_walk_done(), either when
doing 'kfree(walk->buffer)' or 'free_page((unsigned long)walk->page)',
because walk->buffer and walk->page have not been initialized.
The bug is that Salsa20 is calling blkcipher_walk_done() even when
nothing is in 'walk.nbytes'. But blkcipher_walk_done() is only meant to
be called when a nonzero number of bytes have been provided.
The broken code is part of an optimization that tries to make only one
call to salsa20_encrypt_bytes() to process inputs that are not evenly
divisible by 64 bytes. To fix the bug, just remove this "optimization"
and use the blkcipher_walk API the same way all the other users do.
Reproducer:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int algfd, reqfd;
struct sockaddr_alg addr = {
.salg_type = "skcipher",
.salg_name = "salsa20",
};
char key[16] = { 0 };
algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (void *)&addr, sizeof(addr));
reqfd = accept(algfd, 0, 0);
setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
read(reqfd, key, sizeof(key));
}
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: eb6f13eb9f ("[CRYPTO] salsa20_generic: Fix multi-page processing")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af3ff8045b upstream.
Because the HMAC template didn't check that its underlying hash
algorithm is unkeyed, trying to use "hmac(hmac(sha3-512-generic))"
through AF_ALG or through KEYCTL_DH_COMPUTE resulted in the inner HMAC
being used without having been keyed, resulting in sha3_update() being
called without sha3_init(), causing a stack buffer overflow.
This is a very old bug, but it seems to have only started causing real
problems when SHA-3 support was added (requires CONFIG_CRYPTO_SHA3)
because the innermost hash's state is ->import()ed from a zeroed buffer,
and it just so happens that other hash algorithms are fine with that,
but SHA-3 is not. However, there could be arch or hardware-dependent
hash algorithms also affected; I couldn't test everything.
Fix the bug by introducing a function crypto_shash_alg_has_setkey()
which tests whether a shash algorithm is keyed. Then update the HMAC
template to require that its underlying hash algorithm is unkeyed.
Here is a reproducer:
#include <linux/if_alg.h>
#include <sys/socket.h>
int main()
{
int algfd;
struct sockaddr_alg addr = {
.salg_type = "hash",
.salg_name = "hmac(hmac(sha3-512-generic))",
};
char key[4096] = { 0 };
algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (const struct sockaddr *)&addr, sizeof(addr));
setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
}
Here was the KASAN report from syzbot:
BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:341 [inline]
BUG: KASAN: stack-out-of-bounds in sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161
Write of size 4096 at addr ffff8801cca07c40 by task syzkaller076574/3044
CPU: 1 PID: 3044 Comm: syzkaller076574 Not tainted 4.14.0-mm1+ #25
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x25b/0x340 mm/kasan/report.c:409
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
check_memory_region+0x137/0x190 mm/kasan/kasan.c:267
memcpy+0x37/0x50 mm/kasan/kasan.c:303
memcpy include/linux/string.h:341 [inline]
sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161
crypto_shash_update+0xcb/0x220 crypto/shash.c:109
shash_finup_unaligned+0x2a/0x60 crypto/shash.c:151
crypto_shash_finup+0xc4/0x120 crypto/shash.c:165
hmac_finup+0x182/0x330 crypto/hmac.c:152
crypto_shash_finup+0xc4/0x120 crypto/shash.c:165
shash_digest_unaligned+0x9e/0xd0 crypto/shash.c:172
crypto_shash_digest+0xc4/0x120 crypto/shash.c:186
hmac_setkey+0x36a/0x690 crypto/hmac.c:66
crypto_shash_setkey+0xad/0x190 crypto/shash.c:64
shash_async_setkey+0x47/0x60 crypto/shash.c:207
crypto_ahash_setkey+0xaf/0x180 crypto/ahash.c:200
hash_setkey+0x40/0x90 crypto/algif_hash.c:446
alg_setkey crypto/af_alg.c:221 [inline]
alg_setsockopt+0x2a1/0x350 crypto/af_alg.c:254
SYSC_setsockopt net/socket.c:1851 [inline]
SyS_setsockopt+0x189/0x360 net/socket.c:1830
entry_SYSCALL_64_fastpath+0x1f/0x96
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5553b142be upstream.
VTTBR_BADDR_MASK is used to sanity check the size and alignment of the
VTTBR address. It seems to currently be off by one, thereby only
allowing up to 39-bit addresses (instead of 40-bit) and also
insufficiently checking the alignment. This patch fixes it.
This patch is the 32bit pendent of Kristina's arm64 fix, and
she deserves the actual kudos for pinpointing that one.
Fixes: f7ed45be3b ("KVM: ARM: World-switch implementation")
Cc: <stable@vger.kernel.org> # 3.9
Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzkaller found a race condition fanout_demux_rollover() while removing
a packet socket from a fanout group.
po->rollover is read and operated on during packet_rcv_fanout(), via
fanout_demux_rollover(), but the pointer is currently cleared before the
synchronization in packet_release(). It is safer to delay the cleanup
until after synchronize_net() has been called, ensuring all calls to
packet_rcv_fanout() for this socket have finished.
To further simplify synchronization around the rollover structure, set
po->rollover in fanout_add() only if there are no errors. This removes
the need for rcu in the struct and in the call to
packet_getsockopt(..., PACKET_ROLLOVER_STATS, ...).
Crashing stack trace:
fanout_demux_rollover+0xb6/0x4d0 net/packet/af_packet.c:1392
packet_rcv_fanout+0x649/0x7c8 net/packet/af_packet.c:1487
dev_queue_xmit_nit+0x835/0xc10 net/core/dev.c:1953
xmit_one net/core/dev.c:2975 [inline]
dev_hard_start_xmit+0x16b/0xac0 net/core/dev.c:2995
__dev_queue_xmit+0x17a4/0x2050 net/core/dev.c:3476
dev_queue_xmit+0x17/0x20 net/core/dev.c:3509
neigh_connected_output+0x489/0x720 net/core/neighbour.c:1379
neigh_output include/net/neighbour.h:482 [inline]
ip6_finish_output2+0xad1/0x22a0 net/ipv6/ip6_output.c:120
ip6_finish_output+0x2f9/0x920 net/ipv6/ip6_output.c:146
NF_HOOK_COND include/linux/netfilter.h:239 [inline]
ip6_output+0x1f4/0x850 net/ipv6/ip6_output.c:163
dst_output include/net/dst.h:459 [inline]
NF_HOOK.constprop.35+0xff/0x630 include/linux/netfilter.h:250
mld_sendpack+0x6a8/0xcc0 net/ipv6/mcast.c:1660
mld_send_initial_cr.part.24+0x103/0x150 net/ipv6/mcast.c:2072
mld_send_initial_cr net/ipv6/mcast.c:2056 [inline]
ipv6_mc_dad_complete+0x99/0x130 net/ipv6/mcast.c:2079
addrconf_dad_completed+0x595/0x970 net/ipv6/addrconf.c:4039
addrconf_dad_work+0xac9/0x1160 net/ipv6/addrconf.c:3971
process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x35e/0x430 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:432
Fixes: 0648ab70af ("packet: rollover prepare: per-socket state")
Fixes: 509c7a1ecc ("packet: avoid panic in packet_getsockopt()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Mike Maloney <maloney@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a7d5f107b4 ]
When the function tipc_accept_from_sock() fails to create an instance of
struct tipc_subscriber it omits to free the already created instance of
struct tipc_conn instance before it returns.
We fix that with this commit.
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b04e8f6bb upstream.
we need to take care of failure exit as well - pages already
in bio should be dropped by analogue of bio_unmap_pages(),
since their refcounts had been bumped only once per reference
in bio.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fbbd7f1a51 upstream.
The switch_to() macro has an optimization to avoid saving and
restoring register contents that aren't needed for kernel threads.
There is however the possibility that a kernel thread execve's a user
space program. In such a case the execve'd process can partially see
the contents of the previous process, which shouldn't be allowed.
To avoid this, simply always save and restore register contents on
context switch.
Cc: <stable@vger.kernel.org> # v2.6.37+
Fixes: fdb6d070ef ("switch_to: dont restore/save access & fpu regs for kernel threads")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f7f5551a7 upstream.
System may crash after unloading ipmi_si.ko module
because a timer may remain and fire after the module cleaned up resources.
cleanup_one_si() contains the following processing.
/*
* Make sure that interrupts, the timer and the thread are
* stopped and will not run again.
*/
if (to_clean->irq_cleanup)
to_clean->irq_cleanup(to_clean);
wait_for_timer_and_thread(to_clean);
/*
* Timeouts are stopped, now make sure the interrupts are off
* in the BMC. Note that timers and CPU interrupts are off,
* so no need for locks.
*/
while (to_clean->curr_msg || (to_clean->si_state != SI_NORMAL)) {
poll(to_clean);
schedule_timeout_uninterruptible(1);
}
si_state changes as following in the while loop calling poll(to_clean).
SI_GETTING_MESSAGES
=> SI_CHECKING_ENABLES
=> SI_SETTING_ENABLES
=> SI_GETTING_EVENTS
=> SI_NORMAL
As written in the code comments above,
timers are expected to stop before the polling loop and not to run again.
But the timer is set again in the following process
when si_state becomes SI_SETTING_ENABLES.
=> poll
=> smi_event_handler
=> handle_transaction_done
// smi_info->si_state == SI_SETTING_ENABLES
=> start_getting_events
=> start_new_msg
=> smi_mod_timer
=> mod_timer
As a result, before the timer set in start_new_msg() expires,
the polling loop may see si_state becoming SI_NORMAL
and the module clean-up finishes.
For example, hard LOCKUP and panic occurred as following.
smi_timeout was called after smi_event_handler,
kcs_event and hangs at port_inb()
trying to access I/O port after release.
[exception RIP: port_inb+19]
RIP: ffffffffc0473053 RSP: ffff88069fdc3d80 RFLAGS: 00000006
RAX: ffff8806800f8e00 RBX: ffff880682bd9400 RCX: 0000000000000000
RDX: 0000000000000ca3 RSI: 0000000000000ca3 RDI: ffff8806800f8e40
RBP: ffff88069fdc3d80 R8: ffffffff81d86dfc R9: ffffffff81e36426
R10: 00000000000509f0 R11: 0000000000100000 R12: 0000000000]:000000
R13: 0000000000000000 R14: 0000000000000246 R15: ffff8806800f8e00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
--- <NMI exception stack> ---
To fix the problem I defined a flag, timer_can_start,
as member of struct smi_info.
The flag is enabled immediately after initializing the timer
and disabled immediately before waiting for timer deletion.
Fixes: 0cfec916e8 ("ipmi: Start the timer and thread on internal msgs")
Signed-off-by: Yamazaki Masamitsu <m-yamazaki@ah.jp.nec.com>
[Adjusted for recent changes in the driver.]
[Some fairly major changes went into the IPMI driver in 4.15, so this
required a backport as the code had changed and moved to a different
file. The 4.14 version of this patch moved some code under an
if statement and there was an API change causing it to not apply to
4.4-4.6.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 173743dd99 ]
Prior to this patch we enabled audit in audit_init(), which is too
late for PID 1 as the standard initcalls are run after the PID 1 task
is forked. This means that we never allocate an audit_context (see
audit_alloc()) for PID 1 and therefore miss a lot of audit events
generated by PID 1.
This patch enables audit as early as possible to help ensure that when
PID 1 is forked it can allocate an audit_context if required.
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4b3526d83 ]
The handler for the CB.ProbeUuid operation in the cache manager is
implemented, but isn't listed in the switch-statement of operation
selection, so won't be used. Fix this by adding it.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 31fde034a8 ]
The UMR's QP is created by calling mlx5_ib_create_qp directly, and
therefore the send CQ and the recv CQ on the ibqp weren't assigned.
Assign them right after calling the mlx5_ib_create_qp to assure
that any access to those pointers will work as expected and won't
crash the system as might happen as part of reset flow.
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5f22a1d87c ]
Maximal message should be used as a limit to the max message payload allowed,
without the headers. The ConnectX-3 check is done against this value includes
the headers. When the payload is 4K this will cause the NIC to drop packets.
Increase maximal message to 8K as workaround, this shouldn't change current
behaviour because we continue to set the MTU to 4k.
To reproduce;
set MTU to 4296 on the corresponding interface, for example:
ifconfig eth0 mtu 4296 (both server and client)
On server:
ib_send_bw -c UD -d mlx4_0 -s 4096 -n 1000000 -i1 -m 4096
On client:
ib_send_bw -d mlx4_0 -c UD <server_ip> -s 4096 -n 1000000 -i 1 -m 4096
Fixes: 6e0d733d92 ("IB/mlx4: Allow 4K messages for UD QPs")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0e74aa1d79 ]
The syzbot found an ancient bug in the IPsec code. When we cloned
a socket policy (for example, for a child TCP socket derived from a
listening socket), we did not copy the family field. This results
in a live policy with a zero family field. This triggers a BUG_ON
check in the af_key code when the cloned policy is retrieved.
This patch fixes it by copying the family field over.
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92ee46efeb ]
Fengguang Wu reported that running the rcuperf test during boot can cause
the jump_label_test() to hit a WARN_ON(). The issue is that the core jump
label code relies on kernel_text_address() to detect when it can no longer
update branches that may be contained in __init sections. The
kernel_text_address() in turn assumes that if the system_state variable is
greter than or equal to SYSTEM_RUNNING then __init sections are no longer
valid (since the assumption is that they have been freed). However, when
rcuperf is setup to run in early boot it can call kernel_power_off() which
sets the system_state to SYSTEM_POWER_OFF.
Since rcuperf initialization is invoked via a module_init(), we can make
the dependency of jump_label_test() needing to complete before rcuperf
explicit by calling it via early_initcall().
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1510609727-2238-1-git-send-email-jbaron@akamai.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bde533f2ea ]
atm_dev_register() can fail here and passed parameters to free irq
which is not initialised. Initialization of 'dev->irq' happened after
the 'goto out_free_irq'. So using 'irq' insted of 'dev->irq' in
free_irq().
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cea0cc80a6 ]
Commit dfcb9f4f99 ("sctp: deny peeloff operation on asocs with threads
sleeping on it") fixed the race between peeloff and wait sndbuf by
checking waitqueue_active(&asoc->wait) in sctp_do_peeloff().
But it actually doesn't work, as even if waitqueue_active returns false
the waiting sndbuf thread may still not yet hold sk lock. After asoc is
peeled off, sk is not asoc->base.sk any more, then to hold the old sk
lock couldn't make assoc safe to access.
This patch is to fix this by changing to hold the new sk lock if sk is
not asoc->base.sk, meanwhile, also set the sk in sctp_sendmsg with the
new sk.
With this fix, there is no more race between peeloff and waitbuf, the
check 'waitqueue_active' in sctp_do_peeloff can be removed.
Thanks Marcelo and Neil for making this clear.
v1->v2:
fix it by changing to lock the new sock instead of adding a flag in asoc.
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ca3af4dd28 ]
Now in sctp_sendmsg sctp_wait_for_sndbuf could schedule out without
holding sock sk. It means the current asoc can be freed elsewhere,
like when receiving an abort packet.
If the asoc is just created in sctp_sendmsg and sctp_wait_for_sndbuf
returns err, the asoc will be freed again due to new_asoc is not nil.
An use-after-free issue would be triggered by this.
This patch is to fix it by setting new_asoc with nil if the asoc is
already dead when cpu schedules back, so that it will not be freed
again in sctp_sendmsg.
v1->v2:
set new_asoc as nil in sctp_sendmsg instead of sctp_wait_for_sndbuf.
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2a20aa1710 ]
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().
With deferred struct page feature enabled there is a case where we set
some fields prior to initializing:
mem_init() {
register_page_bootmem_info();
free_all_bootmem();
...
}
When register_page_bootmem_info() is called only non-deferred struct
pages are initialized. But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.
mem_init
register_page_bootmem_info
register_page_bootmem_info_node
get_page_bootmem
.. setting fields here ..
such as: page->freelist = (void *)type;
free_all_bootmem()
free_low_memory_core_early()
for_each_reserved_mem_region()
reserve_bootmem_region()
init_reserved_page() <- Only if this is deferred reserved page
__init_single_pfn()
__init_single_page()
memset(0) <-- Loose the set fields here
We end up with similar issue as in the previous patch, where currently
we do not observe problem as memory is zeroed. But, if flag asserts are
changed we can start hitting issues.
Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.
The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.
Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 34d9715ac1 ]
Once blk_set_queue_dying() is done in blk_cleanup_queue(), we call
blk_freeze_queue() and wait for q->q_usage_counter becoming zero. But
if there are tasks blocked in get_request(), q->q_usage_counter can
never become zero. So we have to wake up all these tasks in
blk_set_queue_dying() first.
Fixes: 3ef28e83ab ("block: generic request_queue reference counting")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>