[ Upstream commit e58c191241 ]
Slip_open doesn't clean-up device which registration failed from the
slip_devs device list. On next open after failure this list is iterated
and freed device is accessed. Fix this by calling sl_free_netdev in error
path.
Here is the trace from the Syzbot:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
__kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
kasan_report+0x12/0x20 mm/kasan/common.c:634
__asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
sl_sync drivers/net/slip/slip.c:725 [inline]
slip_open+0xecd/0x11b7 drivers/net/slip/slip.c:801
tty_ldisc_open.isra.0+0xa3/0x110 drivers/tty/tty_ldisc.c:469
tty_set_ldisc+0x30e/0x6b0 drivers/tty/tty_ldisc.c:596
tiocsetd drivers/tty/tty_io.c:2334 [inline]
tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2594
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:509 [inline]
do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:696
ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl fs/ioctl.c:718 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 3b5a39979d ("slip: Fix memory leak in slip_open error path")
Reported-by: syzbot+4d5170758f3762109542@syzkaller.appspotmail.com
Cc: David Miller <davem@davemloft.net>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4e81c0b3fa ]
When user-space sets the OVS_UFID_F_OMIT_* flags, and the relevant
flow has no UFID, we can exceed the computed size, as
ovs_nla_put_identifier() will always dump an OVS_FLOW_ATTR_KEY
attribute.
Take the above in account when computing the flow command message
size.
Fixes: 74ed7ab926 ("openvswitch: Add support for unique flow IDs.")
Reported-by: Qi Jun Ding <qding@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 14e54ab914 ]
When a classful qdisc's child qdisc has set the flag
TCQ_F_CPUSTATS (pfifo_fast for example), the child qdisc's
cpu_bstats should be passed to gnet_stats_copy_basic(),
but many classful qdisc didn't do that. As a result,
`tc -s class show dev DEV` always return 0 for bytes and
packets in this case.
Pass the child qdisc's cpu_bstats to gnet_stats_copy_basic()
to fix this issue.
The qstats also has this problem, but it has been fixed
in 5dd431b6b9 ("net: sched: introduce and use qstats read...")
and bstats still remains buggy.
Fixes: 22e0f8b932 ("net: sched: make bstats per cpu and estimator RCU safe")
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9bca3a0a92 ]
This function was using configuration of port 0 in devicetree for all ports.
In case CPU port was not 0, the delay settings was ignored. This resulted not
working communication between CPU and the switch.
Fixes: f5b8631c29 ("net: dsa: sja1105: Error out if RGMII delays are requested in DT")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 32085f25d7 ]
Geert Uytterhoeven reported that using devm_reset_controller_get leads
to a WARNING when probing a reset-controlled PHY. This is because the
device devm_reset_controller_get gets supplied is not actually the
one being probed.
Acquire an unmanaged reset-control as well as free the reset_control on
unregister to fix this.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David Bauer <mail@david-bauer.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1d7ea55668 ]
While enqueueing a broadcast skb to port->bc_queue, schedule_work()
is called to add port->bc_work, which processes the skbs in
bc_queue, to "events" work queue. If port->bc_queue is full, the
skb will be discarded and schedule_work(&port->bc_work) won't be
called. However, if port->bc_queue is full and port->bc_work is not
running or pending, port->bc_queue will keep full and schedule_work()
won't be called any more, and all broadcast skbs to macvlan will be
discarded. This case can happen:
macvlan_process_broadcast() is the pending function of port->bc_work,
it moves all the skbs in port->bc_queue to the queue "list", and
processes the skbs in "list". During this, new skbs will keep being
added to port->bc_queue in macvlan_broadcast_enqueue(), and
port->bc_queue may already full when macvlan_process_broadcast()
return. This may happen, especially when there are a lot of real-time
threads and the process is preempted.
Fix this by calling schedule_work(&port->bc_work) even if
port->bc_work is full in macvlan_broadcast_enqueue().
Fixes: 412ca1550c ("macvlan: Move broadcasts into a work queue")
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a95069ecb7 ]
In gve_alloc_queue_page_list(), when a page allocation fails,
qpl->num_entries will be wrong. In this case priv->num_registered_pages
can underflow in gve_free_queue_page_list(), causing subsequent calls
to gve_alloc_queue_page_list() to fail.
Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59c4bd853a upstream.
The state/owner of the FPU is saved to fpu_fpregs_owner_ctx by pointing
to the context that is currently loaded. It never changed during the
lifetime of a task - it remained stable/constant.
After deferred FPU registers loading until return to userland was
implemented, the content of fpu_fpregs_owner_ctx may change during
preemption and must not be cached.
This went unnoticed for some time and was now noticed, in particular
since gcc 9 is caching that load in copy_fpstate_to_sigframe() and
reusing it in the retry loop:
copy_fpstate_to_sigframe()
load fpu_fpregs_owner_ctx and save on stack
fpregs_lock()
copy_fpregs_to_sigframe() /* failed */
fpregs_unlock()
*** PREEMPTION, another uses FPU, changes fpu_fpregs_owner_ctx ***
fault_in_pages_writeable() /* succeed, retry */
fpregs_lock()
__fpregs_load_activate()
fpregs_state_valid() /* uses fpu_fpregs_owner_ctx from stack */
copy_fpregs_to_sigframe() /* succeeds, random FPU content */
This is a comparison of the assembly produced by gcc 9, without vs with this
patch:
| # arch/x86/kernel/fpu/signal.c:173: if (!access_ok(buf, size))
| cmpq %rdx, %rax # tmp183, _4
| jb .L190 #,
|-# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|-#APP
|-# 512 "arch/x86/include/asm/fpu/internal.h" 1
|- movq %gs:fpu_fpregs_owner_ctx,%rax #, pfo_ret__
|-# 0 "" 2
|-#NO_APP
|- movq %rax, -88(%rbp) # pfo_ret__, %sfp
…
|-# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|- movq -88(%rbp), %rcx # %sfp, pfo_ret__
|- cmpq %rcx, -64(%rbp) # pfo_ret__, %sfp
|+# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|+#APP
|+# 512 "arch/x86/include/asm/fpu/internal.h" 1
|+ movq %gs:fpu_fpregs_owner_ctx(%rip),%rax # fpu_fpregs_owner_ctx, pfo_ret__
|+# 0 "" 2
|+# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|+#NO_APP
|+ cmpq %rax, -64(%rbp) # pfo_ret__, %sfp
Use this_cpu_read() instead this_cpu_read_stable() to avoid caching of
fpu_fpregs_owner_ctx during preemption points.
The Fixes: tag points to the commit where deferred FPU loading was
added. Since this commit, the compiler is no longer allowed to move the
load of fpu_fpregs_owner_ctx somewhere else / outside of the locked
section. A task preemption will change its value and stale content will
be observed.
[ bp: Massage. ]
Debugged-by: Austin Clements <austin@google.com>
Debugged-by: David Chase <drchase@golang.org>
Debugged-by: Ian Lance Taylor <ian@airs.com>
Fixes: 5f409e20b7 ("x86/fpu: Defer FPU state load until return to userspace")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Austin Clements <austin@google.com>
Cc: Barret Rhoden <brho@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Chase <drchase@golang.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: ian@airs.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Bleecher Snyder <josharian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191128085306.hxfa2o3knqtu4wfn@linutronix.de
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205663
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a7ebfa85f upstream.
On zang's Dell XPS 13 9370 after Thunderbolt NVM firmware upgrade the
Thunderbolt controller did not come back as expected. Only after the
system was rebooted it became available again. It is not entirely clear
what happened but I suspect the new NVM firmware image authentication
failed for some reason. Regardless of this the router needs to be power
cycled if NVM authentication fails in order to get it fully functional
again.
This modifies the driver to issue a power cycle in case the NVM
authentication fails immediately when dma_port_flash_update_auth()
returns. We also need to call tb_switch_set_uuid() earlier to be able to
fetch possible NVM authentication failure when DMA port is added.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205457
Reported-by: zang <dump@tzib.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6689f0f4bb upstream.
Testing on different generations of Lantiq MIPS SoC based boards, showed
that it takes up to 1500 us until the core reset bit is cleared.
The driver from the vendor SDK (ifxhcd) uses a 1 second timeout. Use the
same timeout to fix wrong hang detections and make the driver work for
Lantiq MIPS SoCs.
At least till kernel 4.14 the hanging reset only caused a warning but
the driver was probed successful. With kernel 4.19 errors out with
EBUSY.
Cc: linux-stable <stable@vger.kernel.org> # 4.19+
Signed-off-by: Mathias Kresin <dev@kresin.me>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 492c88720d upstream.
platform_find_device_by_driver calls bus_find_device and passes
platform_match as the callback function. Casting the function to a
mismatching type trips indirect call Control-Flow Integrity (CFI) checking.
This change adds a callback function with the correct type and instead
of casting the function, explicitly casts the second parameter to struct
device_driver* as expected by platform_match.
Fixes: 36f3313d6b ("platform: Add platform_find_device_by_driver() helper")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191112214156.3430-1-samitolvanen@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8c5d882c8 upstream.
This patch corrects an error in the Transform Record Cache initialization
code that was causing intermittent stability problems on the Macchiatobin
board.
Unfortunately, due to HW platform specifics, the problem could not happen
on the main development platform, being the VCU118 Xilinx development
board. And since it was a problem with hash table access, it was very
dependent on the actual physical context record DMA buffers being used,
i.e. with some (bad) luck it could seemingly work quit stable for a while.
Signed-off-by: Pascal van Leeuwen <pvanleeuwen@verimatrix.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d69e07793f ]
Only io_uring uses (and added) these, and we want to disallow the
use of sendmsg/recvmsg for anything but regular data transfers.
Use the newly added prep helper to split the msghdr copy out from
the core function, to check for msg_control and msg_controllen
settings. If either is set, we return -EINVAL.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4257c8ca13 ]
This is in preparation for enabling the io_uring helpers for sendmsg
and recvmsg to first copy the header for validation before continuing
with the operation.
There should be no functional changes in this patch.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit af2e8c68b9 upstream.
On some systems that are vulnerable to Spectre v2, it is up to
software to flush the link stack (return address stack), in order to
protect against Spectre-RSB.
When exiting from a guest we do some house keeping and then
potentially exit to C code which is several stack frames deep in the
host kernel. We will then execute a series of returns without
preceeding calls, opening up the possiblity that the guest could have
poisoned the link stack, and direct speculative execution of the host
to a gadget of some sort.
To prevent this we add a flush of the link stack on exit from a guest.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39e72bf96f upstream.
In commit ee13cb249f ("powerpc/64s: Add support for software count
cache flush"), I added support for software to flush the count
cache (indirect branch cache) on context switch if firmware told us
that was the required mitigation for Spectre v2.
As part of that code we also added a software flush of the link
stack (return address stack), which protects against Spectre-RSB
between user processes.
That is all correct for CPUs that activate that mitigation, which is
currently Power9 Nimbus DD2.3.
What I got wrong is that on older CPUs, where firmware has disabled
the count cache, we also need to flush the link stack on context
switch.
To fix it we create a new feature bit which is not set by firmware,
which tells us we need to flush the link stack. We set that when
firmware tells us that either of the existing Spectre v2 mitigations
are enabled.
Then we adjust the patching code so that if we see that feature bit we
enable the link stack flush. If we're also told to flush the count
cache in software then we fall through and do that also.
On the older CPUs we don't need to do do the software count cache
flush, firmware has disabled it, so in that case we patch in an early
return after the link stack flush.
The naming of some of the functions is awkward after this patch,
because they're called "count cache" but they also do link stack. But
we'll fix that up in a later commit to ease backporting.
This is the fix for CVE-2019-18660.
Reported-by: Anthony Steinhauser <asteinhauser@google.com>
Fixes: ee13cb249f ("powerpc/64s: Add support for software count cache flush")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5618332e5b upstream.
The userspace comedilib function 'get_cmd_generic_timed' fills
the cmd structure with an informed guess and then calls the
function 'usbduxfast_ai_cmdtest' in this driver repeatedly while
'usbduxfast_ai_cmdtest' is modifying the cmd struct until it
no longer changes. However, because of rounding errors this never
converged because 'steps = (cmd->convert_arg * 30) / 1000' and then
back to 'cmd->convert_arg = (steps * 1000) / 30' won't be the same
because of rounding errors. 'Steps' should only be converted back to
the 'convert_arg' if 'steps' has actually been modified. In addition
the case of steps being 0 wasn't checked which is also now done.
Signed-off-by: Bernd Porr <mail@berndporr.me.uk>
Cc: <stable@vger.kernel.org> # 4.4+
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20191118230759.1727-1-mail@berndporr.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92fe35fb9c upstream.
The driver was setting the device remote-wakeup feature during probe in
violation of the USB specification (which says it should only be set
just prior to suspending the device). This could potentially waste
power during suspend as well as lead to spurious wakeups.
Note that USB core would clear the remote-wakeup feature at first
resume.
Fixes: 3f5429746d ("USB: Moschip 7840 USB-Serial Driver")
Cc: stable <stable@vger.kernel.org> # 2.6.19
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea422312a4 upstream.
The driver was setting the device remote-wakeup feature during probe in
violation of the USB specification (which says it should only be set
just prior to suspending the device). This could potentially waste
power during suspend as well as lead to spurious wakeups.
Note that USB core would clear the remote-wakeup feature at first
resume.
Fixes: 0f64478cbc ("USB: add USB serial mos7720 driver")
Cc: stable <stable@vger.kernel.org> # 2.6.19
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e696d00e65 upstream.
Add USB ID for MOXA UPort 2210. This device contains mos7820 but
it passes GPIO0 check implemented by driver and it's detected as
mos7840. Hence product id check is added to force mos7820 mode.
Signed-off-by: Pavel Löbl <pavel@loebl.cz>
Cc: stable <stable@vger.kernel.org>
[ johan: rename id defines and add vendor-id check ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a9125317b upstream.
Smatch reported that nents is not initialized and used in
stub_recv_cmd_submit(). nents is currently initialized by sgl_alloc()
and used to allocate multiple URBs when host controller doesn't
support scatter-gather DMA. The use of uninitialized nents means that
buf_len is zero and use_sg is true. But buffer length should not be
zero when an URB uses scatter-gather DMA.
To prevent this situation, add the conditional that checks buf_len
and use_sg. And move the use of nents right after the sgl_alloc() to
avoid the use of uninitialized nents.
If the error occurs, it adds SDEV_EVENT_ERROR_MALLOC and stub_priv
will be released by stub event handler and connection will be shut
down.
Fixes: ea44d19076 ("usbip: Implement SG support to vhci-hcd and stub driver")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191111141035.27788-1-suwan.kim027@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a858e79c9 upstream.
The old Nvidia chips have multiple HD-audio codecs on the same
HD-audio controller, and this doesn't work as expected with the current
audio component binding that is implemented under the one-codec-per-
controller assumption; at the probe time, the driver leads to several
kernel WARNING messages.
For the proper support, we may change the pin2port and port2pin to
traverse the codec list per the given pin number, but this needs more
development and testing.
As a quick workaround, instead, this patch drops the binding in the
audio side for these legacy chips since the audio component support in
nouveau graphics driver is still not merged (hence it's basically
unused).
[ Unlike the original commit, this patch actually disables the audio
component binding for all Nvidia chips, not only for legacy chips.
It doesn't matter much, though: nouveau gfx driver still doesn't
provide the audio component binding on 5.4.y, so it's only a
placeholder for now. Also, another difference from the original
commit is that this removes the nvhdmi_audio_ops and other
definitions completely in order to avoid a compile warning due to
unused stuff. -- tiwai ]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205625
Fixes: ade49db337 ("ALSA: hda/hdmi - Allow audio component for AMD/ATI and Nvidia HDMI")
Link: https://lore.kernel.org/r/20191122132000.4460-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e43148645d upstream.
Fix multiple cases of out of bounds (OOB) read associated with
MCE device receive/input data handling.
In reference for the OOB cases below, the incoming/read (byte) data
format when the MCE device responds to a command is:
{ cmd_prefix, subcmd, data0, data1, ... }
where cmd_prefix are:
MCE_CMD_PORT_SYS
MCE_CMD_PORT_IR
and subcmd examples are:
MCE_RSP_GETPORTSTATUS
MCE_RSP_EQIRNUMPORTS
...
Response size dynamically depends on cmd_prefix and subcmd.
So data0, data1, ... may or may not be present on input.
Multiple responses may return in a single receiver buffer.
The trigger condition for OOB read is typically random or
corrupt input data that fills the mceusb receiver buffer.
Case 1:
mceusb_handle_command() reads data0 (var hi) and data1 (var lo)
regardless of whether the response includes such data.
If { cmd_prefix, subcmd } is at the end of the receiver buffer,
read past end of buffer occurs.
This case was reported by
KASAN: slab-out-of-bounds Read in mceusb_dev_recv
https://syzkaller.appspot.com/bug?extid=c7fdb6cb36e65f2fe8c9
Fix: In mceusb_handle_command(), change variable hi and lo to
pointers, and dereference only when required.
Case 2:
If response with data is truncated at end of buffer after
{ cmd_prefix, subcmd }, mceusb_handle_command() reads past
end of buffer for data0, data1, ...
Fix: In mceusb_process_ir_data(), check response size with
remaining buffer size before invoking mceusb_handle_command().
+ if (i + ir->rem < buf_len)
mceusb_handle_command(ir, &ir->buf_in[i - 1]);
Case 3:
mceusb_handle_command() handles invalid/bad response such as
{ 0x??, MCE_RSP_GETPORTSTATUS } of length 2 as a response
{ MCE_CMD_PORT_SYS, MCE_RSP_GETPORTSTATUS, data0, ... }
of length 7. Read OOB occurs for non-existent data0, data1, ...
Cause is mceusb_handle_command() does not check cmd_prefix value.
Fix: mceusb_handle_command() must test both cmd_prefix and subcmd.
Case 4:
mceusb_process_ir_data() receiver parser state SUBCMD is
possible at start (i=0) of receiver buffer resulting in buffer
offset=-1 passed to mceusb_dev_printdata().
Bad offset results in OOB read before start of buffer.
[1214218.580308] mceusb 1-1.3:1.0: rx data[0]: 00 80 (length=2)
[1214218.580323] mceusb 1-1.3:1.0: Unknown command 0x00 0x80
...
[1214218.580406] mceusb 1-1.3:1.0: rx data[14]: 7f 7f (length=2)
[1214218.679311] mceusb 1-1.3:1.0: rx data[-1]: 80 90 (length=2)
[1214218.679325] mceusb 1-1.3:1.0: End of raw IR data
[1214218.679340] mceusb 1-1.3:1.0: rx data[1]: 7f 7f (length=2)
Fix: If parser_state is SUBCMD after processing receiver buffer,
reset parser_state to CMD_HEADER.
In effect, discard cmd_prefix at end of receiver buffer.
In mceusb_dev_printdata(), abort if buffer offset is out of bounds.
Case 5:
If response with data is truncated at end of buffer after
{ cmd_prefix, subcmd }, mceusb_dev_printdata() reads past
end of buffer for data0, data1, ...
while decoding the response to print out.
Fix: In mceusb_dev_printdata(), remove unneeded buffer offset
adjustments (var start and var skip) associated with MCE gen1 header.
Test for truncated MCE cmd response (compare offset+len with buf_len)
and skip decoding of incomplete response.
Move IR data tracing to execute before the truncation test.
Signed-off-by: A Sun <as1033x@comcast.net>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ef240eaff upstream.
Oleg provided the following test case:
int main(void)
{
struct sched_param sp = {};
sp.sched_priority = 2;
assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0);
int lock = vfork();
if (!lock) {
sp.sched_priority = 1;
assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0);
_exit(0);
}
syscall(__NR_futex, &lock, FUTEX_LOCK_PI, 0,0,0);
return 0;
}
This creates an unkillable RT process spinning in futex_lock_pi() on a UP
machine or if the process is affine to a single CPU. The reason is:
parent child
set FIFO prio 2
vfork() -> set FIFO prio 1
implies wait_for_child() sched_setscheduler(...)
exit()
do_exit()
....
mm_release()
tsk->futex_state = FUTEX_STATE_EXITING;
exit_futex(); (NOOP in this case)
complete() --> wakes parent
sys_futex()
loop infinite because
tsk->futex_state == FUTEX_STATE_EXITING
The same problem can happen just by regular preemption as well:
task holds futex
...
do_exit()
tsk->futex_state = FUTEX_STATE_EXITING;
--> preemption (unrelated wakeup of some other higher prio task, e.g. timer)
switch_to(other_task)
return to user
sys_futex()
loop infinite as above
Just for the fun of it the futex exit cleanup could trigger the wakeup
itself before the task sets its futex state to DEAD.
To cure this, the handling of the exiting owner is changed so:
- A refcount is held on the task
- The task pointer is stored in a caller visible location
- The caller drops all locks (hash bucket, mmap_sem) and blocks
on task::futex_exit_mutex. When the mutex is acquired then
the exiting task has completed the cleanup and the state
is consistent and can be reevaluated.
This is not a pretty solution, but there is no choice other than returning
an error code to user space, which would break the state consistency
guarantee and open another can of problems including regressions.
For stable backports the preparatory commits ac31c7ff86 .. ba31c1a485
are required as well, but for anything older than 5.3.y the backports are
going to be provided when this hits mainline as the other dependencies for
those kernels are definitely not stable material.
Fixes: 778e9a9c3e ("pi-futex: fix exit races and locking problems")
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stable Team <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20191106224557.041676471@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac31c7ff86 upstream.
attach_to_pi_owner() returns -EAGAIN for various cases:
- Owner task is exiting
- Futex value has changed
The caller drops the held locks (hash bucket, mmap_sem) and retries the
operation. In case of the owner task exiting this can result in a live
lock.
As a preparatory step for seperating those cases, provide a distinct return
value (EBUSY) for the owner exiting case.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.935606117@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af8cbda2cf upstream.
exec() attempts to handle potentially held futexes gracefully by running
the futex exit handling code like exit() does.
The current implementation has no protection against concurrent incoming
waiters. The reason is that the futex state cannot be set to
FUTEX_STATE_DEAD after the cleanup because the task struct is still active
and just about to execute the new binary.
While its arguably buggy when a task holds a futex over exec(), for
consistency sake the state handling can at least cover the actual futex
exit cleanup section. This provides state consistency protection accross
the cleanup. As the futex state of the task becomes FUTEX_STATE_OK after the
cleanup has been finished, this cannot prevent subsequent attempts to
attach to the task in case that the cleanup was not successfull in mopping
up all leftovers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.753355618@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>