commit e5dfa3f902 upstream.
The usbip userspace tools call sprintf()/snprintf() and don't check for
the return value which can lead the paths to overflow, truncating the
final file in the path.
More urgently, GCC 7 now warns that these aren't checked with
-Wformat-overflow, and with -Werror enabled in configure.ac, that makes
these tools unbuildable.
This patch fixes these problems by replacing sprintf() with snprintf() in
one place and adding checks for the return value of snprintf().
Reviewed-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Jonathan Dieter <jdieter@lesbg.com>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfd6ed4537 upstream.
GCC 7 now warns when switch statements fall through implicitly, and with
-Werror enabled in configure.ac, that makes these tools unbuildable.
We fix this by notifying the compiler that this particular case statement
is meant to fall through.
Reviewed-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Jonathan Dieter <jdieter@lesbg.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f2d0088eb upstream.
When a client has a USB device attached over IP, the vhci_hcd driver is
locally leaking a socket pointer address via the
/sys/devices/platform/vhci_hcd/status file (world-readable) and in debug
output when "usbip --debug port" is run.
Fix it to not leak. The socket pointer address is not used at the moment
and it was made visible as a convenient way to find IP address from socket
pointer address by looking up /proc/net/{tcp,tcp6}.
As this opens a security hole, the fix replaces socket pointer address with
sockfd.
Reported-by: Secunia Research <vuln@secunia.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0ec1ded22 upstream.
In orangefs_devreq_read, there is a loop which picks an op off the list
of pending ops. If the loop fails to find an op, there is nothing to
read, and it returns EAGAIN. If the op has been given up on, the loop
is restarted via a goto. The bug is that the variable which the found
op is written to is not reinitialized, so if there are no more eligible
ops on the list, the code runs again on the already handled op.
This is triggered by interrupting a process while the op is being copied
to the client-core. It's a fairly small window, but it's there.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a5191efe0 upstream.
Since commit aef9a7bd9b ("serial/uart/8250: Add tunable RX interrupt
trigger I/F of FIFO buffers"), the port's default FCR value isn't used
in serial8250_do_set_termios anymore, but copied over once in
serial8250_config_port and then modified as needed.
Unfortunately, serial8250_config_port will never be called if the port
is shared between kernel and userspace, and the port's flag doesn't have
UPF_BOOT_AUTOCONF, which would trigger a serial8250_config_port as well.
This causes garbled output from userspace:
[ 5.220000] random: procd urandom read with 49 bits of entropy available
ers
[kee
Fix this by forcing it to be configured on boot, resulting in the
expected output:
[ 5.250000] random: procd urandom read with 50 bits of entropy available
Press the [f] key and hit [enter] to enter failsafe mode
Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level
Fixes: aef9a7bd9b ("serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Nicolas Schichan <nschichan@freebox.fr>
Cc: linux-mips@linux-mips.org
Cc: linux-serial@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/17544/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit acfb3b883f upstream.
KVM doesn't follow the SMCCC when it comes to unimplemented calls,
and inject an UNDEF instead of returning an error. Since firmware
calls are now used for security mitigation, they are becoming more
common, and the undef is counter productive.
Instead, let's follow the SMCCC which states that -1 must be returned
to the caller when getting an unknown function number.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 490ae017f5 upstream.
For btree removal, there is a corner case that a single thread
could takes 6 locks which is more than THIN_MAX_CONCURRENT_LOCKS(5)
and leads to deadlock.
A btree removal might eventually call
rebalance_children()->rebalance3() to rebalance entries of three
neighbor child nodes when shadow_spine has already acquired two
write locks. In rebalance3(), it tries to shadow and acquire the
write locks of all three child nodes. However, shadowing a child
node requires acquiring a read lock of the original child node and
a write lock of the new block. Although the read lock will be
released after block shadowing, shadowing the third child node
in rebalance3() could still take the sixth lock.
(2 write locks for shadow_spine +
2 write locks for the first two child nodes's shadow +
1 write lock for the last child node's shadow +
1 read lock for the last child node)
Signed-off-by: Dennis Yang <dennisyang@qnap.com>
Acked-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc68d0a435 upstream.
When inserting a new key/value pair into a btree we walk down the spine of
btree nodes performing the following 2 operations:
i) space for a new entry
ii) adjusting the first key entry if the new key is lower than any in the node.
If the _root_ node is full, the function btree_split_beneath() allocates 2 new
nodes, and redistibutes the root nodes entries between them. The root node is
left with 2 entries corresponding to the 2 new nodes.
btree_split_beneath() then adjusts the spine to point to one of the two new
children. This means the first key is never adjusted if the new key was lower,
ie. operation (ii) gets missed out. This can result in the new key being
'lost' for a period; until another low valued key is inserted that will uncover
it.
This is a serious bug, and quite hard to make trigger in normal use. A
reproducing test case ("thin create devices-in-reverse-order") is
available as part of the thin-provision-tools project:
https://github.com/jthornber/thin-provisioning-tools/blob/master/functional-tests/device-mapper/dm-tests.scm#L593
Fix the issue by changing btree_split_beneath() so it no longer adjusts
the spine. Instead it unlocks both the new nodes, and lets the main
loop in btree_insert_raw() relock the appropriate one and make any
neccessary adjustments.
Reported-by: Monty Pavel <monty_pavel@sina.com>
Signed-off-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62635ea8c1 upstream.
show_workqueue_state() can print out a lot of messages while being in
atomic context, e.g. sysrq-t -> show_workqueue_state(). If the console
device is slow it may end up triggering NMI hard lockup watchdog.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db5ff90979 upstream.
LITEON EP1 has the same timeout issues as CX1 series devices.
Revert max_sectors to the value of 1024.
Fixes: e0edc8c546 ("libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices")
Signed-off-by: Xinyu Lin <xinyu0123@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 883d50f56d upstream.
Since kernel 4.9, the thread_info has been moved into task_struct, no
longer locates at the bottom of kernel stack.
See commits c65eacbe29 ("sched/core: Allow putting thread_info into
task_struct") and 15f4eae70d ("x86: Move thread_info into
task_struct").
Before fix:
(gdb) set $current = $lx_current()
(gdb) p $lx_thread_info($current)
$1 = {flags = 1470918301}
(gdb) p $current.thread_info
$2 = {flags = 2147483648}
After fix:
(gdb) p $lx_thread_info($current)
$1 = {flags = 2147483648}
(gdb) p $current.thread_info
$2 = {flags = 2147483648}
Link: http://lkml.kernel.org/r/20180118210159.17223-1-imxikangjie@gmail.com
Fixes: 15f4eae70d ("x86: Move thread_info into task_struct")
Signed-off-by: Xi Kangjie <imxikangjie@gmail.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Kieran Bingham <kbingham@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8a243af1a upstream.
In some rare conditions when running one PEAK USB-FD interface over
a non high-speed USB controller, one useless USB fragment might be sent.
This patch fixes the way a USB command is fragmented when its length is
greater than 64 bytes and when the underlying USB controller is not a
high-speed one.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56aeb07c91 upstream.
MPP7 is currently muxed as "gpio", but this function doesn't exist for
MPP7, only "gpo" is available. This causes the following error:
kirkwood-pinctrl f1010000.pin-controller: unsupported function gpio on pin mpp7
pinctrl core: failed to register map default (6): invalid type given
kirkwood-pinctrl f1010000.pin-controller: error claiming hogs: -22
kirkwood-pinctrl f1010000.pin-controller: could not claim hogs: -22
kirkwood-pinctrl f1010000.pin-controller: unable to register pinctrl driver
kirkwood-pinctrl: probe of f1010000.pin-controller failed with error -22
So the pinctrl driver is not probed, all device drivers (including the
UART driver) do a -EPROBE_DEFER, and therefore the system doesn't
really boot (well, it boots, but with no UART, and no devices that
require pin-muxing).
Back when the Device Tree file for this board was introduced, the
definition was already wrong. The pinctrl driver also always described
as "gpo" this function for MPP7. However, between Linux 4.10 and 4.11,
a hog pin failing to be muxed was turned from a simple warning to a
hard error that caused the entire pinctrl driver probe to bail
out. This is probably the result of commit 6118714275 ("pinctrl:
core: Fix pinctrl_register_and_init() with pinctrl_enable()").
This commit fixes the Device Tree to use the proper "gpo" function for
MPP7, which fixes the boot of OpenBlocks A7, which was broken since
Linux 4.11.
Fixes: f24b56cbcd ("ARM: kirkwood: add support for OpenBlocks A7 platform")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c13e7f313d upstream.
The DRM driver most notably, but also out of tree drivers (for now) like
the VPU or GPU drivers, are quite big consumers of large, contiguous memory
buffers. However, the sunxi_defconfig doesn't enable CMA in order to
mitigate that, which makes them almost unusable.
Enable it to make sure it somewhat works.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7563e2796 upstream.
Stefan Wahren reports a problem with a warning fix that was merged
for v4.15: we had lots of device nodes with a 'phys' property pointing
to a device node that is not compliant with the binding documented in
Documentation/devicetree/bindings/phy/phy-bindings.txt
This generally works because USB HCD drivers that support both the generic
phy subsystem and the older usb-phy subsystem ignore most errors from
phy_get() and related calls and then use the usb-phy driver instead.
However, it turns out that making the usb-nop-xceiv device compatible with
the generic-phy binding changes the phy_get() return code from -EINVAL to
-EPROBE_DEFER, and the dwc2 usb controller driver for bcm2835 now returns
-EPROBE_DEFER from its probe function rather than ignoring the failure,
breaking all USB support on raspberry-pi when CONFIG_GENERIC_PHY is
enabled. The same code is used in the dwc3 driver and the usb_add_hcd()
function, so a reasonable assumption would be that many other platforms
are affected as well.
I have reviewed all the related patches and concluded that "usb-nop-xceiv"
is the only USB phy that is affected by the change, and since it is by far
the most commonly referenced phy, all the other USB phy drivers appear
to be used in ways that are are either safe in DT (they don't use the
'phys' property), or in the driver (they already ignore -EPROBE_DEFER
from generic-phy when usb-phy is available).
To work around the problem, this adds a special case to _of_phy_get()
so we ignore any PHY node that is compatible with "usb-nop-xceiv",
as we know that this can never load no matter how much we defer. In the
future, we might implement a generic-phy driver for "usb-nop-xceiv"
and then remove this workaround.
Since we generally want older kernels to also want to work with the
fixed devicetree files, it would be good to backport the patch into
stable kernels as well (3.13+ are possibly affected), even though they
don't contain any of the patches that may have caused regressions.
Fixes: 014d6da6cb ARM: dts: bcm283x: Fix DTC warnings about missing phy-cells
Fixes: c5bbf358b7 arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv
Fixes: 44e5dced2e arm: dts: marvell: Add missing #phy-cells to usb-nop-xceiv
Fixes: f568f6f554 ARM: dts: omap: Add missing #phy-cells to usb-nop-xceiv
Fixes: d745d5f277 ARM: dts: imx51-zii-rdu1: Add missing #phy-cells to usb-nop-xceiv
Fixes: 915fbe59cb ARM: dts: imx: Add missing #phy-cells to usb-nop-xceiv
Link: https://marc.info/?l=linux-usb&m=151518314314753&w=2
Link: https://patchwork.kernel.org/patch/10158145/
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Eric Anholt <eric@anholt.net>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ebe1eaf2f upstream.
Since enums do not get converted by the TRACE_EVENT macro into their values,
the event format displaces the enum name and not the value. This breaks
tools like perf and trace-cmd that need to interpret the raw binary data. To
solve this, an enum map was created to convert these enums into their actual
numbers on boot up. This is done by TRACE_EVENTS() adding a
TRACE_DEFINE_ENUM() macro.
Some enums were not being converted. This was caused by an optization that
had a bug in it.
All calls get checked against this enum map to see if it should be converted
or not, and it compares the call's system to the system that the enum map
was created under. If they match, then they call is processed.
To cut down on the number of iterations needed to find the maps with a
matching system, since calls and maps are grouped by system, when a match is
made, the index into the map array is saved, so that the next call, if it
belongs to the same system as the previous call, could start right at that
array index and not have to scan all the previous arrays.
The problem was, the saved index was used as the variable to know if this is
a call in a new system or not. If the index was zero, it was assumed that
the call is in a new system and would keep incrementing the saved index
until it found a matching system. The issue arises when the first matching
system was at index zero. The next map, if it belonged to the same system,
would then think it was the first match and increment the index to one. If
the next call belong to the same system, it would begin its search of the
maps off by one, and miss the first enum that should be converted. This left
a single enum not converted properly.
Also add a comment to describe exactly what that index was for. It took me a
bit too long to figure out what I was thinking when debugging this issue.
Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com
Fixes: 0c564a538a ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Teste-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5b18920199 upstream.
A helper purported to look up a child node based on its name was using
the wrong of-helper and ended up prematurely freeing the parent of-node
while searching the whole device tree depth-first starting at the parent
node.
Fixes: 64b9e4d803 ("input: twl4030-vibra: Support for DT booted kernel")
Fixes: e661d0a044 ("Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dcaf12a8b0 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at parent rather than just matching on
its children.
Later sanity checks on node properties (which would likely be missing)
should prevent this from causing much trouble however, especially as the
original premature free of the parent node has already been fixed
separately (but that "fix" was apparently never backported to stable).
Fixes: e7ec014a47 ("Input: twl6040-vibra - update for device tree support")
Fixes: c52c545ead ("Input: twl6040-vibra - fix DT node memory management")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com> (on Pyra OMAP5 hardware)
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 906bf7daa0 upstream.
Fix child node-lookup during probe, which ended up searching the whole
device tree depth-first starting at parent rather than just matching on
its children.
To make things worse, the parent node was prematurely freed, while the
child node was leaked.
Fixes: 2e57d56747 ("mfd: 88pm860x: Device tree support")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d94e776bd upstream.
The fix for handling two-finger scroll (i4a646580f793 - "Input: ALPS -
fix two-finger scroll breakage in right side on ALPS touchpad")
introduced a minor "typo" that broke decoding of multi-touch events are
decoded on some ALPS touchpads. For example, tapping with three-fingers
can no longer be used to emulate middle-mouse-button (the kernel doesn't
recognize this as the proper event, and doesn't report it correctly to
userspace). This affects touchpads that use SS4 "plus" protocol
variant, like those found on Dell E7270 & E7470 laptops (tested on
E7270).
First, probably the code in alps_decode_ss4_v2() for case
SS4_PACKET_ID_MULTI used inconsistent indices to "f->mt[]". You can see
0 & 1 are used for the "if" part but 2 & 3 are used for the "else" part.
Second, in the previous patch, new macros were introduced to decode X
coordinates specific to the SS4 "plus" variant, but the macro to
define the maximum X value wasn't changed accordingly. The macros to
decode X values for "plus" variant are effectively shifted right by 1
bit, but the max wasn't shifted too. This causes the driver to
incorrectly handle "no data" cases, which also interfered with how
multi-touch was handled.
Fixes: 4a646580f7 ("Input: ALPS - fix two-finger scroll breakage...")
Signed-off-by: Nir Perry <nirperry@gmail.com>
Reviewed-by: Masaki Ota <masaki.ota@jp.alps.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45d55e7bac upstream.
Keith reported the following warning:
WARNING: CPU: 28 PID: 1420 at kernel/irq/matrix.c:222 irq_matrix_remove_managed+0x10f/0x120
x86_vector_free_irqs+0xa1/0x180
x86_vector_alloc_irqs+0x1e4/0x3a0
msi_domain_alloc+0x62/0x130
The reason for this is that if the vector allocation fails the error
handling code tries to free the failed vector as well, which causes the
above imbalance warning to trigger.
Adjust the error path to handle this correctly.
Fixes: b5dc8e6c21 ("x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors")
Reported-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Keith Busch <keith.busch@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161217300.1823@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3f14c4858 upstream.
round_pipe_size() contains a right-bit-shift expression which may
overflow, which would cause undefined results in a subsequent
roundup_pow_of_two() call.
static inline unsigned int round_pipe_size(unsigned int size)
{
unsigned long nr_pages;
nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
return roundup_pow_of_two(nr_pages) << PAGE_SHIFT;
}
PAGE_SIZE is defined as (1UL << PAGE_SHIFT), so:
- 4 bytes wide on 32-bit (0 to 0xffffffff)
- 8 bytes wide on 64-bit (0 to 0xffffffffffffffff)
That means that 32-bit round_pipe_size(), nr_pages may overflow to 0:
size=0x00000000 nr_pages=0x0
size=0x00000001 nr_pages=0x1
size=0xfffff000 nr_pages=0xfffff
size=0xfffff001 nr_pages=0x0 << !
size=0xffffffff nr_pages=0x0 << !
This is bad because roundup_pow_of_two(n) is undefined when n == 0!
64-bit is not a problem as the unsigned int size is 4 bytes wide
(similar to 32-bit) and the larger, 8 byte wide unsigned long, is
sufficient to handle the largest value of the bit shift expression:
size=0xffffffff nr_pages=100000
Modify round_pipe_size() to return 0 if n == 0 and updates its callers to
handle accordingly.
Link: http://lkml.kernel.org/r/1507658689-11669-3-git-send-email-joe.lawrence@redhat.com
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dong Jinguang <dongjinguang@huawei.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b511203093 upstream.
The INTEL_FAM6_SKYLAKE_X hardcoded crystal_khz value of 25MHZ is
problematic:
- SKX workstations (with same model # as server variants) use a 24 MHz
crystal. This results in a -4.0% time drift rate on SKX workstations.
- SKX servers subject the crystal to an EMI reduction circuit that reduces its
actual frequency by (approximately) -0.25%. This results in -1 second per
10 minute time drift as compared to network time.
This issue can also trigger a timer and power problem, on configurations
that use the LAPIC timer (versus the TSC deadline timer). Clock ticks
scheduled with the LAPIC timer arrive a few usec before the time they are
expected (according to the slow TSC). This causes Linux to poll-idle, when
it should be in an idle power saving state. The idle and clock code do not
graciously recover from this error, sometimes resulting in significant
polling and measurable power impact.
Stop using native_calibrate_tsc() for INTEL_FAM6_SKYLAKE_X.
native_calibrate_tsc() will return 0, boot will run with tsc_khz = cpu_khz,
and the TSC refined calibration will update tsc_khz to correct for the
difference.
[ tglx: Sanitized change log ]
Fixes: 6baf3d6182 ("x86/tsc: Add additional Intel CPU models to the crystal quirk list")
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: Prarit Bhargava <prarit@redhat.com>
Link: https://lkml.kernel.org/r/ff6dcea166e8ff8f2f6a03c17beab2cb436aa779.1513920414.git.len.brown@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit beacd6f7ed upstream.
SEGV_PKUERR is a signal specific si_code which happens to have the same
numeric value as several others: BUS_MCEERR_AR, ILL_ILLTRP, FPE_FLTOVF,
TRAP_HWBKPT, CLD_TRAPPED, POLL_ERR, SEGV_THREAD_ID, as such it is not safe
to just test the si_code the signal number must also be tested to prevent a
false positive in fill_sig_info_pkey.
This error was by inspection, and BUS_MCEERR_AR appears to be a real
candidate for confusion. So pass in si_signo and check for SIG_SEGV to
verify that it is actually a SEGV_PKUERR
Fixes: 019132ff3d ("x86/mm/pkeys: Fill in pkey field in siginfo")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/20180112203135.4669-2-ebiederm@xmission.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c995efd5a7 upstream.
On context switch from a shallow call stack to a deeper one, as the CPU
does 'ret' up the deeper side it may encounter RSB entries (predictions for
where the 'ret' goes to) which were populated in userspace.
This is problematic if neither SMEP nor KPTI (the latter of which marks
userspace pages as NX for the kernel) are active, as malicious code in
userspace may then be executed speculatively.
Overwrite the CPU's return prediction stack with calls which are predicted
to return to an infinite loop, to "capture" speculation if this
happens. This is required both for retpoline, and also in conjunction with
IBRS for !SMEP && !KPTI.
On Skylake+ the problem is slightly different, and an *underflow* of the
RSB may cause errant branch predictions to occur. So there it's not so much
overwrite, as *filling* the RSB to attempt to prevent it getting
empty. This is only a partial solution for Skylake+ since there are many
other conditions which may result in the RSB becoming empty. The full
solution on Skylake+ is to use IBRS, which will prevent the problem even
when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
required on context switch.
[ tglx: Added missing vendor check and slighty massaged comments and
changelog ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd52cb26e7 upstream.
In case we fail to establish the connection we must drain our pre-posted
login recieve work request before continuing safely with connection
teardown.
Fixes: a060b5629a ("IB/core: generic RDMA READ/WRITE API")
Reported-by: Amrani, Ram <Ram.Amrani@cavium.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e765b4972 upstream.
If a message sent to a PF_KEY socket ended with an incomplete extension
header (fewer than 4 bytes remaining), then parse_exthdrs() read past
the end of the message, into uninitialized memory. Fix it by returning
-EINVAL in this case.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
char buf[17] = { 0 };
struct sadb_msg *msg = (void *)buf;
msg->sadb_msg_version = PF_KEY_V2;
msg->sadb_msg_type = SADB_DELETE;
msg->sadb_msg_len = 2;
write(sock, buf, 17);
}
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06b335cb51 upstream.
If a message sent to a PF_KEY socket ended with one of the extensions
that takes a 'struct sadb_address' but there were not enough bytes
remaining in the message for the ->sa_family member of the 'struct
sockaddr' which is supposed to follow, then verify_address_len() read
past the end of the message, into uninitialized memory. Fix it by
returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
char buf[24] = { 0 };
struct sadb_msg *msg = (void *)buf;
struct sadb_address *addr = (void *)(msg + 1);
msg->sadb_msg_version = PF_KEY_V2;
msg->sadb_msg_type = SADB_DELETE;
msg->sadb_msg_len = 3;
addr->sadb_address_len = 1;
addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;
write(sock, buf, 24);
}
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>