commit 6c9f0ce0df upstream.
Commit 5e572cab92 ("tpm: Enable CLKRUN protocol for Braswell systems")
added logic in the TPM TIS driver to disable the Low Pin Count CLKRUN
signal during TPM transactions.
Unfortunately this breaks other devices that are attached to the LPC bus
like for example PS/2 mouse and keyboards.
One flaw with the logic is that it assumes that the CLKRUN is always
enabled, and so it unconditionally enables it after a TPM transaction.
But it could be that the CLKRUN# signal was already disabled in the LPC
bus and so after the driver probes, CLKRUN_EN will remain enabled which
may break other devices that are attached to the LPC bus but don't have
support for the CLKRUN protocol.
Fixes: 5e572cab92 ("tpm: Enable CLKRUN protocol for Braswell systems")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Tested-by: James Ettle <james@ettle.org.uk>
Tested-by: Jeffery Miller <jmiller@neverware.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68021bf473 upstream.
The CLKRUN fix caused a few harmless compile-time warnings:
drivers/char/tpm/tpm_tis.c: In function 'tpm_tis_pnp_remove':
drivers/char/tpm/tpm_tis.c:274:23: error: unused variable 'priv' [-Werror=unused-variable]
drivers/char/tpm/tpm_tis.c: In function 'tpm_tis_plat_remove':
drivers/char/tpm/tpm_tis.c:324:23: error: unused variable 'priv' [-Werror=unused-variable]
This removes the variables that have now become unused.
Fixes: 6d0866cbc2d3 ("tpm: Keep CLKRUN enabled throughout the duration of transmit_cmd()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3e958ce4c upstream.
Commit 5e572cab92 ("tpm: Enable CLKRUN protocol for Braswell
systems") disabled CLKRUN protocol during TPM transactions and re-enabled
once the transaction is completed. But there were still some corner cases
observed where, reading of TPM header failed for savestate command
while going to suspend, which resulted in suspend failure.
To fix this issue keep the CLKRUN protocol disabled for the entire
duration of a single TPM command and not disabling and re-enabling
again for every TPM transaction. For the other TPM accesses outside
TPM command flow, add a higher level of disabling and re-enabling
the CLKRUN protocol, instead of doing for every TPM transaction.
Fixes: 5e572cab92 ("tpm: Enable CLKRUN protocol for Braswell systems")
Signed-off-by: Azhar Shaikh <azhar.shaikh@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d98386d55 upstream.
For some reason, Florian forgot to apply to ip6_route_me_harder
the fix that went in commit 29e09229d9 ("netfilter: use
skb_to_full_sk in ip_route_me_harder")
Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c4585a2823 upstream.
ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.
Therefore it must check that the size of the match structure
provided from userspace is sane by making sure em->match_size
is at least the minimum size of the expected structure.
The module has such a check, but its only done after accessing
a structure that might be out of bounds.
tested with: ebtables -A INPUT ... \
--among-dst fe:fe:fe:fe:fe:fe
--among-dst fe:fe:fe:fe:fe:fe --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fb,fe:fe:fe:fe:fc:fd,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe
--among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fa,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe,fe:fe:fe:fe:fe:fe
Reported-by: <syzbot+fe0b19af568972814355@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b718121685 upstream.
We need to make sure the offsets are not out of range of the
total size.
Also check that they are in ascending order.
The WARN_ON triggered by syzkaller (it sets panic_on_warn) is
changed to also bail out, no point in continuing parsing.
Briefly tested with simple ruleset of
-A INPUT --limit 1/s' --log
plus jump to custom chains using 32bit ebtables binary.
Reported-by: <syzbot+845a53d13171abf8bf29@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3e456fce9 upstream.
There is a race condition between clusterip_config_entry_put()
and clusterip_config_init(), after we release the spinlock in
clusterip_config_entry_put(), a new proc file with a same IP could
be created immediately since it is already removed from the configs
list, therefore it triggers this warning:
------------[ cut here ]------------
proc_dir_entry 'ipt_CLUSTERIP/172.20.0.170' already registered
WARNING: CPU: 1 PID: 4152 at fs/proc/generic.c:330 proc_register+0x2a4/0x370 fs/proc/generic.c:329
Kernel panic - not syncing: panic_on_warn set ...
As a quick fix, just move the proc_remove() inside the spinlock.
Reported-by: <syzbot+03218bcdba6aa76441a3@syzkaller.appspotmail.com>
Fixes: 6c5d5cfbe3 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing")
Tested-by: Paolo Abeni <pabeni@redhat.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57ebd808a9 upstream.
The rationale for removing the check is only correct for rulesets
generated by ip(6)tables.
In iptables, a jump can only occur to a user-defined chain, i.e.
because we size the stack based on number of user-defined chains we
cannot exceed stack size.
However, the underlying binary format has no such restriction,
and the validation step only ensures that the jump target is a
valid rule start point.
IOW, its possible to build a rule blob that has no user-defined
chains but does contain a jump.
If this happens, no jump stack gets allocated and crash occurs
because no jumpstack was allocated.
Fixes: 7814b6ec6d ("netfilter: xtables: don't save/restore jumpstack offset")
Reported-by: syzbot+e783f671527912cd9403@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cedb6415f9 upstream.
Commit d1c4cb447a ("ASoC: Intel: Skylake: Fix jack name format
substitution") added Jack name but erroneously added a space as well,
so remove the space in Jack name.
Fixes: d1c4cb447a ("ASoC: Intel: Skylake: Fix jack name format substitution")
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1c4cb447a upstream.
Jack name is not getting formatted correctly hence resulting
in invalid name for HDMI/DP input devices.
This was recently exposed due changes brought by MST:
commit 3a13347f05 ("ASoC: Intel: kbl: Add jack port initialize
in kbl machine drivers")
Signed-off-by: Chintan Patel <chintan.m.patel@intel.com>
Acked-By: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 863204cfda upstream.
In configurations without CONFIG_OMAP3 but with secure RAM support,
we now run into a link failure:
arch/arm/mach-omap2/omap-secure.o: In function `omap3_save_secure_ram':
omap-secure.c:(.text+0x130): undefined reference to `save_secure_ram_context'
The omap3_save_secure_ram() function is only called from the OMAP34xx
power management code, so we can simply hide that function in the
appropriate #ifdef.
Fixes: d09220a887 ("ARM: OMAP2+: Fix SRAM virt to phys translation for save_secure_ram_context")
Acked-by: Tony Lindgren <tony@atomide.com>
Tested-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b3d89b402 upstream.
Gen8 and prior Proliant systems supported the "CRU" interface
to firmware. This interfaces allows linux to "call back" into firmware
to source the cause of an NMI. This feature isn't fully utilized
as the actual source of the NMI isn't printed, the driver only
indicates that the source couldn't be determined when the call
fails.
With the advent of Gen9, iCRU replaces the CRU. The call back
feature is no longer available in firmware. To be compatible and
not attempt to call back into firmware on system not supporting CRU,
the SMBIOS table is consulted to determine if it is safe to
make the call back or not.
This results in about half of the driver code being devoted
to either making CRU calls or determing if it is safe to make
CRU calls. As noted, the driver isn't really using the results of
the CRU calls.
Furthermore, as a consequence of the Spectre security issue, the
BIOS/EFI calls are being wrapped into Spectre-disabling section.
Removing the call back in hpwdt_pretimeout assists in this effort.
As the CRU sourcing of the NMI isn't required for handling the
NMI and there are security concerns with making the call back, remove
the legacy (pre Gen9) NMI sourcing and the DMI code to determine if
the system had the CRU interface.
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aeebc6ba88 upstream.
The new hpwdt_my_nmi() function is used conditionally, which produces
a harmless warning in some configurations:
drivers/watchdog/hpwdt.c:478:12: error: 'hpwdt_my_nmi' defined but not used [-Werror=unused-function]
This moves it inside of the #ifdef that protects its caller, to silence
the warning.
Fixes: 621174a92851 ("watchdog: hpwdt: Check source of NMI")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c42cbe4172 upstream.
This corrects:
commit cce78da766 ("watchdog: hpwdt: Add check for UEFI bits")
The test on HPE SMBIOS extension type 219 record "Misc Features"
bits for UEFI support is incorrect. The definition of the Misc Features
bits in the HPE SMBIOS OEM Extensions specification (and related
firmware) was changed to use a different pair of bits to
represent UEFI supported. Howerver, a corresponding change
to Linux was missed.
Current code/platform work because the iCRU test is working.
But purpose of cce78da766 is to ensure correct functionality
on future systems where iCRU isn't supported.
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba634eceb5 upstream.
The first "_all" occurrence around line 120 is only visible when
KBUILD_SRC is unset.
If O=... is specified, the working directory is relocated, then the
only second occurrence around line 193 is visible, that is not set
to PHONY.
Move the first one to an always visible place. This clarifies "_all"
is our default target and it is always set to PHONY.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe2a3027e7 upstream.
Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
bit when enabling async_PF, but this bit is reserved on old hypervisors,
which results in a failure upon migration.
To avoid breaking different cases, we are checking for CPUID feature bit
before enabling the feature and nothing else.
Fixes: 52a5c155cf ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
Cc: <stable@vger.kernel.org>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[jwang: port to 4.14]
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit d269176e76 ]
While working on 16338a9b3a ("bpf, arm64: fix out of bounds access in
tail call") I noticed that ppc64 JIT is partially affected as well. While
the bound checking is correctly performed as unsigned comparison, the
register with the index value however, is never truncated into 32 bit
space, so e.g. a index value of 0x100000000ULL with a map of 1 element
would pass with PPC_CMPLW() whereas we later on continue with the full
64 bit register value. Therefore, as we do in interpreter and other JITs
truncate the value to 32 bit initially in order to fix access.
Fixes: ce0761419f ("powerpc/bpf: Implement support for tail calls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit ca36960211 ]
The requirements around atomic_add() / atomic64_add() resp. their
JIT implementations differ across architectures. E.g. while x86_64
seems just fine with BPF's xadd on unaligned memory, on arm64 it
triggers via interpreter but also JIT the following crash:
[ 830.864985] Unable to handle kernel paging request at virtual address ffff8097d7ed6703
[...]
[ 830.916161] Internal error: Oops: 96000021 [#1] SMP
[ 830.984755] CPU: 37 PID: 2788 Comm: test_verifier Not tainted 4.16.0-rc2+ #8
[ 830.991790] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.29 07/17/2017
[ 830.998998] pstate: 80400005 (Nzcv daif +PAN -UAO)
[ 831.003793] pc : __ll_sc_atomic_add+0x4/0x18
[ 831.008055] lr : ___bpf_prog_run+0x1198/0x1588
[ 831.012485] sp : ffff00001ccabc20
[ 831.015786] x29: ffff00001ccabc20 x28: ffff8017d56a0f00
[ 831.021087] x27: 0000000000000001 x26: 0000000000000000
[ 831.026387] x25: 000000c168d9db98 x24: 0000000000000000
[ 831.031686] x23: ffff000008203878 x22: ffff000009488000
[ 831.036986] x21: ffff000008b14e28 x20: ffff00001ccabcb0
[ 831.042286] x19: ffff0000097b5080 x18: 0000000000000a03
[ 831.047585] x17: 0000000000000000 x16: 0000000000000000
[ 831.052885] x15: 0000ffffaeca8000 x14: 0000000000000000
[ 831.058184] x13: 0000000000000000 x12: 0000000000000000
[ 831.063484] x11: 0000000000000001 x10: 0000000000000000
[ 831.068783] x9 : 0000000000000000 x8 : 0000000000000000
[ 831.074083] x7 : 0000000000000000 x6 : 000580d428000000
[ 831.079383] x5 : 0000000000000018 x4 : 0000000000000000
[ 831.084682] x3 : ffff00001ccabcb0 x2 : 0000000000000001
[ 831.089982] x1 : ffff8097d7ed6703 x0 : 0000000000000001
[ 831.095282] Process test_verifier (pid: 2788, stack limit = 0x0000000018370044)
[ 831.102577] Call trace:
[ 831.105012] __ll_sc_atomic_add+0x4/0x18
[ 831.108923] __bpf_prog_run32+0x4c/0x70
[ 831.112748] bpf_test_run+0x78/0xf8
[ 831.116224] bpf_prog_test_run_xdp+0xb4/0x120
[ 831.120567] SyS_bpf+0x77c/0x1110
[ 831.123873] el0_svc_naked+0x30/0x34
[ 831.127437] Code: 97fffe97 17ffffec 00000000 f9800031 (885f7c31)
Reason for this is because memory is required to be aligned. In
case of BPF, we always enforce alignment in terms of stack access,
but not when accessing map values or packet data when the underlying
arch (e.g. arm64) has CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set.
xadd on packet data that is local to us anyway is just wrong, so
forbid this case entirely. The only place where xadd makes sense in
fact are map values; xadd on stack is wrong as well, but it's been
around for much longer. Specifically enforce strict alignment in case
of xadd, so that we handle this case generically and avoid such crashes
in the first place.
Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit 32fff239de ]
syszbot managed to trigger RCU detected stalls in
bpf_array_free_percpu()
It takes time to allocate a huge percpu map, but even more time to free
it.
Since we run in process context, use cond_resched() to yield cpu if
needed.
Fixes: a10423b87a ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit 16338a9b3a ]
I recently noticed a crash on arm64 when feeding a bogus index
into BPF tail call helper. The crash would not occur when the
interpreter is used, but only in case of JIT. Output looks as
follows:
[ 347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
[...]
[ 347.043065] [fffb850e96492510] address between user and kernel address ranges
[ 347.050205] Internal error: Oops: 96000004 [#1] SMP
[...]
[ 347.190829] x13: 0000000000000000 x12: 0000000000000000
[ 347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
[ 347.201427] x9 : 0000000000000000 x8 : 0000000000000000
[ 347.206726] x7 : 0000000000000000 x6 : 001c991738000000
[ 347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
[ 347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
[ 347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
[ 347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
[ 347.235221] Call trace:
[ 347.237656] 0xffff000002f3a4fc
[ 347.240784] bpf_test_run+0x78/0xf8
[ 347.244260] bpf_prog_test_run_skb+0x148/0x230
[ 347.248694] SyS_bpf+0x77c/0x1110
[ 347.251999] el0_svc_naked+0x30/0x34
[ 347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
[...]
In this case the index used in BPF r3 is the same as in r1
at the time of the call, meaning we fed a pointer as index;
here, it had the value 0xffff808fd7cf0500 which sits in x2.
While I found tail calls to be working in general (also for
hitting the error cases), I noticed the following in the code
emission:
# bpftool p d j i 988
[...]
38: ldr w10, [x1,x10]
3c: cmp w2, w10
40: b.ge 0x000000000000007c <-- signed cmp
44: mov x10, #0x20 // #32
48: cmp x26, x10
4c: b.gt 0x000000000000007c
50: add x26, x26, #0x1
54: mov x10, #0x110 // #272
58: add x10, x1, x10
5c: lsl x11, x2, #3
60: ldr x11, [x10,x11] <-- faulting insn (f86b694b)
64: cbz x11, 0x000000000000007c
[...]
Meaning, the tests passed because commit ddb55992b0 ("arm64:
bpf: implement bpf_tail_call() helper") was using signed compares
instead of unsigned which as a result had the test wrongly passing.
Change this but also the tail call count test both into unsigned
and cap the index as u32. Latter we did as well in 90caccdd8c
("bpf: fix bpf_tail_call() x64 JIT") and is needed in addition here,
too. Tested on HiSilicon Hi1616.
Result after patch:
# bpftool p d j i 268
[...]
38: ldr w10, [x1,x10]
3c: add w2, w2, #0x0
40: cmp w2, w10
44: b.cs 0x0000000000000080
48: mov x10, #0x20 // #32
4c: cmp x26, x10
50: b.hi 0x0000000000000080
54: add x26, x26, #0x1
58: mov x10, #0x110 // #272
5c: add x10, x1, x10
60: lsl x11, x2, #3
64: ldr x11, [x10,x11]
68: cbz x11, 0x0000000000000080
[...]
Fixes: ddb55992b0 ("arm64: bpf: implement bpf_tail_call() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit a493a87f38 ]
Implement a retpoline [0] for the BPF tail call JIT'ing that converts
the indirect jump via jmp %rax that is used to make the long jump into
another JITed BPF image. Since this is subject to speculative execution,
we need to control the transient instruction sequence here as well
when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
The latter aligns also with what gcc / clang emits (e.g. [1]).
JIT dump after patch:
# bpftool p d x i 1
0: (18) r2 = map[id:1]
2: (b7) r3 = 0
3: (85) call bpf_tail_call#12
4: (b7) r0 = 2
5: (95) exit
With CONFIG_RETPOLINE:
# bpftool p d j i 1
[...]
33: cmp %edx,0x24(%rsi)
36: jbe 0x0000000000000072 |*
38: mov 0x24(%rbp),%eax
3e: cmp $0x20,%eax
41: ja 0x0000000000000072 |
43: add $0x1,%eax
46: mov %eax,0x24(%rbp)
4c: mov 0x90(%rsi,%rdx,8),%rax
54: test %rax,%rax
57: je 0x0000000000000072 |
59: mov 0x28(%rax),%rax
5d: add $0x25,%rax
61: callq 0x000000000000006d |+
66: pause |
68: lfence |
6b: jmp 0x0000000000000066 |
6d: mov %rax,(%rsp) |
71: retq |
72: mov $0x2,%eax
[...]
* relative fall-through jumps in error case
+ retpoline for indirect jump
Without CONFIG_RETPOLINE:
# bpftool p d j i 1
[...]
33: cmp %edx,0x24(%rsi)
36: jbe 0x0000000000000063 |*
38: mov 0x24(%rbp),%eax
3e: cmp $0x20,%eax
41: ja 0x0000000000000063 |
43: add $0x1,%eax
46: mov %eax,0x24(%rbp)
4c: mov 0x90(%rsi,%rdx,8),%rax
54: test %rax,%rax
57: je 0x0000000000000063 |
59: mov 0x28(%rax),%rax
5d: add $0x25,%rax
61: jmpq *%rax |-
63: mov $0x2,%eax
[...]
* relative fall-through jumps in error case
- plain indirect jump as before
[0] https://support.google.com/faqs/answer/7625886
[1] a31e654fa1
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit 6c5f61023c ]
Commit 9a3efb6b66 ("bpf: fix memory leak in lpm_trie map_free callback function")
fixed a memory leak and removed unnecessary locks in map_free callback function.
Unfortrunately, it introduced a lockdep warning. When lockdep checking is turned on,
running tools/testing/selftests/bpf/test_lpm_map will have:
[ 98.294321] =============================
[ 98.294807] WARNING: suspicious RCU usage
[ 98.295359] 4.16.0-rc2+ #193 Not tainted
[ 98.295907] -----------------------------
[ 98.296486] /home/yhs/work/bpf/kernel/bpf/lpm_trie.c:572 suspicious rcu_dereference_check() usage!
[ 98.297657]
[ 98.297657] other info that might help us debug this:
[ 98.297657]
[ 98.298663]
[ 98.298663] rcu_scheduler_active = 2, debug_locks = 1
[ 98.299536] 2 locks held by kworker/2:1/54:
[ 98.300152] #0: ((wq_completion)"events"){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
[ 98.301381] #1: ((work_completion)(&map->work)){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
Since actual trie tree removal happens only after no other
accesses to the tree are possible, replacing
rcu_dereference_protected(*slot, lockdep_is_held(&trie->lock))
with
rcu_dereference_protected(*slot, 1)
fixed the issue.
Fixes: 9a3efb6b66 ("bpf: fix memory leak in lpm_trie map_free callback function")
Reported-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit 9a3efb6b66 ]
There is a memory leak happening in lpm_trie map_free callback
function trie_free. The trie structure itself does not get freed.
Also, trie_free function did not do synchronize_rcu before freeing
various data structures. This is incorrect as some rcu_read_lock
region(s) for lookup, update, delete or get_next_key may not complete yet.
The fix is to add synchronize_rcu in the beginning of trie_free.
The useless spin_lock is removed from this function as well.
Fixes: b95a5c4db0 ("bpf: add a longest prefix match trie map implementation")
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit 9c2d63b843 ]
syzkaller recently triggered OOM during percpu map allocation;
while there is work in progress by Dennis Zhou to add __GFP_NORETRY
semantics for percpu allocator under pressure, there seems also a
missing bpf_map_precharge_memlock() check in array map allocation.
Given today the actual bpf_map_charge_memlock() happens after the
find_and_alloc_map() in syscall path, the bpf_map_precharge_memlock()
is there to bail out early before we go and do the map setup work
when we find that we hit the limits anyway. Therefore add this for
array map as well.
Fixes: 6c90598174 ("bpf: pre-allocate hash map elements")
Fixes: a10423b87a ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4b591c87f upstream.
The entire completions suppress mechanism is currently broken because the
HCA might retry a send operation (due to dropped ack) after the nvme
transaction has completed.
In order to handle this, we signal all send completions and introduce a
separate done handler for async events as they will be handled differently
(as they don't include in-capsule data by definition).
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39772f0a7b upstream.
The locking protocols in md assume that a device will
never be removed from an array during resync/recovery/reshape.
When that isn't happening, rcu or reconfig_mutex is needed
to protect an rdev pointer while taking a refcount. When
it is happening, that protection isn't needed.
Unfortunately there are cases were remove_and_add_spares() is
called when recovery might be happening: is state_store(),
slot_store() and hot_remove_disk().
In each case, this is just an optimization, to try to expedite
removal from the personality so the device can be removed from
the array. If resync etc is happening, we just have to wait
for md_check_recover to find a suitable time to call
remove_and_add_spares().
This optimization and not essential so it doesn't
matter if it fails.
So change remove_and_add_spares() to abort early if
resync/recovery/reshape is happening, unless it is called
from md_check_recovery() as part of a newly started recovery.
The parameter "this" is only NULL when called from
md_check_recovery() so when it is NULL, there is no need to abort.
As this can result in a NULL dereference, the fix is suitable
for -stable.
cc: yuyufen <yuyufen@huawei.com>
Cc: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Fixes: 8430e7e0af ("md: disconnect device from personality before trying to remove it.")
Cc: stable@ver.kernel.org (v4.8+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74402055a2 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at time.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: 687c276761 ("ARM: dts: Add minimal support for LogicPD Torpedo
DM3730 devkit")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84c7efd607 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at times.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: ab8dd3aed0 ("ARM: DTS: Add minimal Support for Logic PD DM3730
SOM-LV")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36904703ae upstream.
The i2c touchpad on Dell XPS 9570 and Precision M5530 doesn't work out
of box.
The touchpad relies on its _INI method to update its _HID value from
XXXX0000 to SYNA2393.
Also, the _STA relies on value of I2CN to report correct status.
Set acpi_gbl_parse_table_as_term_list so the value of I2CN can be
correctly set up, and _INI can get run. The ACPI table in this machine
is designed to get parsed this way.
Also, change the quirk table to a more generic name.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198515
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 103c763c72 upstream.
On x86, special KVM memslots such as the TSS region have anonymous
memory mappings created on behalf of userspace, and these mappings are
removed when the VM is destroyed.
It is however possible for removing these mappings via vm_munmap() to
fail. This can most easily happen if the thread receives SIGKILL while
it's waiting to acquire ->mmap_sem. This triggers the 'WARN_ON(r < 0)'
in __x86_set_memory_region(). syzkaller was able to hit this, using
'exit()' to send the SIGKILL. Note that while the vm_munmap() failure
results in the mapping not being removed immediately, it is not leaked
forever but rather will be freed when the process exits.
It's not really possible to handle this failure properly, so almost
every other caller of vm_munmap() doesn't check the return value. It's
a limitation of having the kernel manage these mappings rather than
userspace.
So just remove the WARN_ON() so that users can't spam the kernel log
with this warning.
Fixes: f0d648bdf0 ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee8bdfb656 upstream.
Even though it is unconventional, some PCIe host implementations omit the
root ports entirely, and simply consist of a host bridge (which is not
modeled as a device in the PCI hierarchy) and a link.
When the downstream device is an endpoint, our current code does not seem
to mind this unusual configuration. However, when PCIe switches are
involved, the ASPM code assumes that any downstream switch port has a
parent, and blindly dereferences the bus->parent->self field of the pci_dev
struct to chain the downstream link state to the link state of the root
port. Given that the root port is missing, the link is not modeled at all,
and nor is the link state, and attempting to access it results in a NULL
pointer dereference and a crash.
Avoid this by allowing the link state chain to terminate at the downstream
port if no root port exists.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b2e9904c1 upstream.
The initial reset of the local APIC is performed before the VMCS has been
created, but it tries to do a vmwrite:
vmwrite error: reg 810 value 4a00 (err 18944)
CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G W I 4.16.0-0.rc2.git0.1.fc28.x86_64 #1
Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
Call Trace:
vmx_set_rvi [kvm_intel]
vmx_hwapic_irr_update [kvm_intel]
kvm_lapic_reset [kvm]
kvm_create_lapic [kvm]
kvm_arch_vcpu_init [kvm]
kvm_vcpu_init [kvm]
vmx_create_vcpu [kvm_intel]
kvm_vm_ioctl [kvm]
Move it later, after the VMCS has been created.
Fixes: 4191db26b7 ("KVM: x86: Update APICv on APIC reset")
Cc: stable@vger.kernel.org
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95e057e258 upstream.
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b8ebbdb74 upstream.
The x86 MMU if full of code that returns 0 and 1 for retry/emulate. Use
the existing RET_MMIO_PF_RETRY/RET_MMIO_PF_EMULATE enum, renaming it to
drop the MMIO part.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Backlund <tmb@mageia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>