commit 79435ac78d upstream.
This used to setup the LP_COUNT register automatically, but now has been
removed.
There was an earlier fix 3c7c7a2fc8 which fixed instance in delay.h but
somehow missed this one as gcc change had not made its way into
production toolchains and was not pedantic as it is now !
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 563b5cbe33 upstream.
For PCI devices behind an aliasing PCIe-to-PCI/X bridge, the bridge
alias to DevFn 0.0 on the subordinate bus may match the original RID of
the device, resulting in the same SID being present in the device's
fwspec twice. This causes trouble later in arm_smmu_write_strtab_ent()
when we wind up visiting the STE a second time and find it already live.
Avoid the issue by giving arm_smmu_install_ste_for_dev() the cleverness
to skip over duplicates. It seems mildly counterintuitive compared to
preventing the duplicates from existing in the first place, but since
the DT and ACPI probe paths build their fwspecs differently, this is
actually the cleanest and most self-contained way to deal with it.
Fixes: 8f78515425 ("iommu/arm-smmu: Implement of_xlate() for SMMUv3")
Reported-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com>
Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com>
Tested-by: Jayachandran C. <jnair@caviumnetworks.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57d72e159b upstream.
Kasan reports a double free when finalise_stage_fn fails: the io_pgtable
ops are freed by arm_smmu_domain_finalise and then again by
arm_smmu_domain_free. Prevent this by leaving pgtbl_ops empty on failure.
Fixes: 48ec83bcbc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 426915796c upstream.
complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy
the thread group, today this is wrong in many ways.
If nothing else, fatal_signal_pending() should always imply that the
whole thread group (except ->group_exit_task if it is not NULL) is
killed, this check breaks the rule.
After the previous changes we can rely on sig_task_ignored();
sig_fatal(sig) && SIGNAL_UNKILLABLE can only be true if we actually want
to kill this task and sig == SIGKILL OR it is traced and debugger can
intercept the signal.
This should hopefully fix the problem reported by Dmitry. This
test-case
static int init(void *arg)
{
for (;;)
pause();
}
int main(void)
{
char stack[16 * 1024];
for (;;) {
int pid = clone(init, stack + sizeof(stack)/2,
CLONE_NEWPID | SIGCHLD, NULL);
assert(pid > 0);
assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
assert(waitpid(-1, NULL, WSTOPPED) == pid);
assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0);
assert(syscall(__NR_tkill, pid, SIGKILL) == 0);
assert(pid == wait(NULL));
}
}
triggers the WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING)) in
task_participate_group_stop(). do_signal_stop()->signal_group_exit()
checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending()
checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING.
And his should fix the minor security problem reported by Kyle,
SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the
task is the root of a pid namespace.
Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Kyle Huey <me@kylehuey.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kyle Huey <me@kylehuey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 628c1bcba2 upstream.
The comment in sig_ignored() says "Tracers may want to know about even
ignored signals" but SIGKILL can not be reported to debugger and it is
just wrong to return 0 in this case: SIGKILL should only kill the
SIGNAL_UNKILLABLE task if it comes from the parent ns.
Change sig_ignored() to ignore ->ptrace if sig == SIGKILL and rely on
sig_task_ignored().
SISGTOP coming from within the namespace is not really right too but at
least debugger can intercept it, and we can't drop it here because this
will break "gdb -p 1": ptrace_attach() won't work. Perhaps we will add
another ->ptrace check later, we will see.
Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Kyle Huey <me@kylehuey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d5905dc14 upstream.
After commit 890da9cf09 (Revert "x86: do not use cpufreq_quick_get()
for /proc/cpuinfo "cpu MHz"") the "cpu MHz" number in /proc/cpuinfo
on x86 can be either the nominal CPU frequency (which is constant)
or the frequency most recently requested by a scaling governor in
cpufreq, depending on the cpufreq configuration. That is somewhat
inconsistent and is different from what it was before 4.13, so in
order to restore the previous behavior, make it report the current
CPU frequency like the scaling_cur_freq sysfs file in cpufreq.
To that end, modify the /proc/cpuinfo implementation on x86 to use
aperfmperf_snapshot_khz() to snapshot the APERF and MPERF feedback
registers, if available, and use their values to compute the CPU
frequency to be reported as "cpu MHz".
However, do that carefully enough to avoid accumulating delays that
lead to unacceptable access times for /proc/cpuinfo on systems with
many CPUs. Run aperfmperf_snapshot_khz() once on all CPUs
asynchronously at the /proc/cpuinfo open time, add a single delay
upfront (if necessary) at that point and simply compute the current
frequency while running show_cpuinfo() for each individual CPU.
Also, to avoid slowing down /proc/cpuinfo accesses too much, reduce
the default delay between consecutive APERF and MPERF reads to 10 ms,
which should be sufficient to get large enough numbers for the
frequency computation in all cases.
Fixes: 890da9cf09 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b29c6ef7bb upstream.
Even though aperfmperf_snapshot_khz() caches the samples.khz value to
return if called again in a sufficiently short time, its caller,
arch_freq_get_on_cpu(), still uses smp_call_function_single() to run it
which may allow user space to trigger an IPI storm by reading from the
scaling_cur_freq cpufreq sysfs file in a tight loop.
To avoid that, move the decision on whether or not to return the cached
samples.khz value to arch_freq_get_on_cpu().
This change was part of commit 941f5f0f6e ("x86: CPU: Fix up "cpu MHz"
in /proc/cpuinfo"), but it was not the reason for the revert and it
remains applicable.
Fixes: 4815d3c56d (cpufreq: x86: Make scaling_cur_freq behave more as expected)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: WANG Chao <chao.wang@ucloud.cn>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9880150655 upstream.
Fix the default for fscache_maybe_release_page() for when the cookie isn't
valid or the page isn't cached. It mustn't return false as that indicates
the page cannot yet be freed.
The problem with the default is that if, say, there's no cache, but a
network filesystem's pages are using up almost all the available memory, a
system can OOM because the filesystem ->releasepage() op will not allow
them to be released as fscache_maybe_release_page() incorrectly prevents
it.
This can be tested by writing a sequence of 512MiB files to an AFS mount.
It does not affect NFS or CIFS because both of those wrap the call in a
check of PG_fscache and it shouldn't bother Ceph as that only has
PG_private set whilst writeback is in progress. This might be an issue for
9P, however.
Note that the pages aren't entirely stuck. Removing a file or unmounting
will clear things because that uses ->invalidatepage() instead.
Fixes: 201a15428b ("FS-Cache: Handle pages pending storage that get evicted under OOM conditions")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2bf801ecd upstream.
Include the OF-based modalias in the uevent sent when registering devices
on the sunxi RSB bus, so that user space has a chance to autoload the
kernel module for the device.
Fixes a regression caused by commit 3f241bfa60 ("arm64: allwinner: a64:
pine64: Use dcdc1 regulator for mmc0"). When the axp20x-rsb module for
the AXP803 PMIC is built as a module, it is not loaded and the system
ends up with an disfunctional MMC controller.
Fixes: d787dcdb9c ("bus: sunxi-rsb: Add driver for Allwinner Reduced Serial Bus")
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d042566d8c upstream.
Without the gf128mul library support, we can run into a link
error:
drivers/crypto/chelsio/chcr_algo.o: In function `chcr_update_tweak':
chcr_algo.c:(.text+0x7e0): undefined reference to `gf128mul_x8_ble'
This adds a Kconfig select statement for it, next to the ones we
already have.
Fixes: b8fd1f4170 ("crypto: chcr - Add ctr mode and process large sg entries for cipher")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d76c68109f upstream.
pcrypt is using the old way of freeing instances, where the ->free()
method specified in the 'struct crypto_template' is passed a pointer to
the 'struct crypto_instance'. But the crypto_instance is being
kfree()'d directly, which is incorrect because the memory was actually
allocated as an aead_instance, which contains the crypto_instance at a
nonzero offset. Thus, the wrong pointer was being kfree()'d.
Fix it by switching to the new way to free aead_instance's where the
->free() method is specified in the aead_instance itself.
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 0496f56065 ("crypto: pcrypt - Add support for new AEAD interface")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 203f45003a upstream.
queue_cache_init is first called for the Control Word Queue
(n2_crypto_probe). At that time, queue_cache[0] is NULL and a new
kmem_cache will be allocated. If the subsequent n2_register_algs call
fails, the kmem_cache will be released in queue_cache_destroy, but
queue_cache_init[0] is not set back to NULL.
So when the Module Arithmetic Unit gets probed next (n2_mau_probe),
queue_cache_init will not allocate a kmem_cache again, but leave it
as its bogus value, causing a BUG() to trigger when queue_cache[0] is
eventually passed to kmem_cache_zalloc:
n2_crypto: Found N2CP at /virtual-devices@100/n2cp@7
n2_crypto: Registered NCS HVAPI version 2.0
called queue_cache_init
n2_crypto: md5 alg registration failed
n2cp f028687c: /virtual-devices@100/n2cp@7: Unable to register algorithms.
called queue_cache_destroy
n2cp: probe of f028687c failed with error -22
n2_crypto: Found NCP at /virtual-devices@100/ncp@6
n2_crypto: Registered NCS HVAPI version 2.0
called queue_cache_init
kernel BUG at mm/slab.c:2993!
Call Trace:
[0000000000604488] kmem_cache_alloc+0x1a8/0x1e0
(inlined) kmem_cache_zalloc
(inlined) new_queue
(inlined) spu_queue_setup
(inlined) handle_exec_unit
[0000000010c61eb4] spu_mdesc_scan+0x1f4/0x460 [n2_crypto]
[0000000010c62b80] n2_mau_probe+0x100/0x220 [n2_crypto]
[000000000084b174] platform_drv_probe+0x34/0xc0
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f24c4d4780 upstream.
Commit:
82c3768b8d ("efi/capsule-loader: Use a cached copy of the capsule header")
... refactored the capsule loading code that maps the capsule header,
to avoid having to map it several times.
However, as it turns out, the vmap() call we ended up removing did not
just map the header, but the entire capsule image, and dropping this
virtual mapping breaks capsules that are processed by the firmware
immediately (i.e., without a reboot).
Unfortunately, that change was part of a larger refactor that allowed
a quirk to be implemented for Quark, which has a non-standard memory
layout for capsules, and we have slightly painted ourselves into a
corner by allowing quirk code to mangle the capsule header and memory
layout.
So we need to fix this without breaking Quark. Fortunately, Quark does
not appear to care about the virtual mapping, and so we can simply
do a partial revert of commit:
2a457fb31d ("efi/capsule-loader: Use page addresses rather than struct page pointers")
... and create a vmap() mapping of the entire capsule (including header)
based on the reinstated struct page array, unless running on Quark, in
which case we pass the capsule header copy as before.
Reported-by: Ge Song <ge.song@hxt-semitech.com>
Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Tested-by: Ge Song <ge.song@hxt-semitech.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 82c3768b8d ("efi/capsule-loader: Use a cached copy of the capsule header")
Link: http://lkml.kernel.org/r/20180102172110.17018-3-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ec35e48b28 upstream.
refcounts have a generic implementation and an asm optimized one. The
generic version has extra debugging to make sure that once a refcount
goes to zero, refcount_inc won't increase it.
The btrfs delayed inode code wasn't expecting this, and we're tripping
over the warnings when the generic refcounts are used. We ended up with
this race:
Process A Process B
btrfs_get_delayed_node()
spin_lock(root->inode_lock)
radix_tree_lookup()
__btrfs_release_delayed_node()
refcount_dec_and_test(&delayed_node->refs)
our refcount is now zero
refcount_add(2) <---
warning here, refcount
unchanged
spin_lock(root->inode_lock)
radix_tree_delete()
With the generic refcounts, we actually warn again when process B above
tries to release his refcount because refcount_add() turned into a
no-op.
We saw this in production on older kernels without the asm optimized
refcounts.
The fix used here is to use refcount_inc_not_zero() to detect when the
object is in the middle of being freed and return NULL. This is almost
always the right answer anyway, since we usually end up pitching the
delayed_node if it didn't have fresh data in it.
This also changes __btrfs_release_delayed_node() to remove the extra
check for zero refcounts before radix tree deletion.
btrfs_get_delayed_node() was the only path that was allowing refcounts
to go from zero to one.
Fixes: 6de5f18e7b ("btrfs: fix refcount_t usage when deleting btrfs_delayed_node")
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d9570158b upstream.
As Tsukada explains, the time_is_before_jiffies(acct->needcheck) check
is very wrong, we need time_is_after_jiffies() to make sys_acct() work.
Ignoring the overflows, the code should "goto out" if needcheck >
jiffies, while currently it checks "needcheck < jiffies" and thus in the
likely case check_free_space() does nothing until jiffies overflow.
In particular this means that sys_acct() is simply broken, acct_on()
sets acct->needcheck = jiffies and expects that check_free_space()
should set acct->active = 1 after the free-space check, but this won't
happen if jiffies increments in between.
This was broken by commit 32dc730860 ("get rid of timer in
kern/acct.c") in 2011, then another (correct) commit 795a2f22a8
("acct() should honour the limits from the very beginning") made the
problem more visible.
Link: http://lkml.kernel.org/r/20171213133940.GA6554@redhat.com
Fixes: 32dc730860 ("get rid of timer in kern/acct.c")
Reported-by: TSUKADA Koutaro <tsukada@ascade.co.jp>
Suggested-by: TSUKADA Koutaro <tsukada@ascade.co.jp>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e5476815f upstream.
The recent changes for PTI touch cpu_tlbstate from various tlb_flush
inlines. cpu_tlbstate is exported as GPL symbol, so this causes a
regression when building out of tree drivers for certain graphics cards.
Aside of that the export was wrong since it was introduced as it should
have been EXPORT_PER_CPU_SYMBOL_GPL().
Use the correct PER_CPU export and drop the _GPL to restore the previous
state which allows users to utilize the cards they payed for.
As always I'm really thrilled to make this kind of change to support the
#friends (or however the hot hashtag of today is spelled) from that closet
sauce graphics corp.
Fixes: 1e02ce4ccc ("x86: Store a per-cpu shadow copy of CR4")
Fixes: 6fd166aae7 ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42f3bdc5dd upstream.
Thomas reported the following warning:
BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
caller is native_flush_tlb_single+0x57/0xc0
native_flush_tlb_single+0x57/0xc0
__set_pte_vaddr+0x2d/0x40
set_pte_vaddr+0x2f/0x40
cea_set_pte+0x30/0x40
ds_update_cea.constprop.4+0x4d/0x70
reserve_ds_buffers+0x159/0x410
x86_reserve_hardware+0x150/0x160
x86_pmu_event_init+0x3e/0x1f0
perf_try_init_event+0x69/0x80
perf_event_alloc+0x652/0x740
SyS_perf_event_open+0x3f6/0xd60
do_syscall_64+0x5c/0x190
set_pte_vaddr is used to map the ds buffers into the cpu entry area, but
there are two problems with that:
1) The resulting flush is not supposed to be called in preemptible context
2) The cpu entry area is supposed to be per CPU, but the debug store
buffers are mapped for all CPUs so these mappings need to be flushed
globally.
Add the necessary preemption protection across the mapping code and flush
TLBs globally.
Fixes: c1961a4631 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f5a40711fa upstream.
Since f06bdd4001 ("x86/mm: Adapt MODULES_END based on fixmap section size")
kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary.
So passing page unaligned address to kasan_populate_zero_shadow() have two
possible effects:
1) It may leave one page hole in supposed to be populated area. After commit
21506525fb ("x86/kasan/64: Teach KASAN about the cpu_entry_area") that
hole happens to be in the shadow covering fixmap area and leads to crash:
BUG: unable to handle kernel paging request at fffffbffffe8ee04
RIP: 0010:check_memory_region+0x5c/0x190
Call Trace:
<NMI>
memcpy+0x1f/0x50
ghes_copy_tofrom_phys+0xab/0x180
ghes_read_estatus+0xfb/0x280
ghes_notify_nmi+0x2b2/0x410
nmi_handle+0x115/0x2c0
default_do_nmi+0x57/0x110
do_nmi+0xf8/0x150
end_repeat_nmi+0x1a/0x1e
Note, the crash likely disappeared after commit 92a0f81d89, which
changed kasan_populate_zero_shadow() call the way it was before
commit 21506525fb.
2) Attempt to load module near MODULES_END will fail, because
__vmalloc_node_range() called from kasan_module_alloc() will hit the
WARN_ON(!pte_none(*pte)) in the vmap_pte_range() and bail out with error.
To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned
which means that MODULES_END should be 8*PAGE_SIZE aligned.
The whole point of commit f06bdd4001 was to move MODULES_END down if
NR_CPUS is big, so the cpu_entry_area takes a lot of space.
But since 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
the cpu_entry_area is no longer in fixmap, so we could just set
MODULES_END to a fixed 8*PAGE_SIZE aligned address.
Fixes: f06bdd4001 ("x86/mm: Adapt MODULES_END based on fixmap section size")
Reported-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13bb1d78f2 upstream.
This is a little more efficient and avoids the warning
WARNING: possible circular locking dependency detected
4.14.0-rc7-00010 #16 Not tainted
------------------------------------------------------
kworker/2:1/70 is trying to acquire lock:
(prepare_lock){+.+.}, at: [<c049300c>] clk_prepare_lock+0x80/0xf4
but task is already holding lock:
(i2c_register_adapter){+.+.}, at: [<c0690b04>]
i2c_adapter_lock_bus+0x14/0x18
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (i2c_register_adapter){+.+.}:
rt_mutex_lock+0x44/0x5c
i2c_adapter_lock_bus+0x14/0x18
i2c_transfer+0xa8/0xbc
i2c_smbus_xfer+0x20c/0x5d8
i2c_smbus_read_byte_data+0x38/0x48
m41t80_sqw_is_prepared+0x18/0x28
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cb90ed3de upstream.
This is a little more efficient, and avoids the warning
WARNING: possible circular locking dependency detected
4.14.0-rc7-00007 #14 Not tainted
------------------------------------------------------
alsactl/330 is trying to acquire lock:
(prepare_lock){+.+.}, at: [<c049300c>] clk_prepare_lock+0x80/0xf4
but task is already holding lock:
(i2c_register_adapter){+.+.}, at: [<c0690ae0>]
i2c_adapter_lock_bus+0x14/0x18
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (i2c_register_adapter){+.+.}:
rt_mutex_lock+0x44/0x5c
i2c_adapter_lock_bus+0x14/0x18
i2c_transfer+0xa8/0xbc
i2c_smbus_xfer+0x20c/0x5d8
i2c_smbus_read_byte_data+0x38/0x48
m41t80_sqw_recalc_rate+0x24/0x58
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9cdbe72c4 upstream.
The show_regs_safe() logic is wrong. When there's an iret stack frame,
it prints the entire pt_regs -- most of which is random stack data --
instead of just the five registers at the end.
show_regs_safe() is also poorly named: the on_stack() checks aren't for
safety. Rename the function to show_regs_if_on_stack() and add a
comment to explain why the checks are needed.
These issues were introduced with the "partial register dump" feature of
the following commit:
b02fcf9ba1 ("x86/unwinder: Handle stack overflows more gracefully")
That patch had gone through a few iterations of development, and the
above issues were artifacts from a previous iteration of the patch where
'regs' pointed directly to the iret frame rather than to the (partially
empty) pt_regs.
Tested-by: Alexander Tsoy <alexander@tsoy.me>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Fixes: b02fcf9ba1 ("x86/unwinder: Handle stack overflows more gracefully")
Link: http://lkml.kernel.org/r/5b05b8b344f59db2d3d50dbdeba92d60f2304c54.1514736742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52994c256d upstream.
Meelis reported that his K8 Athlon64 emits MCE warnings when PTI is
enabled:
[Hardware Error]: Error Addr: 0x0000ffff81e000e0
[Hardware Error]: MC1 Error: L1 TLB multimatch.
[Hardware Error]: cache level: L1, tx: INSN
The address is in the entry area, which is mapped into kernel _AND_ user
space. That's special because we switch CR3 while we are executing
there.
User mapping:
0xffffffff81e00000-0xffffffff82000000 2M ro PSE GLB x pmd
Kernel mapping:
0xffffffff81000000-0xffffffff82000000 16M ro PSE x pmd
So the K8 is complaining that the TLB entries differ. They differ in the
GLB bit.
Drop the GLB bit when installing the user shared mapping.
Fixes: 6dc72c3cbc ("x86/mm/pti: Share entry text PMD")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Meelis Roos <mroos@linux.ee>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031407180.1957@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 694d99d409 upstream.
AMD processors are not subject to the types of attacks that the kernel
page table isolation feature protects against. The AMD microarchitecture
does not allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode
when that access would result in a page fault.
Disable page table isolation by default on AMD processors by not setting
the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI
is set.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20171227054354.20369.94587.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc32b5c3e6 upstream.
If userspace attempted to set a "security.capability" xattr shorter than
4 bytes (e.g. 'setfattr -n security.capability -v x file'), then
cap_convert_nscap() read past the end of the buffer containing the xattr
value because it accessed the ->magic_etc field without verifying that
the xattr value is long enough to contain that field.
Fix it by validating the xattr value size first.
This bug was found using syzkaller with KASAN. The KASAN report was as
follows (cleaned up slightly):
BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498
Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852
CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0xe3/0x195 lib/dump_stack.c:53
print_address_description+0x73/0x260 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x235/0x350 mm/kasan/report.c:409
cap_convert_nscap+0x514/0x630 security/commoncap.c:498
setxattr+0x2bd/0x350 fs/xattr.c:446
path_setxattr+0x168/0x1b0 fs/xattr.c:472
SYSC_setxattr fs/xattr.c:487 [inline]
SyS_setxattr+0x36/0x50 fs/xattr.c:483
entry_SYSCALL_64_fastpath+0x18/0x85
Fixes: 8db6c34f1d ("Introduce v3 namespaced file capabilities")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7e51dcf3b upstream.
The tty_ldisc_receive_buf() helper returns the number of bytes
processed so drop the bogus "not" from the kernel doc comment.
Fixes: 8d082cd300 ("tty: Unify receive_buf() code paths")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 966031f340 upstream.
We added support for EXTPROC back in 2010 in commit 26df6d1340 ("tty:
Add EXTPROC support for LINEMODE") and the intent was to allow it to
override some (all?) ICANON behavior. Quoting from that original commit
message:
There is a new bit in the termios local flag word, EXTPROC.
When this bit is set, several aspects of the terminal driver
are disabled. Input line editing, character echo, and mapping
of signals are all disabled. This allows the telnetd to turn
off these functions when in linemode, but still keep track of
what state the user wants the terminal to be in.
but the problem turns out that "several aspects of the terminal driver
are disabled" is a bit ambiguous, and you can really confuse the n_tty
layer by setting EXTPROC and then causing some of the ICANON invariants
to no longer be maintained.
This fixes at least one such case (TIOCINQ) becoming unhappy because of
the confusion over whether ICANON really means ICANON when EXTPROC is set.
This basically makes TIOCINQ match the case of read: if EXTPROC is set,
we ignore ICANON. Also, make sure to reset the ICANON state ie EXTPROC
changes, not just if ICANON changes.
Fixes: 26df6d1340 ("tty: Add EXTPROC support for LINEMODE")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f414195b0 upstream.
Andy prefers to be paranoid about the pagetable free in the error path of
write_ldt(). Make it conditional and warn whenever the installment of a
secondary LDT fails.
Requested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>