commit aec51758ce upstream.
On a 32-bit platform, the value of n_blcoks_count may be wrong during
the file system is resized to size larger than 2^32 blocks. This may
caused the superblock being corrupted with zero blocks count.
Fixes: 1c6bd7173d
Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fcf5ea1099 upstream.
ext4_find_unwritten_pgoff() does not properly handle a situation when
starting index is in the middle of a page and blocksize < pagesize. The
following command shows the bug on filesystem with 1k blocksize:
xfs_io -f -c "falloc 0 4k" \
-c "pwrite 1k 1k" \
-c "pwrite 3k 1k" \
-c "seek -a -r 0" foo
In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048,
SEEK_DATA) will return the correct result.
Fix the problem by neglecting buffers in a page before starting offset.
Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df1e76f28f upstream.
The previous fix for filtering out of unwatched events was not entirely
correct. Instead of skipping the events we don't want, they are now
interpreted as events with opposing edge.
In order to fix it: always read the GPIO line value on interrupt and
only emit the event if it corresponds with the event type we requested.
Fixes: ad537b8225 ("gpiolib: fix filtering out unwanted events")
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit efe6f24160 upstream.
IRTE[GALogIntr] bit should set when enabling guest_mode, which enables
IOMMU to generate entry in GALog when IRTE[IsRun] is not set, and send
an interrupt to notify IOMMU driver.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Joerg Roedel <jroedel@suse.de>
Fixes: d98de49a53 ('iommu/amd: Enable vAPIC interrupt remapping mode by default')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3db40c312c upstream.
If the decrementer wraps again and de-asserts the decrementer
exception while hard-disabled, __check_irq_replay() has a test to
notice the wrap when interrupts are re-enabled.
The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS
flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked
because the decrementer interrupt was always the first one checked
after clearing the hard disable flag, but HMI check was moved ahead of
that, which introduced this bug.
This can cause a missed decrementer interrupt if we soft-disable
interrupts then take an HMI which is recorded in irq_happened, then
hard-disable interrupts for > 4s to wrap the decrementer.
Fixes: e0e0d6b739 ("powerpc/64: Replay hypervisor maintenance interrupt first")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd63f3cf1d upstream.
Currently flush_tmregs_to_thread() does not save the TM SPRs (TFHAR,
TFIAR, TEXASR) to the thread struct, unless the process is currently
inside a suspended transaction.
If the process is core dumping, and the TM SPRs have changed since the
last time the process was context switched, then we will save stale
values of the TM SPRs to the core dump.
Fix it by saving the live register state to the thread struct in that
case.
Fixes: 08e1c01d6a ("powerpc/ptrace: Enable support for TM SPR state")
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adb1fe9ae2 upstream.
Linus suggested we try to remove some of the low-hanging fruit related
to kernel address exposure in dmesg. The only leaks I see on my local
system are:
Freeing SMP alternatives memory: 32K (ffffffff9e309000 - ffffffff9e311000)
Freeing initrd memory: 10588K (ffffa0b736b42000 - ffffa0b737599000)
Freeing unused kernel memory: 3592K (ffffffff9df87000 - ffffffff9e309000)
Freeing unused kernel memory: 1352K (ffffa0b7288ae000 - ffffa0b728a00000)
Freeing unused kernel memory: 632K (ffffa0b728d62000 - ffffa0b728e00000)
Linus says:
"I suspect we should just remove [the addresses in the 'Freeing'
messages]. I'm sure they are useful in theory, but I suspect they
were more useful back when the whole "free init memory" was
originally done.
These days, if we have a use-after-free, I suspect the init-mem
situation is the easiest situation by far. Compared to all the dynamic
allocations which are much more likely to show it anyway. So having
debug output for that case is likely not all that productive."
With this patch the freeing messages now look like this:
Freeing SMP alternatives memory: 32K
Freeing initrd memory: 10588K
Freeing unused kernel memory: 3592K
Freeing unused kernel memory: 1352K
Freeing unused kernel memory: 632K
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/6836ff90c45b71d38e5d4405aec56fa9e5d1d4b2.1477405374.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 337c017ccd upstream.
WARNING: CPU: 5 PID: 1242 at kernel/rcu/tree_plugin.h:323 rcu_note_context_switch+0x207/0x6b0
CPU: 5 PID: 1242 Comm: unity-settings- Not tainted 4.13.0-rc2+ #1
RIP: 0010:rcu_note_context_switch+0x207/0x6b0
Call Trace:
__schedule+0xda/0xba0
? kvm_async_pf_task_wait+0x1b2/0x270
schedule+0x40/0x90
kvm_async_pf_task_wait+0x1cc/0x270
? prepare_to_swait+0x22/0x70
do_async_page_fault+0x77/0xb0
? do_async_page_fault+0x77/0xb0
async_page_fault+0x28/0x30
RIP: 0010:__d_lookup_rcu+0x90/0x1e0
I encounter this when trying to stress the async page fault in L1 guest w/
L2 guests running.
Commit 9b132fbe54 (Add rcu user eqs exception hooks for async page
fault) adds rcu_irq_enter/exit() to kvm_async_pf_task_wait() to exit cpu
idle eqs when needed, to protect the code that needs use rcu. However,
we need to call the pair even if the function calls schedule(), as seen
from the above backtrace.
This patch fixes it by informing the RCU subsystem exit/enter the irq
towards/away from idle for both n.halted and !n.halted.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1cd2e34c6 upstream.
Multiple frontend dailinks may be connected to a backend
dailink at the same time. When one of frontend dailinks is
closed, the associated backend dailink should not be closed
if it is connected to other active frontend dailinks. Change
ensures that backend dailink is closed only after all
connected frontend dailinks are closed.
Signed-off-by: Gopikrishnaiah Anandan <agopik@codeaurora.org>
Signed-off-by: Banajit Goswami <bgoswami@codeaurora.org>
Signed-off-by: Patrick Lai <plai@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5694785cf0 upstream.
As I was staring at the si_init_golden_registers code, I noticed that
the Pitcairn initialization silently falls through the Cape Verde
initialization, and the Oland initialization falls through the Hainan
initialization. However there is no comment stating that this is
intentional, and the radeon driver doesn't have any such fallthrough,
so I suspect this is not supposed to happen.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 62a3755341 ("drm/amdgpu: add si implementation v10")
Cc: Ken Wang <Qingqing.Wang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Marek Olšák" <maraeo@gmail.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 89affbf5d9 upstream.
In codepaths that use the begin/retry interface for reading
mems_allowed_seq with irqs disabled, there exists a race condition that
stalls the patch process after only modifying a subset of the
static_branch call sites.
This problem manifested itself as a deadlock in the slub allocator,
inside get_any_partial. The loop reads mems_allowed_seq value (via
read_mems_allowed_begin), performs the defrag operation, and then
verifies the consistency of mem_allowed via the read_mems_allowed_retry
and the cookie returned by xxx_begin.
The issue here is that both begin and retry first check if cpusets are
enabled via cpusets_enabled() static branch. This branch can be
rewritted dynamically (via cpuset_inc) if a new cpuset is created. The
x86 jump label code fully synchronizes across all CPUs for every entry
it rewrites. If it rewrites only one of the callsites (specifically the
one in read_mems_allowed_retry) and then waits for the
smp_call_function(do_sync_core) to complete while a CPU is inside the
begin/retry section with IRQs off and the mems_allowed value is changed,
we can hang.
This is because begin() will always return 0 (since it wasn't patched
yet) while retry() will test the 0 against the actual value of the seq
counter.
The fix is to use two different static keys: one for begin
(pre_enable_key) and one for retry (enable_key). In cpuset_inc(), we
first bump the pre_enable key to ensure that cpuset_mems_allowed_begin()
always return a valid seqcount if are enabling cpusets. Similarly, when
disabling cpusets via cpuset_dec(), we first ensure that callers of
cpuset_mems_allowed_retry() will start ignoring the seqcount value
before we let cpuset_mems_allowed_begin() return 0.
The relevant stack traces of the two stuck threads:
CPU: 1 PID: 1415 Comm: mkdir Tainted: G L 4.9.36-00104-g540c51286237 #4
Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
task: ffff8817f9c28000 task.stack: ffffc9000ffa4000
RIP: smp_call_function_many+0x1f9/0x260
Call Trace:
smp_call_function+0x3b/0x70
on_each_cpu+0x2f/0x90
text_poke_bp+0x87/0xd0
arch_jump_label_transform+0x93/0x100
__jump_label_update+0x77/0x90
jump_label_update+0xaa/0xc0
static_key_slow_inc+0x9e/0xb0
cpuset_css_online+0x70/0x2e0
online_css+0x2c/0xa0
cgroup_apply_control_enable+0x27f/0x3d0
cgroup_mkdir+0x2b7/0x420
kernfs_iop_mkdir+0x5a/0x80
vfs_mkdir+0xf6/0x1a0
SyS_mkdir+0xb7/0xe0
entry_SYSCALL_64_fastpath+0x18/0xad
...
CPU: 2 PID: 1 Comm: init Tainted: G L 4.9.36-00104-g540c51286237 #4
Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
task: ffff8818087c0000 task.stack: ffffc90000030000
RIP: int3+0x39/0x70
Call Trace:
<#DB> ? ___slab_alloc+0x28b/0x5a0
<EOE> ? copy_process.part.40+0xf7/0x1de0
__slab_alloc.isra.80+0x54/0x90
copy_process.part.40+0xf7/0x1de0
copy_process.part.40+0xf7/0x1de0
kmem_cache_alloc_node+0x8a/0x280
copy_process.part.40+0xf7/0x1de0
_do_fork+0xe7/0x6c0
_raw_spin_unlock_irq+0x2d/0x60
trace_hardirqs_on_caller+0x136/0x1d0
entry_SYSCALL_64_fastpath+0x5/0xad
do_syscall_64+0x27/0x350
SyS_clone+0x19/0x20
do_syscall_64+0x60/0x350
entry_SYSCALL64_slow_path+0x25/0x25
Link: http://lkml.kernel.org/r/20170731040113.14197-1-dmitriyz@waymo.com
Fixes: 46e700abc4 ("mm, page_alloc: remove unnecessary taking of a seqlock when cpusets are disabled")
Signed-off-by: Dima Zavin <dmitriyz@waymo.com>
Reported-by: Cliff Spradlin <cspradlin@waymo.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christopher Lameter <cl@linux.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ea277194d upstream.
Nadav Amit identified a theoritical race between page reclaim and
mprotect due to TLB flushes being batched outside of the PTL being held.
He described the race as follows:
CPU0 CPU1
---- ----
user accesses memory using RW PTE
[PTE now cached in TLB]
try_to_unmap_one()
==> ptep_get_and_clear()
==> set_tlb_ubc_flush_pending()
mprotect(addr, PROT_READ)
==> change_pte_range()
==> [ PTE non-present - no flush ]
user writes using cached RW PTE
...
try_to_unmap_flush()
The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such
as munmap, mremap and madvise.
For some operations like mprotect, it's not necessarily a data integrity
issue but it is a correctness issue as there is a window where an
mprotect that limits access still allows access. For munmap, it's
potentially a data integrity issue although the race is massive as an
munmap, mmap and return to userspace must all complete between the
window when reclaim drops the PTL and flushes the TLB. However, it's
theoritically possible so handle this issue by flushing the mm if
reclaim is potentially currently batching TLB flushes.
Other instances where a flush is required for a present pte should be ok
as either the page lock is held preventing parallel reclaim or a page
reference count is elevated preventing a parallel free leading to
corruption. In the case of page_mkclean there isn't an obvious path
that userspace could take advantage of without using the operations that
are guarded by this patch. Other users such as gup as a race with
reclaim looks just at PTEs. huge page variants should be ok as they
don't race with reclaim. mincore only looks at PTEs. userfault also
should be ok as if a parallel reclaim takes place, it will either fault
the page back in or read some of the data before the flush occurs
triggering a fault.
Note that a variant of this patch was acked by Andy Lutomirski but this
was for the x86 parts on top of his PCID work which didn't make the 4.13
merge window as expected. His ack is dropped from this version and
there will be a follow-on patch on top of PCID that will include his
ack.
[akpm@linux-foundation.org: tweak comments]
[akpm@linux-foundation.org: fix spello]
Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.de
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 773dc11875 upstream.
HS400-ES devices fail to initialize with the following error messages.
mmc1: power class selection to bus width 8 ddr 0 failed
mmc1: error -110 whilst initialising MMC card
This was seen on Samsung Chromebook Plus. Code analysis points to
commit 3d4ef32975 ("mmc: core: fix multi-bit bus width without
high-speed mode"), which attempts to set the bus width for all but
HS200 devices unconditionally. However, for HS400-ES, the bus width
is already selected.
Cc: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Brian Norris <briannorris@chromium.org>
Fixes: 3d4ef32975 ("mmc: core: fix multi-bit bus width ...")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chip.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a1e3f1431 upstream.
When the device is non removable, the card detect signal is often used
for another purpose i.e. muxed to another SoC peripheral or used as a
GPIO. It could lead to wrong behaviors depending the default value of
this signal if not muxed to the SDHCI controller.
Fixes: bb5f8ea4d5 ("mmc: sdhci-of-at91: introduce driver for the Atmel SDMMC")
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd40559c86 upstream.
The verifier is allocated on the stack, but the EXCHANGE_ID RPC call was
changed to be asynchronous by commit 8d89bd70bc. If we interrrupt
the call to rpc_wait_for_completion_task(), we can therefore end up
transmitting random stack contents in lieu of the verifier.
Fixes: 8d89bd70bc ("NFS setup async exchange_id")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f5d03143d upstream.
Due to a bugfix in wireless tree and the commit mentioned below a merge
was needed which went haywire. So the submitted change resulted in the
function brcmf_sdiod_sgtable_alloc() being called twice during the probe
thus leaking the memory of the first call.
Fixes: 4d79289598 ("brcmfmac: switch to new platform data")
Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b0f934e92 upstream.
iwlagn_check_ratid_empty takes the tid as a parameter, but
it doesn't check that it is not IWL_TID_NON_QOS.
Since IWL_TID_NON_QOS = 8 and iwl_priv::tid_data is an array
with 8 entries, accessing iwl_priv::tid_data[IWL_TID_NON_QOS]
is a bad idea.
This happened in iwlagn_rx_reply_tx. Since
iwlagn_check_ratid_empty is relevant only to check whether
we can open A-MPDU, this flow is irrelevant if tid is
IWL_TID_NON_QOS. Call iwlagn_check_ratid_empty only inside
the
if (tid != IWL_TID_NON_QOS)
a few lines earlier in the function.
Reported-by: Seraphime Kirkovski <kirkseraph@gmail.com>
Tested-by: Seraphime Kirkovski <kirkseraph@gmail.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c0338c687 upstream.
The combination of WQ_UNBOUND and max_active == 1 used to imply
ordered execution. After NUMA affinity 4c16bd327c ("workqueue:
implement NUMA affinity for unbound workqueues"), this is no longer
true due to per-node worker pools.
While the right way to create an ordered workqueue is
alloc_ordered_workqueue(), the documentation has been misleading for a
long time and people do use WQ_UNBOUND and max_active == 1 for ordered
workqueues which can lead to subtle bugs which are very difficult to
trigger.
It's unlikely that we'd see noticeable performance impact by enforcing
ordering on WQ_UNBOUND / max_active == 1 workqueues. Let's
automatically set __WQ_ORDERED for those workqueues.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Reported-by: Alexei Potashnik <alexei@purestorage.com>
Fixes: 4c16bd327c ("workqueue: implement NUMA affinity for unbound workqueues")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59a5e266c3 upstream.
My static checker complains that "devno" can be negative, meaning that
we read before the start of the loop. I've looked at the code, and I
think the warning is right. This come from /proc so it's root only or
it would be quite a quite a serious bug. The call tree looks like this:
proc_scsi_write() <- gets id and channel from simple_strtoul()
-> scsi_add_single_device() <- calls shost->transportt->user_scan()
-> ata_scsi_user_scan()
-> ata_find_dev()
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c74541777 upstream.
While refactoring, f7b2814bb9 ("cgroup: factor out
cgroup_{apply|finalize}_control() from
cgroup_subtree_control_write()") broke error return value from the
function. The return value from the last operation is always
overridden to zero. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7af608e4f9 upstream.
On subsystem registration, css_populate_dir() is not called on the new
root css, so the interface files for the subsystem on cgrp_dfl_root
aren't created on registration. This is a residue from the days when
cgrp_dfl_root was used only as the parking spot for unused subsystems,
which no longer is true as it's used as the root for cgroup2.
This is often fine as later operations tend to create them as a part
of mount (cgroup1) or subtree_control operations (cgroup2); however,
it's not difficult to mount cgroup2 with the controller interface
files missing as Waiman found out.
Fix it by invoking css_populate_dir() on the root css on subsys
registration.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13d57093c1 upstream.
In testing James' patch to drivers/parisc/pdc_stable.c, I hit the BUG
statement in flush_cache_range() during a system shutdown:
kernel BUG at arch/parisc/kernel/cache.c:595!
CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
Workqueue: events free_ioctx
IAOQ[0]: flush_cache_range+0x144/0x148
IAOQ[1]: flush_cache_page+0x0/0x1a8
RP(r2): flush_cache_range+0xec/0x148
Backtrace:
[<00000000402910ac>] unmap_page_range+0x84/0x880
[<00000000402918f4>] unmap_single_vma+0x4c/0x60
[<0000000040291a18>] zap_page_range_single+0x110/0x160
[<0000000040291c34>] unmap_mapping_range+0x174/0x1a8
[<000000004026ccd8>] truncate_pagecache+0x50/0xa8
[<000000004026cd84>] truncate_setsize+0x54/0x70
[<000000004033d534>] put_aio_ring_file+0x44/0xb0
[<000000004033d5d8>] aio_free_ring+0x38/0x140
[<000000004033d714>] free_ioctx+0x34/0xa8
[<00000000401b0028>] process_one_work+0x1b8/0x4d0
[<00000000401b04f4>] worker_thread+0x1b4/0x648
[<00000000401b9128>] kthread+0x1b0/0x208
[<0000000040150020>] end_fault_vector+0x20/0x28
[<0000000040639518>] nf_ip_reroute+0x50/0xa8
[<0000000040638ed0>] nf_ip_route+0x10/0x78
[<0000000040638c90>] xfrm4_mode_tunnel_input+0x180/0x1f8
CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
Workqueue: events free_ioctx
Backtrace:
[<0000000040163bf0>] show_stack+0x20/0x38
[<0000000040688480>] dump_stack+0xa8/0x120
[<0000000040163dc4>] die_if_kernel+0x19c/0x2b0
[<0000000040164d0c>] handle_interruption+0xa24/0xa48
This patch modifies flush_cache_range() to handle non current contexts.
In as much as this occurs infrequently, the simplest approach is to
flush the entire cache when this happens.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f169b9f52 ]
When multiple front-ends are using the same back-end, putting state of a
front-end to STOP state upon receiving pause command will result in backend
stream getting released by DPCM framework unintentionally. In order to
avoid backend to be released when another active front-end stream is
present, put the stream state to PAUSED state instead of STOP state.
Signed-off-by: Patrick Lai <plai@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2d1148f0f4 ]
bna & bfa firmware version 3.2.5.1 was submitted to linux-firmware on
Feb 17 19:10:20 2015 -0500 in 0ab54ff1dc ("linux-firmware: Add QLogic BR
Series Adapter Firmware").
bna was updated to use the newer firmware on Feb 19 16:02:32 2015 -0500 in
3f307c3d70 ("bna: Update the Driver and Firmware Version")
bfa was not updated. I presume this was an oversight but it broke support
for bfa+bna cards such as the following
04:00.0 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
04:00.1 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
04:00.2 Ethernet controller [0200]: Brocade Communications Systems,
Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
04:00.3 Ethernet controller [0200]: Brocade Communications Systems,
Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
Currently, if the bfa module is loaded first, bna fails to probe the
respective devices with
[ 215.026787] bna: QLogic BR-series 10G Ethernet driver - version: 3.2.25.1
[ 215.043707] bna 0000:04:00.2: bar0 mapped to ffffc90001fc0000, len 262144
[ 215.060656] bna 0000:04:00.2: initialization failed err=1
[ 215.073893] bna 0000:04:00.3: bar0 mapped to ffffc90002040000, len 262144
[ 215.090644] bna 0000:04:00.3: initialization failed err=1
Whereas if bna is loaded first, bfa fails with
[ 249.592109] QLogic BR-series BFA FC/FCOE SCSI driver - version: 3.2.25.0
[ 249.610738] bfa 0000:04:00.0: Running firmware version is incompatible with the driver version
[ 249.833513] bfa 0000:04:00.0: bfa init failed
[ 249.833919] scsi host6: QLogic BR-series FC/FCOE Adapter, hwpath: 0000:04:00.0 driver: 3.2.25.0
[ 249.841446] bfa 0000:04:00.1: Running firmware version is incompatible with the driver version
[ 250.045449] bfa 0000:04:00.1: bfa init failed
[ 250.045962] scsi host7: QLogic BR-series FC/FCOE Adapter, hwpath: 0000:04:00.1 driver: 3.2.25.0
Increase bfa's requested firmware version. Also increase the driver
version. I only tested that all of the devices probe without error.
Reported-by: Tim Ehlers <tehlers@gwdg.de>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 13a6c8328e ]
Testing EP_FLAG_RUNNING in snd_complete_urb() before running the completion
logic allows us to save a few cpu cycles by returning early, skipping the
pending urb in case the stream was stopped; the stop logic handles the urb
and sets the completion callbacks to NULL.
Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71eae1ca77 ]
The RX descriptor word 0 on SH7734 has the RFS[9:0] field in bits 16-25
(bits 0-15 usually used for that are occupied by the packet checksum).
Thus we need to set the 'shift_rd0' field in the SH7734 SoC data...
Fixes: f0e81fecd4 ("net: sh_eth: Add support SH7734")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4ee437fbf6 ]
The fsl_ssi fifo watermark is by default set to 2 free spaces (i.e.
activate DMA on FIFO when only 2 spaces are left.) This means the
DMA must service the fifo within 2 audio samples, which is just not
enough time for many use cases with high data rate. In many
configurations the audio channel slips (causing l/r swap in stereo
configurations, or channel slipping in multi-channel configurations).
This patch gives more breathing room and allows the SSI to operate
reliably by changing the fifio refill watermark to 8.
There is no change in behavior for older chips (with an 8-deep fifo).
Only the newer chips with a 15-deep fifo get the new behavior. I
suspect a new fifo depth setting could be optimized on the older
chips too, but I have not tested.
Signed-off-by: Caleb Crome <caleb@crome.org>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 63dfb0dac9 ]
The USB core may call reset_resume when it fails to resume asix device.
And USB core can recovery this abnormal resume at low level driver,
the same .resume at asix driver can work too. Add .reset_resume can
avoid disconnecting after backing from system resume, and NFS can
still be mounted after this commit.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 14ba972842 ]
All i.MX6 SoCs have an OCOTP Controller with 4kbit fuses. The i.MX6SL is
an exception and has only 2kbit fuses.
In the TRM for the i.MX6DQ (IMX6QDRM - Rev 2, 06/2014) the fuses size is
described in chapter 46.1.1 with:
"32-bit word restricted program and read to 4Kbits of eFuse OTP(512x8)."
In the TRM for the i.MX6SL (IMX6SLRM - Rev 2, 06/2015) the fuses size is
described in chapter 34.1.1 with:
"32-bit word restricted program and read to 2 kbit of eFuse OTP(128x8)."
Since the Freescale Linux kernel OCOTP driver works with a fuses size of
2 kbit for the i.MX6SL, it looks like the TRM is wrong and the formula
to calculate the correct fuses size has to be 256x8.
Signed-off-by: Daniel Schultz <d.schultz@phytec.de>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6ef4fb387d ]
Recent changes made KERN_CONT mandatory for continued lines. In the
absence of KERN_CONT, a newline may be implicit inserted by the core
printk code.
In show_pte, we (erroneously) use printk without KERN_CONT for continued
prints, resulting in output being split across a number of lines, and
not matching the intended output, e.g.
[ff000000000000] *pgd=00000009f511b003
, *pud=00000009f4a80003
, *pmd=0000000000000000
Fix this by using pr_cont() for all the continuations.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c86d77743 upstream.
On IPv4-mapped IPv6 addresses sk_family is AF_INET6,
but the flow informations are created based on AF_INET.
So the routing set up 'struct flowi4' but we try to
access 'struct flowi6' what leads to an out of bounds
access. Fix this by using the family we get with the
dst_entry, like we do it for the standard policy lookup.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 074859184d ]
Currently, the sched:sched_switch tracepoint reports deadline tasks with
priority -1. But when reading the trace via perf script I've got the
following output:
# ./d & # (d is a deadline task, see [1])
# perf record -e sched:sched_switch -a sleep 1
# perf script
...
swapper 0 [000] 2146.962441: sched:sched_switch: swapper/0:0 [120] R ==> d:2593 [4294967295]
d 2593 [000] 2146.972472: sched:sched_switch: d:2593 [4294967295] R ==> g:2590 [4294967295]
The task d reports the wrong priority [4294967295]. This happens because
the "int prio" is stored in an unsigned long long val. Although it is
set as a %lld, as int is shorter than unsigned long long,
trace_seq_printf prints it as a positive number.
The fix is just to cast the val as an int, and print it as a %d,
as in the sched:sched_switch tracepoint's "format".
The output with the fix is:
# ./d &
# perf record -e sched:sched_switch -a sleep 1
# perf script
...
swapper 0 [000] 4306.374037: sched:sched_switch: swapper/0:0 [120] R ==> d:10941 [-1]
d 10941 [000] 4306.383823: sched:sched_switch: d:10941 [-1] R ==> swapper/0:0 [120]
[1] d.c
---
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/types.h>
#include <linux/sched.h>
struct sched_attr {
__u32 size, sched_policy;
__u64 sched_flags;
__s32 sched_nice;
__u32 sched_priority;
__u64 sched_runtime, sched_deadline, sched_period;
};
int sched_setattr(pid_t pid, const struct sched_attr *attr, unsigned int flags)
{
return syscall(__NR_sched_setattr, pid, attr, flags);
}
int main(void)
{
struct sched_attr attr = {
.size = sizeof(attr),
.sched_policy = SCHED_DEADLINE, /* This creates a 10ms/30ms reservation */
.sched_runtime = 10 * 1000 * 1000,
.sched_period = attr.sched_deadline = 30 * 1000 * 1000,
};
if (sched_setattr(0, &attr, 0) < 0) {
perror("sched_setattr");
return -1;
}
for(;;);
}
---
Committer notes:
Got the program from the provided URL, http://bristot.me/lkml/d.c,
trimmed it and included in the cset log above, so that we have
everything needed to test it in one place.
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/866ef75bcebf670ae91c6a96daa63597ba981f0d.1483443552.git.bristot@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0b47a6bd11 ]
Ensure all reserved fields of xatp are zero before making
hypervisor call to XEN in xen_map_device_mmio().
xenmem_add_to_physmap_one() in XEN fails the mapping request if
extra.res reserved field in xatp is not zero for XENMAPSPACE_dev_mmio
request.
Signed-off-by: Jiandi An <anjiandi@codeaurora.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c2931667c8 ]
Currently how btrfs dio deals with split dio write is not good
enough if dio write is split into several segments due to the
lack of contiguous space, a large dio write like 'dd bs=1G count=1'
can end up with incorrect outstanding_extents counter and endio
would complain loudly with an assertion.
This fixes the problem by compensating the outstanding_extents
counter in inode if a large dio write gets split.
Reported-by: Anand Jain <anand.jain@oracle.com>
Tested-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>