commit 7bf9b7bef8 upstream.
find_vma() is *not* safe when somebody else is removing vmas. Not just
the return value might get bogus just as you are getting it (this instance
doesn't try to dereference the resulting vma), the search itself can get
buggered in rather spectacular ways. IOW, ->mmap_sem really, really is
not optional here.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f40b90972 upstream.
When booting a secondary CPU, the primary CPU hands two sets of page
tables via the secondary_data struct:
(1) swapper_pg_dir: a normal, cacheable, shared (if SMP) mapping
of the kernel image (i.e. the tables used by init_mm).
(2) idmap_pgd: an uncached mapping of the .idmap.text ELF
section.
The idmap is generally used when enabling and disabling the MMU, which
includes early CPU boot. In this case, the secondary CPU switches to
swapper as soon as it enters C code:
struct mm_struct *mm = &init_mm;
unsigned int cpu = smp_processor_id();
/*
* All kernel threads share the same mm context; grab a
* reference and switch to it.
*/
atomic_inc(&mm->mm_count);
current->active_mm = mm;
cpumask_set_cpu(cpu, mm_cpumask(mm));
cpu_switch_mm(mm->pgd, mm);
This causes a problem on ARMv7, where the identity mapping is treated as
strongly-ordered leading to architecturally UNPREDICTABLE behaviour of
exclusive accesses, such as those used by atomic_inc.
This patch re-orders the secondary_start_kernel function so that we
switch to swapper before performing any exclusive accesses.
Reported-by: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Cc: David McKay <david.mckay@st.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the smp call to stop the other cpus are handled in those
cpus in interrupt context, there's a potential for those smp
handlers to interrupt threads holding spin locks (such as the
one a mutex holds). This prevents those threads from ever
releasing their spin lock, so if the cpu doing the shutdown
is allowed to switch to another thread that tries to grab the
same lock/mutex, we could get into a deadlock (the spin lock
call is called with preemption disabled in the mutex lock code).
To avoid that possibility, disable preemption before doing the
smp_send_stop().
Change-Id: I7976c5382d7173fcb3cd14da8cc5083d442b2544
Signed-off-by: Mike J. Chen <mjchen@google.com>
commit 7deabca0ac upstream.
We can stall RCU processing on SMP platforms if a CPU sits in its idle
loop for a long time. This happens because we don't call irq_enter()
and irq_exit() around generic_smp_call_function_interrupt() and
friends. Add the necessary calls, and remove the one from within
ipi_timer(), so that they're all in a common place.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[add irq_enter()/irq_exit() in do_local_timer]
Signed-off-by: UCHINO Satoshi <satoshi.uchino@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fbd036b55 upstream.
Stepan found:
CPU0 CPUn
_cpu_up()
__cpu_up()
boostrap()
notify_cpu_starting()
set_cpu_online()
while (!cpu_active())
cpu_relax()
<PREEMPT-out>
smp_call_function(.wait=1)
/* we find cpu_online() is true */
arch_send_call_function_ipi_mask()
/* wait-forever-more */
<PREEMPT-in>
local_irq_enable()
cpu_notify(CPU_ONLINE)
sched_cpu_active()
set_cpu_active()
Now the purpose of cpu_active is mostly with bringing down a cpu, where
we mark it !active to avoid the load-balancer from moving tasks to it
while we tear down the cpu. This is required because we only update the
sched_domain tree after we brought the cpu-down. And this is needed so
that some tasks can still run while we bring it down, we just don't want
new tasks to appear.
On cpu-up however the sched_domain tree doesn't yet include the new cpu,
so its invisible to the load-balancer, regardless of the active state.
So instead of setting the active state after we boot the new cpu (and
consequently having to wait for it before enabling interrupts) set the
cpu active before we set it online and avoid the whole mess.
commit 435a7ef52d upstream.
We can't be holding the mmap_sem while calling flush_cache_user_range
because the flush can fault. If we fault on a user address, the
page fault handler will try to take mmap_sem again. Since both places
acquire the read lock, most of the time it succeeds. However, if another
thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
between the call to flush_cache_user_range and the fault, the down_read
in do_page_fault will deadlock.
[will: removed drop of vma parameter as already queued by rmk (7365/1)]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dima Zavin <dima@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e787ec1376 upstream.
The inline assembly in kernel_execve() uses r8 and r9. Since this
code sequence does not return, it usually doesn't matter if the
register clobber list is accurate. However, I saw a case where a
particular version of gcc used r8 as an intermediate for the value
eventually passed to r9. Because r8 is used in the inline
assembly, and not mentioned in the clobber list, r9 was set
to an incorrect value.
This resulted in a kernel panic on execution of the first user-space
program in the system. r9 is used in ret_to_user as the thread_info
pointer, and if it's wrong, bad things happen.
Signed-off-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8130b9d7b9 upstream.
If we are context switched whilst copying into a thread's
vfp_hard_struct then the partial copy may be corrupted by the VFP
context switching code (see "ARM: vfp: flush thread hwstate before
restoring context from sigframe").
This patch updates the ptrace VFP set code so that the thread state is
flushed before the copy, therefore disabling VFP and preventing
corruption from occurring.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 247f4993a5 upstream.
In a preemptible kernel, vfp_set() can be preempted, causing the
hardware VFP context to be switched while the thread vfp state is
being read and modified. This leads to a race condition which can
cause the thread vfp state to become corrupted if lazy VFP context
save occurs due to preemption in between the time thread->vfpstate
is read and the time the modified state is written back.
This may occur if preemption occurs during the execution of a
ptrace() call which modifies the VFP register state of a thread.
Such instances should be very rare in most realistic scenarios --
none has been reported, so far as I am aware. Only uniprocessor
systems should be affected, since VFP context save is not currently
lazy in SMP kernels.
The problem was introduced by my earlier patch migrating to use
regsets to implement ptrace.
This patch does a vfp_sync_hwstate() before reading
thread->vfpstate, to make sure that the thread's VFP state is not
live in the hardware registers while the registers are modified.
Thanks to Will Deacon for spotting this.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2af276dfb1 upstream.
Following execution of a signal handler, we currently restore the VFP
context from the ucontext in the signal frame. This involves copying
from the user stack into the current thread's vfp_hard_struct and then
flushing the new data out to the hardware registers.
This is problematic when using a preemptible kernel because we could be
context switched whilst updating the vfp_hard_struct. If the current
thread has made use of VFP since the last context switch, the VFP
notifier will copy from the hardware registers into the vfp_hard_struct,
overwriting any data that had been partially copied by the signal code.
Disabling preemption across copy_from_user calls is a terrible idea, so
instead we move the VFP thread flush *before* we update the
vfp_hard_struct. Since the flushing is performed lazily, this has the
effect of disabling VFP and clearing the CPU's VFP state pointer,
therefore preventing the thread from being updated with stale data on
the next context switch.
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8428e84d42 upstream.
Recent gcc versions generate unaligned accesses by default on ARMv6 and
later processors. This patch ensures that the SCTLR.A bit is always
cleared on such processors to avoid kernel traping before
alignment_init() is called.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: John Linn <John.Linn@xilinx.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit fde165b2a2 upstream.
Commit 4e8ee7de22 (ARM: SMP: use
idmap_pgd for mapping MMU enable during secondary booting)
switched secondary boot to use idmap_pgd, which is initialized
during early_initcall, instead of a page table initialized during
__cpu_up. This causes idmap_pgd to contain the static mappings
but be missing all dynamic mappings.
If a console is registered that creates a dynamic mapping, the
printk in secondary_start_kernel will trigger a data abort on
the missing mapping before the exception handlers have been
initialized, leading to a hang. Initial boot is not affected
because no consoles have been registered, and resume is usually
not affected because the offending console is suspended.
Onlining a cpu with hotplug triggers the problem.
A workaround is to the printk in secondary_start_kernel until
after the page tables have been switched back to init_mm.
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e787ec1376 upstream.
The inline assembly in kernel_execve() uses r8 and r9. Since this
code sequence does not return, it usually doesn't matter if the
register clobber list is accurate. However, I saw a case where a
particular version of gcc used r8 as an intermediate for the value
eventually passed to r9. Because r8 is used in the inline
assembly, and not mentioned in the clobber list, r9 was set
to an incorrect value.
This resulted in a kernel panic on execution of the first user-space
program in the system. r9 is used in ret_to_user as the thread_info
pointer, and if it's wrong, bad things happen.
Signed-off-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During booting of cpu1, there is a short window where cpu1
is online, but not active where cpu1 is occupied by waiting
to become active. If cpu0 then decides to schedule something
on cpu1 and wait for it to complete, before cpu0 has set
cpu1 active, we have a deadlock.
Typically it's this CPU frequency transition that happens at
this time, so let's just not wait for it to happen, it will
happen whenever the CPU eventually comes online instead.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Reviewed-by: Rickard Andersson <rickard.andersson@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On secondary CPUs, the Timer Control Register is not reset
to a sane value before the timer is registered, and the TRM
doesn't seem to indicate any reset value either. In some cases,
the kernel will take an interrupt too early, depending on what
junk was present in the registers at reset time.
The fix is to set the Timer Control Register to 0 before
registering the clock_event_device and enabling the interrupt.
Problem seen on VE (Cortex A5) and Tegra.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The smp_twd clockevents driver currently enables the local timer PPI
before the clockevents device is registered. This can lead to a kernel
panic if a spurious timer interrupt is generated before registration
has completed since the kernel will treat it as an IPI timer.
This patch moves the clockevents device registration before the IRQ
unmasking so that we can always handle timer interrupts once they can
occur.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
clockevents_reconfigure was an older api that doesn't handle
updating the max time between ticks when the frequency changes.
Under some conditions, the boot value of max_delta_ns scaled
by the mult/shift values for the current frequency can result
in a value of 0x200000004 selected as the number of cycles to
program for a long tick, which gets wrapped to 0x4.
Also switch to the matching clockevents_config_and_register
function to register the clockevent, which handles converting
the min/max ticks to ns during init.
Change-Id: I6ca659c309e7bb031cdb1954767b5aa7a022ff44
Signed-off-by: Colin Cross <ccross@android.com>
The localtimer's clock changes with the cpu clock. After a
cpufreq transition, update the clockevent's frequency and
reprogram the next clock event.
Adds a clock called "smp_twd" that is used to determine the
twd frequency, which can also be used at init time to
avoid calibrating the twd frequency.
Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Rob Herring <robherring2@gmail.com>
commit 8130b9d7b9 upstream.
If we are context switched whilst copying into a thread's
vfp_hard_struct then the partial copy may be corrupted by the VFP
context switching code (see "ARM: vfp: flush thread hwstate before
restoring context from sigframe").
This patch updates the ptrace VFP set code so that the thread state is
flushed before the copy, therefore disabling VFP and preventing
corruption from occurring.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 247f4993a5 upstream.
In a preemptible kernel, vfp_set() can be preempted, causing the
hardware VFP context to be switched while the thread vfp state is
being read and modified. This leads to a race condition which can
cause the thread vfp state to become corrupted if lazy VFP context
save occurs due to preemption in between the time thread->vfpstate
is read and the time the modified state is written back.
This may occur if preemption occurs during the execution of a
ptrace() call which modifies the VFP register state of a thread.
Such instances should be very rare in most realistic scenarios --
none has been reported, so far as I am aware. Only uniprocessor
systems should be affected, since VFP context save is not currently
lazy in SMP kernels.
The problem was introduced by my earlier patch migrating to use
regsets to implement ptrace.
This patch does a vfp_sync_hwstate() before reading
thread->vfpstate, to make sure that the thread's VFP state is not
live in the hardware registers while the registers are modified.
Thanks to Will Deacon for spotting this.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2af276dfb1 upstream.
Following execution of a signal handler, we currently restore the VFP
context from the ucontext in the signal frame. This involves copying
from the user stack into the current thread's vfp_hard_struct and then
flushing the new data out to the hardware registers.
This is problematic when using a preemptible kernel because we could be
context switched whilst updating the vfp_hard_struct. If the current
thread has made use of VFP since the last context switch, the VFP
notifier will copy from the hardware registers into the vfp_hard_struct,
overwriting any data that had been partially copied by the signal code.
Disabling preemption across copy_from_user calls is a terrible idea, so
instead we move the VFP thread flush *before* we update the
vfp_hard_struct. Since the flushing is performed lazily, this has the
effect of disabling VFP and clearing the CPU's VFP state pointer,
therefore preventing the thread from being updated with stale data on
the next context switch.
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If an idle notifier modifies a timer, calling the notifier after
the sched tick has been stopped may leave the sched tick set too
early. Move teh idle notifier call before the call to
tick_nohz_stop_sched_tick.
Change-Id: I0db3284bec6d0193bc5e2a57650ab06bd8342319
Signed-off-by: Colin Cross <ccross@android.com>
commit 11ed0ba175 upstream.
This patch implements a workaround for PL310 erratum 769419. On
revisions of the PL310 prior to r3p2, the Store Buffer does not
automatically drain. This can cause normal, non-cacheable writes to be
retained when the memory system is idle, leading to suboptimal I/O
performance for drivers using coherent DMA.
This patch adds an optional wmb() call to the cpu_idle loop. On systems
with an outer cache, this causes an explicit flush of the store buffer.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If an idle notifier modifies a timer, calling the notifier after
the sched tick has been stopped may leave the sched tick set too
early. Move teh idle notifier call before the call to
tick_nohz_stop_sched_tick.
Change-Id: I0db3284bec6d0193bc5e2a57650ab06bd8342319
Signed-off-by: Colin Cross <ccross@android.com>
commit 8428e84d42 upstream.
Recent gcc versions generate unaligned accesses by default on ARMv6 and
later processors. This patch ensures that the SCTLR.A bit is always
cleared on such processors to avoid kernel traping before
alignment_init() is called.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: John Linn <John.Linn@xilinx.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Such that interactive cpufreq governor uses up-to-date idle time
information.
Reported by Colin Cross <ccross@android.com>
Change-Id: I06425444f800f803afc9dc7a6ad0fdb46c918bb6
Signed-off-by: Todd Poynor <toddpoynor@google.com>
Patch is the last version from tglx on Oct 7.
Discussion is at: http://comments.gmane.org/gmane.linux.ports.arm.kernel/131919
The original commit message for the first patch version:
Frank Rowand reported:
I have a consistent (every boot) hang on boot with the RT patches.
With a few hacks to get console output, I get:
rcu_preempt_state detected stalls on CPUs/tasks
I have also replicated the problem on the ARM RealView (in tree) and
without the RT patches.
The problem ended up being caused by the allowed cpus mask being set
to all possible cpus for the ksoftirqd on the secondary processors.
So the RCU softirq was never executing on the secondary cpu.
The problem was that ksoftirqd was woken on the secondary processors before
the secondary processors were online. This led to allowed cpus being set
to all cpus.
wake_up_process()
try_to_wake_up()
select_task_rq()
if (... || !cpu_online(cpu))
select_fallback_rq(task_cpu(p), p)
...
/* No more Mr. Nice Guy. */
dest_cpu = cpuset_cpus_allowed_fallback(p)
do_set_cpus_allowed(p, cpu_possible_mask)
# Thus ksoftirqd can now run on any cpu...
</report>
The reason is that the ARM SMP boot code for the secondary CPUs enables
interrupts before the newly brought up CPU is marked online and
active.
That causes a wakeup of ksoftirqd or a wakeup of any other kernel
thread which is affine to the brought up CPU break that threads
affinity and therefor being scheduled on already online CPUs.
This problem has been observed on x86 before and the only solution is
to mark the CPU online and wait for the CPU active bit before the
point where interrupts are enabled.
Change-Id: If948ef52d434191579e1ca95d18d0c50e91a03b9
Signed-off-by: Dima Zavin <dima@android.com>