commit 0d0752bca1 upstream.
Looking into the active_asids array is not enough, as we also need
to look into the reserved_asids array (they both represent processes
that are currently running).
Also, not holding the ASID allocator lock is racy, as another CPU
could schedule that process and trigger a rollover, making the erratum
workaround miss an IPI.
Exposing this outside of context.c is a little ugly, so
let's define a new entry point that the erratum workaround can call
to obtain the cpumask.
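A sketch of what such an entry point looks like (names follow the ASID
allocator in arch/arm/mm/context.c; details may differ in this tree):

    void erratum_get_active_cpumask(int this_cpu, struct mm_struct *mm,
                                    cpumask_t *mask)
    {
        int cpu;
        unsigned long flags;
        u64 context_id, asid;

        raw_spin_lock_irqsave(&cpu_asid_lock, flags);
        context_id = mm->context.id.counter;
        for_each_online_cpu(cpu) {
            if (cpu == this_cpu)
                continue;
            /* A CPU is running the process if its active ASID, or,
             * right after a rollover, its reserved ASID matches. */
            asid = per_cpu(active_asids, cpu).counter;
            if (!asid)
                asid = per_cpu(reserved_asids, cpu);
            if (asid == context_id)
                cpumask_set_cpu(cpu, mask);
        }
        raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
    }

Holding cpu_asid_lock for the whole walk is what closes the race with a
concurrent rollover described above.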
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c5f927a6f6 upstream.
With this change, we no longer lose the innermost entry in the user-mode
part of the call chain. See also the x86 port, which includes the ip.
It's possible to partially work around this problem by post-processing
the data to use the PERF_SAMPLE_IP value, but this works only if the CPU
wasn't in the kernel when the sample was taken.
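The essence of the fix, in perf_callchain_user() (a sketch; the
frame-pointer walk that follows is unchanged):

    /* Record the innermost entry, the ip itself, before
     * walking the user-mode frame pointers. */
    perf_callchain_store(entry, regs->ARM_pc);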
Signed-off-by: Jed Davis <jld@mozilla.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Updates:
--------
- Rebased over 3.10 final
- Differences from big-LITTLE-MP-master-v18
- New Patches:
- master-config-fragments: 1 new patch
- "config: Disable priority filtering for HMP Scheduler"
- master-misc-patches: 1 new patch
- "mm: make vmstat_update periodic run conditional"
- New Branches:
- master-task-placement-v2-updates: 7 patches
New patches from ARM added in a new topic branch stacked on top
of master-task-placement-v2-sysfs...
- Revert "sched: Enable HMP priority filter by default"
- "HMP: Use unweighted load for hmp migration decisions"
- "HMP: Select least-loaded CPU when performing HMP Migrations"
- "HMP: Avoid multiple calls to hmp_domain_min_load in fast path"
- "HMP: Force new non-kernel tasks onto big CPUs until load stabilises"
- "sched: Restrict nohz balance kicks to stay in the HMP domain"
- "HMP: experimental: Force all rt tasks to start on little domain."
Commands used for merge:
------------------------
$ git checkout -b big-LITTLE-MP-master-v19 v3.10
$ git merge master-arm-multi_pmu_v2 master-config-fragments \
master-hw-bkpt-fix master-misc-patches master-task-placement-v2 \
master-task-placement-v2-sysfs master-task-placement-v2-updates
This patch restricts the allowed cpu mask for rt tasks initially started
with a full cpu mask to the little domain.
An rt task is marked as real-time in __setscheduler(), which is eventually
called for all rt tasks (kernel and userland). In this function we
restrict the allowed cpu mask to the little domain (see the sketch below).
This also prevents an rt task from later being pushed to the big domain,
because find_lowest_rq() only considers the allowed cpu mask of a task
when picking the new cpu for it to run on.
Current kludges of the patch:
* Since we do not have an API to get the cpu mask of the A7 cluster,
hmp_slow_cpu_mask is made global in arch/arm/kernel/topology.c for now.
* The watchdog_enable() function calls sched_setscheduler() before
kthread_bind() for the cpu specific watchdog kernel threads. The order of
these two calls has to be changed to make this patch work.
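A sketch of the restriction (using the hmp_slow_cpu_mask global from the
first kludge above; exact placement within __setscheduler() illustrative):

    static void __setscheduler(struct rq *rq, struct task_struct *p,
                               int policy, int prio)
    {
        /* ... existing policy/priority bookkeeping ... */
        if (rt_policy(policy) && !cpumask_empty(&hmp_slow_cpu_mask))
            do_set_cpus_allowed(p, &hmp_slow_cpu_mask);
    }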
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Previously, an offline CPU would always appear to have a zero load
and this would distort the offload functionality used for balancing
big and little domains.
Maintain a mask of online CPUs in each domain and use this instead.
Change-Id: I639b564b2f40cb659af8ceb8bd37f84b8a1fe323
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
The patch "sched: Use device-tree to provide fast/slow CPU list for HMP"
depends on the ordering of CPU's in the device tree. It breaks to determine
the logical mask correctly if the logical mask of the CPUs differ from
physical ordering in the device tree.
This patch fix the logic by depending on the mpidr in the device tree
and mapping that mpidr to the logical cpu.
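A sketch of the mpidr-based lookup (property handling simplified; the
real code also masks the affinity bits):

    static int dt_cpu_to_logical(struct device_node *cn)
    {
        const __be32 *reg = of_get_property(cn, "reg", NULL);
        unsigned int cpu, mpidr;

        if (!reg)
            return -EINVAL;
        mpidr = be32_to_cpup(reg);
        for_each_possible_cpu(cpu)
            if (cpu_logical_map(cpu) == mpidr)
                return cpu;    /* logical cpu for this dt node */
        return -EINVAL;
    }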
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
On homogeneous (non-heterogeneous) systems all CPUs will be declared
'fast' and the slow cpu list will be empty. In this situation we need to
avoid adding an empty slow HMP domain otherwise the scheduler code will
blow up when it attempts to move a task to the slow domain.
Signed-off-by: Jon Medhurst <tixy@linaro.org>
SCHED_HMP requires the different cpu types to be represented by an
ordered list of hmp_domains. Each hmp_domain represents all cpus of
a particular type using a cpumask.
The list is platform specific and therefore must be generated by
platform code by implementing arch_get_hmp_domains().
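A sketch of a platform implementation (struct hmp_domain fields as used
by the HMP patches; faster domains must end up at the head of the list):

    void __init arch_get_hmp_domains(struct list_head *hmp_domains_list)
    {
        struct cpumask fast, slow;
        struct hmp_domain *domain;

        arch_get_fast_and_slow_cpus(&fast, &slow);

        /* Add the slow domain first so the fast one lands at the head. */
        domain = kzalloc(sizeof(*domain), GFP_KERNEL);
        cpumask_copy(&domain->cpus, &slow);
        list_add(&domain->hmp_domains, hmp_domains_list);

        domain = kzalloc(sizeof(*domain), GFP_KERNEL);
        cpumask_copy(&domain->cpus, &fast);
        list_add(&domain->hmp_domains, hmp_domains_list);
    }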
Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
We can't rely on Kconfig options to set the fast and slow CPU lists for
HMP scheduling if we want a single kernel binary to support multiple
devices with different CPU topology. E.g. TC2 (ARM's Test-Chip-2
big.LITTLE system), Fast Models, or even non big.LITTLE devices.
This patch adds the function arch_get_fast_and_slow_cpus() to generate
the lists at run-time by parsing the CPU nodes in device-tree; it
assumes slow cores are A7s and everything else is fast. The function
still supports the old Kconfig options as this is useful for testing the
HMP scheduler on devices without big.LITTLE.
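A sketch of the device-tree walk (note it numbers cpus in device-tree
order, which is the assumption the mpidr fix above removes):

    void __init arch_get_fast_and_slow_cpus(struct cpumask *fast,
                                            struct cpumask *slow)
    {
        struct device_node *cn = NULL;
        int cpu = 0;

        cpumask_clear(fast);
        cpumask_clear(slow);

        /* Assume slow cores are A7s and everything else is fast. */
        while ((cn = of_find_node_by_type(cn, "cpu"))) {
            if (of_device_is_compatible(cn, "arm,cortex-a7"))
                cpumask_set_cpu(cpu, slow);
            else
                cpumask_set_cpu(cpu, fast);
            cpu++;
        }
    }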
This patch is a reuse of a patch by Jon Medhurst <tixy@linaro.org> with a
few bits left out.
Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
Commit 9a6eb31 ("ARM: hw_breakpoint: Debug powerdown support for
self-hosted debug") introduced debug powerdown support for self-hosted
debug. While merging the patch, the 'has_ossr' check was removed; this
check was needed on hardware which doesn't support self-hosted debug.
Pandaboard (A9) is one such hardware, and Dietmar's original
patch did mention this issue.
Without that check on Panda with CPUIDLE enabled, a flood of
the messages below is thrown:
[ 3.597930] hw-breakpoint: CPU 0 failed to disable vector catch
[ 3.597991] hw-breakpoint: CPU 1 failed to disable vector catch
So restore that check to avoid the mentioned issue.
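The restored check is roughly (label name as in reset_ctrl_regs(); it may
differ slightly in this tree):

    if (!has_ossr)
        goto clear_vcr;    /* no OS Save/Restore: skip the unlock */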
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
This adds core support for saving and restoring CPU PMU registers
for suspend/resume support, i.e. deeper C-states in cpuidle terms.
This patch adds support only for ARMv7 PMU register save/restore.
It needs to be extended to XScale and ARMv6 if needed.
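One plausible shape for the hooks is a CPU PM notifier (the save/restore
helpers here are placeholder names for the ARMv7 register accessors):

    static int cpu_pmu_pm_notify(struct notifier_block *b,
                                 unsigned long cmd, void *v)
    {
        switch (cmd) {
        case CPU_PM_ENTER:
            armv7pmu_save_regs();      /* placeholder name */
            break;
        case CPU_PM_EXIT:
            armv7pmu_restore_regs();   /* placeholder name */
            break;
        }
        return NOTIFY_OK;
    }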
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
The userspace perf tool provides options to specify PMU names from command
line for the event. An example of pmu event syntax would be
(<pmu_name>/<config>/<modifier>)
However the parser in the perf tool breaks the tokens at spaces and fails
to identify a PMU name containing spaces correctly.
This patch removes spaces in the ARMv7 CPU PMU names.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
This patch sets the cpu affinity for the perf IRQs in the logical order
within the cluster. However, interrupts are assumed to be specified in the
same logical order within the cluster.
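A sketch of the affinity setup (irqs[] assumed to list the cluster's PMU
interrupts in that same logical order):

    i = 0;
    for_each_cpu(cpu, &cluster_mask)
        irq_set_affinity(irqs[i++], cpumask_of(cpu));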
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
In a system with multiple heterogeneous CPU PMUs, each PMU can handle
events on a subset of CPUs, probably belonging to the same cluster.
This patch introduces a cpumask to track which CPUs each PMU supports.
It also updates armpmu_event_init to reject cpu-specific events being
initialised for unsupported CPUs. Since process-specific events can be
initialised for all the CPU PMUs, armpmu_start/stop/add are modified to
prevent events from being added on unsupported CPUs.
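A sketch of the armpmu_event_init() check (the cpumask field name is
assumed):

    static int armpmu_event_init(struct perf_event *event)
    {
        struct arm_pmu *armpmu = to_arm_pmu(event->pmu);

        /* Reject cpu-specific events on CPUs this PMU can't handle. */
        if (event->cpu != -1 &&
            !cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
            return -ENOENT;

        /* ... existing initialisation ... */
        return 0;
    }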
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
In order to support multiple, heterogeneous CPU PMUs and distinguish
them, they cannot be registered as PERF_TYPE_RAW type. Instead we can
get perf core to allocate a new idr type id for each PMU.
Userspace applications can refer to sysfs entries to find a PMU's type,
which can then be used in tracking events on individual PMUs.
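Passing -1 as the type is how perf core is asked to allocate a dynamic
idr id (this part of the perf API is as in mainline):

    /* -1: let perf core pick a new idr type for this PMU */
    ret = perf_pmu_register(&cpu_pmu->pmu, cpu_pmu->name, -1);

Userspace can then read the allocated id back from
/sys/bus/event_source/devices/<pmu_name>/type.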
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
A single global CPU PMU pointer is not useful in a system with multiple,
heterogeneous CPU PMUs as we need to access the relevant PMU depending
on the current CPU.
This patch replaces the single global CPU PMU pointer with per-cpu
pointers and changes the OProfile accessors to refer to the PMU affine
to CPU0.
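A sketch of the change (accessor shown is the OProfile-facing one):

    /* was: static struct arm_pmu *cpu_pmu; */
    static DEFINE_PER_CPU(struct arm_pmu *, cpu_pmu);

    const char *perf_pmu_name(void)
    {
        /* OProfile only deals with the PMU affine to CPU0. */
        struct arm_pmu *pmu = per_cpu(cpu_pmu, 0);

        return pmu ? pmu->name : NULL;
    }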
Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Some device drivers, like the PMU driver, need to retrieve the logical cpu mask
that corresponds to a given cluster id. This patch provides a hook in
the topology code that, given an existing cluster id as input,
initializes the corresponding cpumask passed as a pointer, reusing all
existing topology information required by sched domains in the kernel.
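A sketch of such a hook (function name illustrative; it reuses the
topology masks already maintained for the scheduler):

    int cluster_to_logical_mask(unsigned int socket_id,
                                cpumask_t *cluster_mask)
    {
        int cpu;

        if (!cluster_mask)
            return -EINVAL;

        for_each_online_cpu(cpu)
            if (socket_id == topology_physical_package_id(cpu)) {
                cpumask_copy(cluster_mask, topology_core_cpumask(cpu));
                return 0;
            }

        return -EINVAL;
    }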
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Since the smp call to stop the other cpus is handled on those
cpus in interrupt context, there's a potential for those smp
handlers to interrupt threads holding spin locks (such as the
one a mutex holds). This prevents those threads from ever
releasing their spin lock, so if the cpu doing the shutdown
is allowed to switch to another thread that tries to grab the
same lock/mutex, we could get into a deadlock (the spin lock
is taken with preemption disabled in the mutex lock code).
To avoid that possibility, disable preemption before doing the
smp_send_stop().
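The change itself is small; a sketch of the shutdown path:

    /* Stay on this cpu: if we were preempted after the IPI, we could
     * block on a lock whose holder was stopped mid critical section. */
    preempt_disable();
    smp_send_stop();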
Change-Id: I7976c5382d7173fcb3cd14da8cc5083d442b2544
Signed-off-by: Mike J. Chen <mjchen@google.com>
When PowerDown was requested at the same time as ProgBit, the
formatter flush command that follows could get stuck.
Change-Id: Iafb665f61f055819e64ca1dcb60398c656f593e4
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Without this change I saw an 18% increase in idle power consumption
on one device when trace support is compiled into the kernel. Now
I see the same increase only when tracing.
Change-Id: I21bb5ecf1b7d29ce3790ceeb5323409cc22d5a3b
Signed-off-by: Arve Hjønnevåg <arve@android.com>
If more than one ETM or PTM are present, configure all of them
and enable the formatter in the ETB. This allows tracing on dual
core systems (e.g. omap4).
Change-Id: I028657d5cf2bee1b23f193d4387b607953b35888
Signed-off-by: Arve Hjønnevåg <arve@android.com>
On some SOCs the read and write pointer are reset when the chip
resets, but the trace buffer content is preserved. If the status
bits indicate that the buffer is empty and we have never started
tracing, assume the buffer is full instead. This can be useful
if the system rebooted from a watchdog reset.
Change-Id: Iaf21c2c329c6059004ee1d38e3dfff66d7d28029
Signed-off-by: Arve Hjønnevåg <arve@android.com>
It is not safe to call etm_lock or etb_lock without holding the
mutex since another thread may also have unlocked the registers.
Also add some missing checks for valid etb_regs in the etm sysfs
entries.
Change-Id: I939f76a6ea7546a8fc0d4ddafa2fd2b6f38103bb
Signed-off-by: Arve Hjønnevåg <arve@android.com>
The old code enabled data tracing, but did not configure the
range. We now configure it to trace all data addresses by default,
and add a trace_data_range attribute to change the range or disable
data tracing.
Change-Id: I9d04e3e1ea0d0b4d4d5bcb93b1b042938ad738b2
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Trace the kernel text segment by default as before; allow tracing of other
ranges by writing a range to /sys/devices/etm/trace_range, or tracing of
everything by writing 0 0.
Change-Id: Ibb734ca820fedf79560b20536247f1e1700cdc71
Signed-off-by: Arve Hjønnevåg <arve@android.com>
If the write address was at the end of the buffer, toggling the trace
capture bit would set the RAM-full status instead of clearing it, and
if any of the stop bits in the formatter is set, toggling the trace
capture bit may not do anything.
Instead use the read position to find out if the data has already
been returned.
This also fixes the read function so it works when the trace buffer is
larger than the buffer passed in from user space. The old version
would reset the trace buffer pointers after every read, so the second
call to read would always return 0.
Change-Id: I75256abe2556adfd66fd5963e46f9e84ae4645e1
Signed-off-by: Arve Hjønnevåg <arve@android.com>
On some systems kernel code is considered secure, and this code
already limits tracing to the kernel text segment, which results
in no trace data.
Change-Id: I098a0753e874859446d098e1ee209f67fc13cd5d
Signed-off-by: Arve Hjønnevåg <arve@android.com>
If clk_get fails, assume the etb does not need a separate clock.
Change-Id: Ia0bf3f5391e94a60ea45876aa7afc8a88a7ec3bf
Signed-off-by: Arve Hjønnevåg <arve@android.com>
This patch implements CONFIG_DEBUG_RODATA, allowing
the kernel text section to be marked read-only in
order to catch bugs that write over the kernel. This
requires mapping the kernel code, plus up to 4MB, using
pages instead of sections, which can increase TLB
pressure.
The kernel is normally mapped using 1MB section entries
in the first level page table, and the first level page
table is copied into every mm. This prevents marking
the kernel text read-only, because the 1MB section
entries are too large granularity to separate the init
section, which is reused as read-write memory after
init, and the kernel text section. Also, the top level
page table for every process would need to be updated,
which is not possible to do safely and efficiently on SMP.
To solve both problems, allow alloc_init_pte to overwrite
an existing section entry with a fully-populated second
level page table. When CONFIG_DEBUG_RODATA is set, all
the section entries that overlap the kernel text section
will be replaced with page mappings. The kernel always
uses a pair of 2MB-aligned 1MB sections, so up to 2MB
of memory before and after the kernel may end up page
mapped.
When the top level page tables are copied into each
process the second level page tables are not copied,
leaving a single second level page table that will
affect all processes on all cpus. To mark a page
read-only, the second level page table is located using
the pointer in the first level page table for the
current process, and the supervisor RO bit is flipped
atomically. Once all pages have been updated, all TLBs
are flushed to ensure the changes are visible on all
cpus.
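A sketch of the per-page RO flip described above (helper names from the
ARM mm code; the exact sequence in the patch may differ):

    static void mark_kernel_text_page_ro(unsigned long addr)
    {
        pmd_t *pmd = pmd_off_k(addr);               /* 1st-level entry */
        pte_t *pte = pte_offset_kernel(pmd, addr);  /* shared 2nd level */

        /* Flip the supervisor RO bit atomically. */
        set_pte_ext(pte, pte_wrprotect(*pte), 0);
    }

with a flush_tlb_kernel_range() over the whole range once all pages have
been updated.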
If CONFIG_DEBUG_RODATA is not set, the kernel will be
mapped using the normal 1MB section entries.
Change-Id: I94fae337f882c2e123abaf8e1082c29cd5d483c6
Signed-off-by: Colin Cross <ccross@android.com>
Based on a rough patch by frank.rowand@am.sony.com
Since ARM doesn't have an NMI (fiq's are not always available),
send an IPI to all other CPUs (current cpu prints the stack directly)
to capture a backtrace.
Change-Id: I8b163c8cec05d521b433ae133795865e8a33d4e2
Signed-off-by: Dima Zavin <dima@android.com>
If the console_lock was held while the system was rebooted, the messages
in the temporary logbuffer would not have propagated to all the console
drivers.
This force-releases the console lock if it could not be acquired.
Change-Id: I193dcf7b968be17966833e50b8b8bc70d5d9fe89
Signed-off-by: Dima Zavin <dima@android.com>
This is extremely useful in diagnosing remote crashes, and is based heavily
on original work by <md@google.com>.
Signed-off-by: San Mehat <san@google.com>
Cc: Michael Davidson <md@google.com>
[ARM] process: Use uber-safe probe_kernel_address() to read mem when dumping.
This prevents the dump from taking pagefaults / external aborts.
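probe_kernel_address() goes through the exception-fixup path, so a bad
address returns an error instead of faulting; roughly:

    unsigned int val;

    if (probe_kernel_address((const unsigned int *)addr, val))
        printk(" ????????");    /* unreadable: print a placeholder */
    else
        printk(" %08x", val);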
Signed-off-by: San Mehat <san@google.com>
To deal with the I-cache discrepancy between Cortex-A15 and Cortex-A7,
let's assume aliasing I-cache in both cases.
Note: this might need to be refined, i.e. detect a big.LITTLE system
somehow by probing all CPUs, not only the boot one.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
The patch "ARM: kernel: fix MPIDR cpu_{suspend}/{resume} usage"
uses the BFC assembler instruction but this isn't available
on ARMv6 CPUs, which breaks compilation when building kernels which
support both SMP and ARMv6, e.g. omap2plus_defconifg.
Fix this by using a BIC instruction instead.
Signed-off-by: Jon Medhurst <tixy@linaro.org>
The current version of cpu_{suspend}/{resume} relies on the 8 LSBs of
the MPIDR register to index the context pointer saved and restored on
CPU shutdown. This approach breaks as soon as platforms with populated
MPIDR affinity levels 1 and 2 are deployed, since the MPIDR cannot be
considered a linear index anymore.
There are multiple solutions to this problem, each with pros and cons.
This patch changes cpu_{suspend}/{resume} so that the CPU logical id
is used to retrieve an index into the context pointers array.
Performance is impacted on both save and restore paths. On the save path
the CPU logical id has to be retrieved from thread_info; since caches
are on, the performance hit should be negligible. In the resume code
path the MMU is off and so are the caches. The patch adds a trivial for
loop that polls the cpu_logical_map array scanning the present MPIDRs and
retrieves the actual CPU logical index. Since everything runs out of
strongly ordered memory, the performance hit in the resume code path must
be measured and thought over; it worsens as the number of CPUs increases
since it is a linear search (but can be improved).
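In C terms the resume-path lookup is just the following (the actual code
is assembly; cpu_logical_map() is the kernel's mpidr-to-logical mapping):

    /* MMU and caches off: a linear scan of strongly ordered memory. */
    for (cpu = 0; cpu < nr_cpu_ids; cpu++)
        if (cpu_logical_map(cpu) == mpidr)
            break;    /* cpu is our logical index */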
On the up side, the logical index approach is by far the easiest solution
in terms of coding and makes dynamic changes to the cpu mapping trivial
at run-time.
Any change to the cpu_logical_map (i.e. the in-kernel switcher) at run time must be
cleaned from the caches since this data has to be retrieved with the MMU
off, when caches are not searched.
Tested on TC2 and fast models.
This patch adds the 'psci' kernel command line option. Secure firmware cannot
yet add a psci device node in the dt to indicate whether it supports psci or
not. So in the current dt, the psci device node is present by default. The
probe function will always indicate that the secure firmware implements psci
irrespective of the address space linux runs in, as the same device tree will
be used in either case. Hence a kernel cmdline option is required to choose
either the native or psci power api backend depending upon the address space
linux is running in.
Specifying 'psci=enable' in the cmdline will allow Linux running in the
non-secure address space to use the same dt but use the psci backend instead
of the native backend. It effectively overrides the presence of the native
implementation by ensuring registration of the psci backend. Linux running in
the secure address space will use the native backend for power management when
'psci=disable' is given in the cmdline (also the default value, i.e. the psci
backend is disabled by default) or when the psci node in the dt is absent.
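A sketch of how the option might be parsed (handler and flag names
illustrative):

    static bool psci_backend_enabled;    /* default: native backend */

    static int __init early_psci(char *str)
    {
        /* 'psci=enable' selects the psci backend. */
        psci_backend_enabled = !strcmp(str, "enable");
        return 0;
    }
    early_param("psci", early_psci);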
Signed-off-by: Achin Gupta <achin.gupta@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>