Commit Graph

635961 Commits

Author SHA1 Message Date
Morten Rasmussen
5cdeb5f0cf ANDROID: sched: Add per-cpu max capacity to sched_group_capacity
struct sched_group_capacity currently represents the compute capacity
sum of all cpus in the sched_group. Unless it is divided by the
group_weight to get the average capacity per cpu it hides differences in
cpu capacity for mixed capacity systems (e.g. high RT/IRQ utilization or
ARM big.LITTLE). But even the average may not be sufficient if the group
covers cpus of different capacities. Instead, by extending struct
sched_group_capacity to indicate max per-cpu capacity in the group a
suitable group for a given task utilization can easily be found such
that cpus with reduced capacity can be avoided for tasks with high
utilization (not implemented by this patch).

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:26 -08:00
Dietmar Eggemann
785367fc84 ANDROID: sched: Do eas idle balance regardless of the rq avg idle value
EAS relies on idle balance to migrate a misfit task towards a cpu with
higher capacity.

When such a cpu becomes idle, idle balance should happen even if the rq
avg idle is smaller than the sched migration cost (default 500us).

The rq avg idle is updated during the wakeup of a task in case the rq has
a non-null idle_stamp. This value stays unchanged and valid until the next
task wakes up on this cpu after an idle period.

So rq avg idle could be smaller than sched migration cost preventing the
idle balance from happening. In this case we would be at the mercy of
wakeup, periodic or nohz-idle load balancing to put another task on this
cpu.

To break this dependency towards rq avg idle make EAS idle balance
independent from this rq avg idle has to be larger than sched migration
cost.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:26 -08:00
Dietmar Eggemann
8f4855421f ANDROID: arm64: Enable max freq invariant scheduler load-tracking and capacity support
Maximum Frequency Invariance has to be part of Cpu Invariance because
Frequency Invariance deals only with differences in load-tracking
introduces by Dynamic Frequency Scaling and not with limiting the
possible range of cpu frequency.

By placing Maximum Frequency Invariance into Cpu Invariance,
load-tracking is scaled via arch_scale_cpu_capacity()
in __update_load_avg() and cpu capacity is scaled via
arch_scale_cpu_capacity() in update_cpu_capacity().

To be able to save the extra multiplication in the scheduler hotpath
(__update_load_avg()) we could:

  1 Inform cpufreq about base cpu capacity at boot and let it handle
    scale_cpu_capacity() as well.
  2 Use the cpufreq policy callback which would update a per-cpu current
    cpu_scale and this value would be return in scale_cpu_capacity().
  3 Use per-cpu current max_freq_scale and current cpu_scale with the
    current patch.

Including <linux/cpufreq.h> in topology.h like for the arm arch doesn't
work because of CONFIG_COMPAT=y (Kernel support for 32-bit EL0).
That's why cpufreq_scale_max_freq_capacity() has to be declared extern
in topology.h.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:25 -08:00
Dietmar Eggemann
568913e203 ANDROID: arm: Enable max freq invariant scheduler load-tracking and capacity support
Maximum Frequency Invariance has to be part of Cpu Invariance because
Frequency Invariance deals only with differences in load-tracking
introduces by Dynamic Frequency Scaling and not with limiting the
possible range of cpu frequency.

By placing Maximum Frequency Invariance into Cpu Invariance,
load-tracking is scaled via arch_scale_cpu_capacity()
in __update_load_avg() and cpu capacity is scaled via
arch_scale_cpu_capacity() in update_cpu_capacity().

To be able to save the extra multiplication in the scheduler hotpath
(__update_load_avg()) we could:

 1 Inform cpufreq about base cpu capacity at boot and let it handle
   scale_cpu_capacity() as well.
 2 Use the cpufreq policy callback which would update a per-cpu current
   cpu_scale and this value would be return in scale_cpu_capacity().
 3 Use per-cpu current max_freq_scale and current cpu_scale with the
   current patch.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:24 -08:00
Dietmar Eggemann
bbb138bdac ANDROID: sched: Update max cpu capacity in case of max frequency constraints
Wakeup balancing uses cpu capacity awareness and needs to know the
system-wide maximum cpu capacity.

Patch "sched: Store system-wide maximum cpu capacity in root domain"
finds the system-wide maximum cpu capacity during scheduler domain
hierarchy setup. This is sufficient as long as maximum frequency
invariance is not enabled.

If it is enabled, the system-wide maximum cpu capacity can change
between scheduler domain hierarchy setups due to frequency capping.

The cpu capacity is changed in update_cpu_capacity() which is called in
load balance on the lowest scheduler domain hierarchy level. To be able
to know if a change in cpu capacity for a certain cpu also has an effect
on the system-wide maximum cpu capacity it is normally necessary to
iterate over all cpus. This would be way too costly. That's why this
patch follows a different approach.

The unsigned long max_cpu_capacity value in struct root_domain is
replaced with a struct max_cpu_capacity, containing value (the
max_cpu_capacity) and cpu (the cpu index of the cpu providing the
maximum cpu_capacity).

Changes to the system-wide maximum cpu capacity and the cpu index are
made if:

 1 System-wide maximum cpu capacity < cpu capacity
 2 System-wide maximum cpu capacity > cpu capacity and cpu index == cpu

There are no changes to the system-wide maximum cpu capacity in all
other cases.

Atomic read and write access to the pair (max_cpu_capacity.val,
max_cpu_capacity.cpu) is enforced by max_cpu_capacity.lock.

The access to max_cpu_capacity.val in task_fits_max() is still performed
without taking the max_cpu_capacity.lock.

The code to set max cpu capacity in build_sched_domains() has been
removed because the whole functionality is now provided by
update_cpu_capacity() instead.

This approach can introduce errors temporarily, e.g. in case the cpu
currently providing the max cpu capacity has its cpu capacity lowered
due to frequency capping and calls update_cpu_capacity() before any cpu
which might provide the max cpu now.

There is also an outstanding question:

Should the cpu capacity of a cpu going idle be set to a very small
value?

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:24 -08:00
Dietmar Eggemann
4eb92977b2 ANDROID: cpufreq: Max freq invariant scheduler load-tracking and cpu capacity support
Implements cpufreq_scale_max_freq_capacity() to provide the scheduler
with a maximum frequency scaling correction factor for more accurate
load-tracking and cpu capacity handling by being able to deal with
frequency capping.

This scaling factor describes the influence of running a cpu with a
current maximum frequency lower than the absolute possible maximum
frequency on load tracking and cpu capacity.

The factor is:

	current_max_freq(cpu) << SCHED_CAPACITY_SHIFT / max_freq(cpu)

In fact, max_freq_scale should be a struct cpufreq_policy data member.
But this would require that the scheduler hot path (__update_load_avg())
would have to grab the cpufreq lock. This can be avoided by using per-cpu
data initialized to SCHED_CAPACITY_SCALE for max_freq_scale.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:23 -08:00
Morten Rasmussen
f69e2dc550 ANDROID: sched: Disable energy-unfriendly nohz kicks
With energy-aware scheduling enabled nohz_kick_needed() generates many
nohz idle-balance kicks which lead to nothing when multiple tasks get
packed on a single cpu to save energy. This causes unnecessary wake-ups
and hence wastes energy. Make these conditions depend on !energy_aware()
for now until the energy-aware nohz story gets sorted out.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:22 -08:00
Dietmar Eggemann
53065e82a1 ANDROID: sched: Consider a not over-utilized energy-aware system as balanced
In case the system operates below the tipping point indicator,
introduced in ("sched: Add over-utilization/tipping point
indicator"), bail out in find_busiest_group after the dst and src
group statistics have been checked.

There is simply no need to move usage around because all involved
cpus still have spare cycles available.

For an energy-aware system below its tipping point,  we rely on the
task placement of the wakeup path. This works well for short running
tasks.

The existence of long running tasks on one of the involved cpus lets
the system operate over its tipping point. To be able to move such
a task (whose load can't be used to average the load among the cpus)
from a src cpu with lower capacity than the dst_cpu, an additional
rule has to be implemented in need_active_balance.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:22 -08:00
Morten Rasmussen
4017a8e35c ANDROID: sched: Energy-aware wake-up task placement
Let available compute capacity and estimated energy impact select
wake-up target cpu when energy-aware scheduling is enabled and the
system in not over-utilized (above the tipping point).

energy_aware_wake_cpu() attempts to find group of cpus with sufficient
compute capacity to accommodate the task and find a cpu with enough spare
capacity to handle the task within that group. Preference is given to
cpus with enough spare capacity at the current OPP. Finally, the energy
impact of the new target and the previous task cpu is compared to select
the wake-up target cpu.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:21 -08:00
Dietmar Eggemann
1f884f4351 ANDROID: sched: Determine the current sched_group idle-state
To estimate the energy consumption of a sched_group in
sched_group_energy() it is necessary to know which idle-state the group
is in when it is idle. For now, it is assumed that this is the current
idle-state (though it might be wrong). Based on the individual cpu
idle-states group_idle_state() finds the group idle-state.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:20 -08:00
Morten Rasmussen
0691064135 ANDROID: sched, cpuidle: Track cpuidle state index in the scheduler
The idle-state of each cpu is currently pointed to by rq->idle_state but
there isn't any information in the struct cpuidle_state that can used to
look up the idle-state energy model data stored in struct
sched_group_energy. For this purpose is necessary to store the idle
state index as well. Ideally, the idle-state data should be unified.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:19 -08:00
Morten Rasmussen
a562dfc4f7 ANDROID: sched: Add over-utilization/tipping point indicator
Energy-aware scheduling is only meant to be active while the system is
_not_ over-utilized. That is, there are spare cycles available to shift
tasks around based on their actual utilization to get a more
energy-efficient task distribution without depriving any tasks. When
above the tipping point task placement is done the traditional way based
on load_avg, spreading the tasks across as many cpus as possible based
on priority scaled load to preserve smp_nice. Below the tipping point we
want to use util_avg instead. We need to define a criteria for when we
make the switch.

The util_avg for each cpu converges towards 100% (1024) regardless of
how many task additional task we may put on it. If we define
over-utilized as:

sum_{cpus}(rq.cfs.avg.util_avg) + margin > sum_{cpus}(rq.capacity)

some individual cpus may be over-utilized running multiple tasks even
when the above condition is false. That should be okay as long as we try
to spread the tasks out to avoid per-cpu over-utilization as much as
possible and if all tasks have the _same_ priority. If the latter isn't
true, we have to consider priority to preserve smp_nice.

For example, we could have n_cpus nice=-10 util_avg=55% tasks and
n_cpus/2 nice=0 util_avg=60% tasks. Balancing based on util_avg we are
likely to end up with nice=-10 tasks sharing cpus and nice=0 tasks
getting their own as we 1.5*n_cpus tasks in total and 55%+55% is less
over-utilized than 55%+60% for those cpus that have to be shared. The
system utilization is only 85% of the system capacity, but we are
breaking smp_nice.

To be sure not to break smp_nice, we have defined over-utilization
conservatively as when any cpu in the system is fully utilized at it's
highest frequency instead:

cpu_rq(any).cfs.avg.util_avg + margin > cpu_rq(any).capacity

IOW, as soon as one cpu is (nearly) 100% utilized, we switch to load_avg
to factor in priority to preserve smp_nice.

With this definition, we can skip periodic load-balance as no cpu has an
always-running task when the system is not over-utilized. All tasks will
be periodic and we can balance them at wake-up. This conservative
condition does however mean that some scenarios that could benefit from
energy-aware decisions even if one cpu is fully utilized would not get
those benefits.

For system where some cpus might have reduced capacity on some cpus
(RT-pressure and/or big.LITTLE), we want periodic load-balance checks as
soon a just a single cpu is fully utilized as it might one of those with
reduced capacity and in that case we want to migrate it.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:19 -08:00
Morten Rasmussen
931bd82353 ANDROID: sched: Estimate energy impact of scheduling decisions
Adds a generic energy-aware helper function, energy_diff(), that
calculates energy impact of adding, removing, and migrating utilization
in the system.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:18 -08:00
Morten Rasmussen
a455fa7b08 ANDROID: sched: Extend sched_group_energy to test load-balancing decisions
Extended sched_group_energy() to support energy prediction with usage
(tasks) added/removed from a specific cpu or migrated between a pair of
cpus. Useful for load-balancing decision making.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:17 -08:00
Morten Rasmussen
61bf6252e5 ANDROID: sched: Calculate energy consumption of sched_group
For energy-aware load-balancing decisions it is necessary to know the
energy consumption estimates of groups of cpus. This patch introduces a
basic function, sched_group_energy(), which estimates the energy
consumption of the cpus in the group and any resources shared by the
members of the group.

NOTE: The function has five levels of identation and breaks the 80
character limit. Refactoring is necessary.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:17 -08:00
Morten Rasmussen
30786a0ac3 ANDROID: sched: Highest energy aware balancing sched_domain level pointer
Add another member to the family of per-cpu sched_domain shortcut
pointers. This one, sd_ea, points to the highest level at which energy
model is provided. At this level and all levels below all sched_groups
have energy model data attached.

Partial energy model information is possible but restricted to providing
energy model data for lower level sched_domains (sd_ea and below) and
leaving load-balancing on levels above to non-energy-aware
load-balancing. For example, it is possible to apply energy-aware
scheduling within each socket on a multi-socket system and let normal
scheduling handle load-balancing between sockets.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:16 -08:00
Morten Rasmussen
9978d1393f ANDROID: sched: Relocated cpu_util() and change return type
Move cpu_util() to an earlier position in fair.c and change return
type to unsigned long as negative usage doesn't make much sense. All
other load and capacity related functions use unsigned long including
the caller of cpu_util().

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:15 -08:00
Morten Rasmussen
8b35ef456f ANDROID: sched: Compute cpu capacity available at current frequency
capacity_orig_of() returns the max available compute capacity of a cpu.
For scale-invariant utilization tracking and energy-aware scheduling
decisions it is useful to know the compute capacity available at the
current OPP of a cpu.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:14 -08:00
Robin Randhawa
1f998b3593 ANDROID: sched: Support for extracting EAS energy costs from DT
This patch implements support for extracting energy cost data from DT.
The data should conform to the DT bindings for energy cost data needed
by EAS (energy aware scheduling).

Signed-off-by: Robin Randhawa <robin.randhawa@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:14 -08:00
Robin Randhawa
4f569991b1 ANDROID: arm64, topology: Updates to use DT bindings for EAS costing data
With the bindings and the associated accessors to extract data from the
bindings in place, remove the static hard-coded data from topology.c and
use the accesors instead.

Signed-off-by: Robin Randhawa <robin.randhawa@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:13 -08:00
Juri Lelli
5e900d4884 ANDROID: arm64: Cpu invariant scheduler load-tracking and capacity support
Provides the scheduler with a cpu scaling correction factor for more
accurate load-tracking and cpu capacity handling.

The Energy Model (EM) (in fact the capacity value of the last element
of the capacity states vector of the core (MC) level sched_group_energy
structure) is used as the source for this cpu scaling factor.

The cpu capacity value depends on the micro-architecture and the
maximum frequency of the cpu.

The maximum frequency part should not be confused with the frequency
invariant scheduler load-tracking support which deals with frequency
related scaling due to DFVS functionality.

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:12 -08:00
Dietmar Eggemann
b4ca4bcfe1 ANDROID: arm: Cpu invariant scheduler load-tracking and capacity support
Provides the scheduler with a cpu scaling correction factor for more
accurate load-tracking and cpu capacity handling.

The Energy Model (EM) (in fact the capacity value of the last element
of the capacity states vector of the core (MC) level sched_group_energy
structure) is used instead of the arm arch specific cpu_efficiency and
dtb property 'clock-frequency' values as the source for this cpu
scaling factor.

The cpu capacity value depends on the micro-architecture and the
maximum frequency of the cpu.

The maximum frequency part should not be confused with the frequency
invariant scheduler load-tracking support which deals with frequency
related scaling due to DFVS functionality.

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:12 -08:00
Morten Rasmussen
858d718840 ANDROID: sched: Introduce SD_SHARE_CAP_STATES sched_domain flag
cpufreq is currently keeping it a secret which cpus are sharing
clock source. The scheduler needs to know about clock domains as well
to become more energy aware. The SD_SHARE_CAP_STATES domain flag
indicates whether cpus belonging to the sched_domain share capacity
states (P-states).

There is no connection with cpufreq (yet). The flag must be set by
the arch specific topology code.

cc: Russell King <linux@arm.linux.org.uk>
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:11 -08:00
Dietmar Eggemann
dd23c09a5a ANDROID: sched: Initialize energy data structures
The sched_group_energy (sge) pointer of the first sched_group (sg) in
the sched_domain (sd) is initialized to point to the appropriate (in
terms of sd level and cpu) sge data defined in the arch and so to the
correct part of the Energy Model (EM).

Energy-aware scheduling allows that a system has only EM data up to a
certain sd level (so called highest energy aware balancing sd level).
A check in init_sched_energy() enforces that all sd's below this sd
level contain EM data.

The 'int cpu' parameter of sched_domain_energy_f requires that
check_sched_energy_data() makes sure that all cpus spanned by a sg
are provisioned with the same EM data.

This patch has also been tested with feature FORCE_SD_OVERLAP enabled.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:10 -08:00
Morten Rasmussen
94c4cea624 ANDROID: sched: Make energy awareness a sched feature
This patch introduces the ENERGY_AWARE sched feature, which is
implemented using jump labels when SCHED_DEBUG is defined. It is
statically set false when SCHED_DEBUG is not defined. Hence this doesn't
allow energy awareness to be enabled without SCHED_DEBUG. This
sched_feature knob will be replaced later with a more appropriate
control knob when things have matured a bit.

ENERGY_AWARE is based on per-entity load-tracking hence FAIR_GROUP_SCHED
must be enable. This dependency isn't checked at compile time yet.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:10 -08:00
Morten Rasmussen
577dacab70 ANDROID: sched: Documentation for scheduler energy cost model
This documentation patch provides an overview of the experimental
scheduler energy costing model, associated data structures, and a
reference recipe on how platforms can be characterized to derive energy
models.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:09 -08:00
Morten Rasmussen
94beeae886 ANDROID: sched: Prevent unnecessary active balance of single task in sched group
Scenarios with the busiest group having just one task and the local
being idle on topologies with sched groups with different numbers of
cpus manage to dodge all load-balance bailout conditions resulting the
nr_balance_failed counter to be incremented. This eventually causes a
pointless active migration of the task. This patch prevents this by not
incrementing the counter when the busiest group only has one task.
ASYM_PACKING migrations and migrations due to reduced capacity should
still take place as these are explicitly captured by
need_active_balance().

A better solution would be to not attempt the load-balance in the first
place, but that requires significant changes to the order of bailout
conditions and statistics gathering.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:08 -08:00
Dietmar Eggemann
90f309fba3 ANDROID: sched: Enable idle balance to pull single task towards cpu with higher capacity
We do not want to miss out on the ability to pull a single remaining
task from a potential source cpu towards an idle destination cpu. Add an
extra criteria to need_active_balance() to kick off active load balance
if the source cpu is over-utilized and has lower capacity than the
destination cpu.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:07 -08:00
Morten Rasmussen
de9b636668 ANDROID: sched: Consider spare cpu capacity at task wake-up
find_idlest_group() selects the wake-up target group purely
based on group load which leads to suboptimal choices in low load
scenarios. An idle group with reduced capacity (due to RT tasks or
different cpu type) isn't necessarily a better target than a lightly
loaded group with higher capacity.

The patch adds spare capacity as an additional group selection
parameter. The target group is now selected based on the following
criteria:

1. Return the group with the cpu with most spare capacity and this
capacity is significant if such group exists. Significant spare capacity
is currently at least 20% to spare.

2. Return the group with the lowest load, unless it is the local group
in which case NULL is returned and the search is continued at the next
(lower) level.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:07 -08:00
Morten Rasmussen
b9ac009489 ANDROID: sched: Add cpu capacity awareness to wakeup balancing
Wakeup balancing is completely unaware of cpu capacity, cpu utilization
and task utilization. The task is preferably placed on a cpu which is
idle in the instant the wakeup happens. New tasks
(SD_BALANCE_{FORK,EXEC} are placed on an idle cpu in the idlest group if
such can be found, otherwise it goes on the least loaded one. Existing
tasks (SD_BALANCE_WAKE) are placed on the previous cpu or an idle cpu
sharing the same last level cache unless the wakee_flips heuristic in
wake_wide() decides to fallback to considering cpus outside SD_LLC.
Hence existing tasks are not guaranteed to get a chance to migrate to a
different group at wakeup in case the current one has reduced cpu
capacity (due RT/IRQ pressure or different uarch e.g. ARM big.LITTLE).
They may eventually get pulled by other cpus doing
periodic/idle/nohz_idle balance, but it may take quite a while before it
happens.

This patch adds capacity awareness to find_idlest_{group,queue} (used by
SD_BALANCE_{FORK,EXEC} and SD_BALANCE_WAKE under certain circumstances)
such that groups/cpus that can accommodate the waking task based on task
utilization are preferred. In addition, wakeup of existing tasks
(SD_BALANCE_WAKE) is sent through find_idlest_{group,queue} also if the
task doesn't fit the capacity of the previous cpu to allow it to escape
(override wake_affine) when necessary instead of relying on
periodic/idle/nohz_idle balance to eventually sort it out.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:06 -08:00
Morten Rasmussen
25cea247ff ANDROID: arm: Update arch_scale_cpu_capacity() to reflect change to define
arch_scale_cpu_capacity() is no longer a weak function but a #define
instead. Include the #define in topology.h.

cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:05 -08:00
Dietmar Eggemann
ea31a3e30d ANDROID: arm64: Enable frequency invariant scheduler load-tracking support
Defines arch_scale_freq_capacity() to use cpufreq implementation.

Including <linux/cpufreq.h> in topology.h like for the arm arch doesn't
work because of CONFIG_COMPAT=y (Kernel support for 32-bit EL0).
That's why cpufreq_scale_freq_capacity() has to be declared extern in
topology.h.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:05 -08:00
Dietmar Eggemann
42083ec1de ANDROID: arm: Enable frequency invariant scheduler load-tracking support
Defines arch_scale_freq_capacity() to use cpufreq implementation.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:04 -08:00
Dietmar Eggemann
c33be5d118 ANDROID: cpufreq: Frequency invariant scheduler load-tracking support
Implements cpufreq_scale_freq_capacity() to provide the scheduler with a
frequency scaling correction factor for more accurate load-tracking.

The factor is:

	current_freq(cpu) << SCHED_CAPACITY_SHIFT / max_freq(cpu)

In fact, freq_scale should be a struct cpufreq_policy data member. But
this would require that the scheduler hot path (__update_load_avg()) would
have to grab the cpufreq lock. This can be avoided by using per-cpu data
initialized to SCHED_CAPACITY_SCALE for freq_scale.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:03 -08:00
Sven Wegener
52e61ad45a ANDROID: [CPUFREQ] Don't export governors for default governor
We don't need to export the governors for use as the default governor,
because the default governor will be built-in anyway and we can access
the symbol directly.

This also fixes the following sparse warnings:

drivers/cpufreq/cpufreq_conservative.c:578:25: warning: symbol 'cpufreq_gov_conservative' was not declared. Should it be static?
drivers/cpufreq/cpufreq_ondemand.c:582:25: warning: symbol 'cpufreq_gov_ondemand' was not declared. Should it be static?
drivers/cpufreq/cpufreq_performance.c:39:25: warning: symbol 'cpufreq_gov_performance' was not declared. Should it be static?
drivers/cpufreq/cpufreq_powersave.c:38:25: warning: symbol 'cpufreq_gov_powersave' was not declared. Should it be static?
drivers/cpufreq/cpufreq_userspace.c:190:25: warning: symbol 'cpufreq_gov_userspace' was not declared. Should it be static?

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:03 -08:00
Sami Tolvanen
3fa5b5beb9 ANDROID: kernel/configs: recommended: CONFIG_ARM64_SW_TTBR0_PAN=y
Bug: 31432001
Change-Id: Ia72c3aa70a463d3a7f52b76e5082520aa328d29b
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:46:02 -08:00
Jin Qian
580a5379c7 ANDROID: kernel/configs: base: Enable QUOTA related configs
Bug: 33757366
Change-Id: Iec4f55c3ca4a16dbc8695054f481d9261c56d0f6
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:46:01 -08:00
Amit Pundir
9998c6978e ANDROID: kernel/configs: recommended: Enable MEMORY_STATE_TIME
Enable qcom's memory state tracking driver config
CONFIG_MEMORY_STATE_TIME in android-recommended.config

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:46:00 -08:00
Amit Pundir
95bb7b31fb FROMLIST: config: android-base: enable hardened usercopy and kernel ASLR
Enable CONFIG_HARDENED_USERCOPY and CONFIG_RANDOMIZE_BASE in Android
base config fragment.

Reviewed at https://android-review.googlesource.com/#/c/283659/
Reviewed at https://android-review.googlesource.com/#/c/278133/

Link: http://lkml.kernel.org/r/1481113148-29204-2-git-send-email-amit.pundir@linaro.org
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Cc: Rob Herring <rob.herring@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-01-31 10:46:00 -08:00
Daniel Micay
52e45973e7 FROMLIST: config: android-recommended: disable aio support
The aio interface adds substantial attack surface for a feature that's not
being exposed by Android at all.  It's unlikely that anyone is using the
kernel feature directly either.  This feature is rarely used even on
servers.  The glibc POSIX aio calls really use thread pools.  The lack of
widespread usage also means this is relatively poorly audited/tested.

The kernel's aio rarely provides performance benefits over using a thread
pool and is quite incomplete in terms of system call coverage along with
having edge cases where blocking can occur.  Part of the performance issue
is the fact that it only supports direct io, not buffered io.  The
existing API is considered fundamentally flawed and it's unlikely it will
be expanded, but rather replaced:

https://marc.info/?l=linux-aio&m=145255815216051&w=2

Since ext4 encryption means no direct io support, kernel aio isn't even
going to work properly on Android devices using file-based encryption.

Reviewed-at: https://android-review.googlesource.com/#/c/292158/

Link: http://lkml.kernel.org/r/1481113148-29204-1-git-send-email-amit.pundir@linaro.org
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Cc: Rob Herring <rob.herring@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-01-31 10:45:59 -08:00
Jeff Vander Stoep
abfff41cc6 ANDROID: kernel/configs: recommended: enable fstack-protector-strong
If compiler has stack protector support, set
CONFIG_CC_STACKPROTECTOR_STRONG.

Bug: 28967314
Change-Id: I588c2d544250e9e4b5082b43c237b8f85b7313ca
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:45:59 -08:00
Amit Pundir
611ef60db3 ANDROID: kernel/configs: base: enable UID_CPUTIME
Enabled UID_CPUTIME and dependent PROFILING config option.

UID_CPUTIME (/proc/uid_cputime) interfaces provide amount of time a
UID's processes spent executing in user-space and kernel-space. It is
used by batterystats service.

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:45:58 -08:00
Jeff Vander Stoep
270ef0c00a ANDROID: kernel/configs: base: restrict access to perf events
Add:
CONFIG_SECURITY_PERF_EVENTS_RESTRICT=y

to android-base.cfg

The kernel.perf_event_paranoid sysctl is set to 3 by default.
No unprivileged use of the perf_event_open syscall will be
permitted unless it is changed.

Bug: 29054680
Change-Id: Ie7512259150e146d8e382dc64d40e8faaa438917
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:45:57 -08:00
Amit Pundir
c88356f9db ANDROID: configs: base: enable configfs gadget functions
Now that Android is moving towards ConfigFS based USB gadgets,
lets enable USB_CONFIGFS and relevant Android gadget functions
instead of obsolete USB_G_ANDROID composite driver which doesn't
exist now.

Enabled following ConfigFS gadget functions:

F_FS            for ADB
F_MTP/PTP       for MTP/PTP
F_ACC           for Android USB Accessory
F_AUDIO_SRC     for USB Audio Source
F_MIDI          for MIDI, and
CONFIGFS_UEVENT for communicating USB state change notifications to userspace.

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:45:56 -08:00
Amit Pundir
c250cd6ebb ANDROID: configs: merge AOSP config fragments
Upstream now supports AOSP kernel config fragments:
commit 27eb6622ab ("config: add android config fragments").

This patch merge non-upstream AOSP config fragments from
android/configs/android-* of common kernel experimental/android-4.9
to kernel/configs/android-*.

Added initial set of AOSP config fragments and a README.android,
from AOSP Change-ID: I3a4883f3b04d2820e90ceb3c4d02390d6458d6ce
("android: configs: Initial commit of Android config fragments"),
to explain the purpose of Android config fragments and how to use
them to generate a device config compatible with Android.

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-31 10:45:56 -08:00
James Carr
ad3c02f8b3 ANDROID: Implement memory_state_time, used by qcom,cpubw
New driver memory_state_time tracks time spent in different DDR
frequency and bandwidth states.

Memory drivers such as qcom,cpubw can post updated state to the driver
after registering a callback. Processed by a workqueue

Bandwidth buckets are read in from device tree in the relevant qualcomm
section, can be defined in any quantity and spacing.

The data is exposed at /sys/kernel/memory_state_time, able to be read by
the Android framework.

Functionality is behind a config option CONFIG_MEMORY_STATE_TIME

Change-Id: I4fee165571cb975fb9eacbc9aada5e6d7dd748f0
Signed-off-by: James Carr <carrja@google.com>
2017-01-31 10:45:55 -08:00
Badhri Jagan Sridharan
c5b8dcdea8 ANDROID: dm: rebase for 4.9
Export the direct_access method of dm_linear target for
dm-android-verity target.

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I46556d882305e5194352946264cbc9c06e5038e4
2017-01-31 10:45:54 -08:00
Dmitry Shmidt
48baaa32ce ANDROID: usb: otg-wakelock: Remove wakelock.h dependencies
Change-Id: Ibff8d6e04cc475114bc0a91512d0ee3900768b06
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2017-01-31 10:45:54 -08:00
Dmitry Shmidt
1626ae3163 ANDROID: gpio_matrix: Remove wakelock.h dependencies
Change-Id: I228bcdebf28f5c67765002043d3f919718827316
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2017-01-31 10:45:53 -08:00
Dmitry Shmidt
b4a2694b0b ANDROID: fiq_debugger: Remove wakelock.h dependencies
Change-Id: I16a0dd4c4c6ee6440ce8a921bc0834d904b81f37
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2017-01-31 10:45:52 -08:00