From ecce1cf84a2e719fc88267123a6ce689594a530e Mon Sep 17 00:00:00 2001 From: Valentin Schneider Date: Wed, 11 Dec 2019 11:38:51 +0000 Subject: [PATCH] BACKPORT: sched/fair: Make EAS wakeup placement consider uclamp restrictions task_fits_capacity() has just been made uclamp-aware, and find_energy_efficient_cpu() needs to go through the same treatment. Things are somewhat different here however - using the task max clamp isn't sufficient. Consider the following setup: The target runqueue, rq: rq.cpu_capacity_orig = 512 rq.cfs.avg.util_avg = 200 rq.uclamp.max = 768 // the max p.uclamp.max of all enqueued p's is 768 The waking task, p (not yet enqueued on rq): p.util_est = 600 p.uclamp.max = 100 Now, consider the following code which doesn't use the rq clamps: util = uclamp_task_util(p); // Does the task fit in the spare CPU capacity? cpu = cpu_of(rq); fits_capacity(util, cpu_capacity(cpu) - cpu_util(cpu)) This would lead to: util = 100; fits_capacity(100, 512 - 200) fits_capacity() would return true. However, enqueuing p on that CPU *will* cause it to become overutilized since rq clamp values are max-aggregated, so we'd remain with rq.uclamp.max = 768 which comes from the other tasks already enqueued on rq. Thus, we could select a high enough frequency to reach beyond 0.8 * 512 utilization (== overutilized) after enqueuing p on rq. What find_energy_efficient_cpu() needs here is uclamp_rq_util_with() which lets us peek at the future utilization landscape, including rq-wide uclamp values. Make find_energy_efficient_cpu() use uclamp_rq_util_with() for its fits_capacity() check. This is in line with what compute_energy() ends up using for estimating utilization. [QP: moved changes to select_cpu_candidates(), which is the equivalent to the mainline path, and fix missing dependency on fits_capacity() by using the open coded version] Bug: 120440300 Tested-By: Dietmar Eggemann Suggested-by: Quentin Perret Signed-off-by: Valentin Schneider Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Vincent Guittot Reviewed-by: Dietmar Eggemann Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191211113851.24241-6-valentin.schneider@arm.com Signed-off-by: Ingo Molnar (cherry picked from commit 1d42509e475cdc8542aa5b3e03a7e845244f4f57) Signed-off-by: Quentin Perret Change-Id: Ibe1643cd5e6c97daceceae9733344e54bf4a4857 --- kernel/sched/fair.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 83ebba03f7f7..83e99e574b06 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7003,9 +7003,18 @@ static void select_cpu_candidates(struct sched_domain *sd, cpumask_t *cpus, if (!cpumask_test_cpu(cpu, &p->cpus_allowed)) continue; - /* Skip CPUs that will be overutilized. */ util = cpu_util_next(cpu, p, cpu); cpu_cap = capacity_of(cpu); + spare_cap = cpu_cap - util; + + /* + * Skip CPUs that cannot satisfy the capacity request. + * IOW, placing the task there would make the CPU + * overutilized. Take uclamp into account to see how + * much capacity we can get out of the CPU; this is + * aligned with schedutil_cpu_util(). + */ + util = uclamp_rq_util_with(cpu_rq(cpu), util, p); if (cpu_cap * 1024 < util * capacity_margin) continue; @@ -7013,7 +7022,6 @@ static void select_cpu_candidates(struct sched_domain *sd, cpumask_t *cpus, * Find the CPU with the maximum spare capacity in * the performance domain */ - spare_cap = cpu_cap - util; if (spare_cap > max_spare_cap) { max_spare_cap = spare_cap; max_spare_cap_cpu = cpu;