linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-09 12:17:12 +09:00

Author	SHA1	Message	Date
Patrick Bellasi	4dd2760a37	sched/fair: ignore backup CPU when not valid The find_best_target can sometimes not return a valid backup CPU, either because it cannot find one or just because it returns prev_cpu as a backup. In these cases we should skip the energy_diff evaluation for the backup CPU. Change-Id: I3787dbdfe74122348dd7a7485b88c4679051bd32 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Patrick Bellasi	406bf1fd8f	sched/fair: trace energy_diff for non boosted tasks In systems where SchedTune is enabled, we do not report energy diff for non boosted tasks. Let's fix this by always generating an energy_diff event where, however: nrg.delta = 0, since we skip energy normalization; payoff = nrg.diff, since the payoff is defined just by the energy difference. Change-Id: I9a11ec19b6f56da04147f5ae5b47daf1dd180445 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Brendan Jackman	623b519093	UPSTREAM: sched/fair: Sync task util before slow-path wakeup We use task_util() in find_idlest_group() via capacity_spare_wake(). This task_util() updated in wake_cap(). However wake_cap() is not the only reason for ending up in find_idlest_group() - we could have been sent there by wake_wide(). So explicitly sync the task util with prev_cpu when we are about to head to find_idlest_group(). We could simply do this at the beginning of select_task_rq_fair() (i.e. irrespective of whether we're heading to select_idle_sibling() or find_idlest_group() & co), but I didn't want to slow down the select_idle_sibling() path more than necessary. Don't do this during fork balancing, we won't need the task_util and we'd just clobber the last_update_time, which is supposed to be 0. Change-Id: I935f4bfdfec3e8b914457aac3387ce264d5fd484 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andres Oportus <andresoportus@google.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincent Guittot <vincent.guittot@linaro.org> Link: http://lkml.kernel.org/r/20170808095519.10077-1-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked-from: commit `ea16f0ea6c` tip:sched/core) Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Brendan Jackman	eea0ea6d5a	UPSTREAM: sched/fair: Fix usage of find_idlest_group() when the local group is idlest find_idlest_group() returns NULL when the local group is idlest. The caller then continues the find_idlest_group() search at a lower level of the current CPU's sched_domain hierarchy. find_idlest_group_cpu() is not consulted and, crucially, @new_cpu is not updated. This means the search is pointless and we return @prev_cpu from select_task_rq_fair(). This is fixed by initialising @new_cpu to @cpu instead of @prev_cpu. Change-Id: Ie531f5bb29775952bdc4c148b6e974b2f5f32b7a Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-6-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked-from: commit `93f50f9024` tip:sched/core) Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Brendan Jackman	0f0b33d3cb	UPSTREAM: sched/fair: Fix usage of find_idlest_group() when no groups are allowed When 'p' is not allowed on any of the CPUs in the sched_domain, we currently return NULL from find_idlest_group(), and pointlessly continue the search on lower sched_domain levels (where 'p' is also not allowed) before returning prev_cpu regardless (as we have not updated new_cpu). Add an explicit check for this case, and add a comment to find_idlest_group(). Now when find_idlest_group() returns NULL, it always means that the local group is allowed and idlest. Change-Id: I5f2648d2f7fb0465677961ecb7473df3d06f0057 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-5-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked-from: commit `6fee85ccbc` tip:sched/core) Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Brendan Jackman	5fdbc79e14	BACKPORT: sched/fair: Fix find_idlest_group when local group is not allowed When the local group is not allowed we do not modify this_*_load from their initial value of 0. That means that the load checks at the end of find_idlest_group cause us to incorrectly return NULL. Fixing the initial values to ULONG_MAX means we will instead return the idlest remote group in that case. BACKPORT: Note 4.4 is missing commit `6b94780e45` "sched/core: Use load_avg for selecting idlest group", so we only have to fix this_load instead of this_runnable_load and this_avg_load. Change-Id: I41f775b0e7c8f5e675c2780f955bb130a563cba7 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171005114516.18617-4-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked-from: commit `0d10ab952e` tip:sched/core) (backport changes described above) Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Brendan Jackman	e38a80f456	UPSTREAM: sched/fair: Remove unnecessary comparison with -1 Since commit: `83a0a96a5f` ("sched/fair: Leverage the idle state info when choosing the "idlest" cpu") find_idlest_group_cpu() (formerly find_idlest_cpu) no longer returns -1, so we can simplify the checking of the return value in find_idlest_cpu(). Change-Id: I98f4b9f178cd93a30408e024e608d36771764c7b Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-3-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked-from commit `e90381eaec` in tip:sched/core) Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Brendan Jackman	15b467e490	BACKPORT: sched/fair: Move select_task_rq_fair slow-path into its own function In preparation for changes that would otherwise require adding a new level of indentation to the while(sd) loop, create a new function find_idlest_cpu() which contains this loop, and rename the existing find_idlest_cpu() to find_idlest_group_cpu(). Code inside the while(sd) loop is unchanged. @new_cpu is added as a variable in the new function, with the same initial value as the @new_cpu in select_task_rq_fair(). Change-Id: I9842308cab00dc9cd6c513fc38c609089a1aaaaf Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-2-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (reworked for eas/cas schedstats added in Android) (cherry-picked commit `18bd1b4bd5` from tip:sched/core) Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Vincent Guittot	1e000406ac	UPSTREAM: sched/core: Fix find_idlest_group() for fork During fork, the utilization of a task is init once the rq has been selected because the current utilization level of the rq is used to set the utilization of the fork task. As the task's utilization is still 0 at this step of the fork sequence, it doesn't make sense to look for some spare capacity that can fit the task's utilization. Furthermore, I can see perf regressions for the test: hackbench -P -g 1 because the least loaded policy is always bypassed and tasks are not spread during fork. With this patch and the fix below, we are back to same performances as for v4.8. The fix below is only a temporary one used for the test until a smarter solution is found because we can't simply remove the test which is useful for others benchmarks \| @@ -5708,13 +5708,6 @@ static int select_idle_cpu(struct task_struct p, struct sched_domain sd, int t \| \| avg_cost = this_sd->avg_scan_cost; \| \| - /* \| - * Due to large variance we need a large fuzz factor; hackbench in \| - * particularly is sensitive here. 
\| - */ \| - if ((avg_idle / 512) < avg_cost) \| - return -1; \| - \| time = local_clock(); \| \| for_each_cpu_wrap(cpu, sched_domain_span(sd), target, wrap) { Tested-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: kernellwp@gmail.com Cc: umgwanakikbuti@gmail.com Cc: yuyang.du@intel.comc Link: http://lkml.kernel.org/r/1481216215-24651-2-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `f519a3f1c6`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: I86cc2ad81af3467c0b2f82b995111f428248baa4	2017-11-20 21:15:59 +05:30
Peter Zijlstra	98ac5c4cd3	UPSTREAM: sched/core: Add missing update_rq_clock() call in set_user_nice() Address this rq-clock update bug: WARNING: CPU: 30 PID: 195 at ../kernel/sched/sched.h:797 set_next_entity() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: dump_stack() __warn() warn_slowpath_fmt() set_next_entity() ? _raw_spin_lock() set_curr_task_fair() set_user_nice.part.85() set_user_nice() create_worker() worker_thread() kthread() ret_from_fork() Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `2fb8d36787`) Change-Id: I53ba056e72820c7fadb3f022e4ee3b821c0de17d Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Peter Zijlstra	6351c82d2c	UPSTREAM: sched/core: Add missing update_rq_clock() call for task_hot() Add the update_rq_clock() call at the top of the callstack instead of at the bottom where we find it missing, this to aid later effort to minimize the number of update_rq_clock() calls. WARNING: CPU: 30 PID: 194 at ../kernel/sched/sched.h:797 assert_clock_updated() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: dump_stack() __warn() warn_slowpath_fmt() assert_clock_updated.isra.63.part.64() can_migrate_task() load_balance() pick_next_task_fair() __schedule() schedule() worker_thread() kthread() Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `3bed5e2166`) Change-Id: Ief5070dcce486535334dcb739ee16b989ea9df42 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Peter Zijlstra	d996ec007f	UPSTREAM: sched/core: Add missing update_rq_clock() in detach_task_cfs_rq() Instead of adding the update_rq_clock() all the way at the bottom of the callstack, add one at the top, this to aid later effort to minimize update_rq_clock() calls. WARNING: CPU: 0 PID: 1 at ../kernel/sched/sched.h:797 detach_task_cfs_rq() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: dump_stack() __warn() warn_slowpath_fmt() detach_task_cfs_rq() switched_from_fair() __sched_setscheduler() _sched_setscheduler() sched_set_stop_task() cpu_stop_create() __smpboot_create_thread.part.2() smpboot_register_percpu_thread_cpumask() cpu_stop_init() do_one_initcall() ? print_cpu_info() kernel_init_freeable() ? rest_init() kernel_init() ret_from_fork() Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `80f5c1b84b`) Change-Id: Ibffde077d18eabec4c2984158bd9d6d73bd0fb96 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Peter Zijlstra	fc63c5c792	UPSTREAM: sched/core: Add missing update_rq_clock() in post_init_entity_util_avg() Address this rq-clock update bug: WARNING: CPU: 0 PID: 0 at ../kernel/sched/sched.h:797 post_init_entity_util_avg() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: __warn() post_init_entity_util_avg() wake_up_new_task() _do_fork() kernel_thread() rest_init() start_kernel() Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `4126bad671`) Change-Id: Ibe9a73386896377f96483d195e433259218755a5 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Peter Zijlstra	630948e7ac	BACKPORT: sched/fair: Fix PELT integrity for new tasks Vincent and Yuyang found another few scenarios in which entity tracking goes wobbly. The scenarios are basically due to the fact that new tasks are not immediately attached and thereby differ from the normal situation -- a task is always attached to a cfs_rq load average (such that it includes its blocked contribution) and are explicitly detached/attached on migration to another cfs_rq. Scenario 1: switch to fair class p->sched_class = fair_class; if (queued) enqueue_task(p); ... enqueue_entity() enqueue_entity_load_avg() migrated = !sa->last_update_time (true) if (migrated) attach_entity_load_avg() check_class_changed() switched_from() (!fair) switched_to() (fair) switched_to_fair() attach_entity_load_avg() If @p is a new task that hasn't been fair before, it will have !last_update_time and, per the above, end up in attach_entity_load_avg() _twice_. Scenario 2: change between cgroups sched_move_group(p) if (queued) dequeue_task() task_move_group_fair() detach_task_cfs_rq() detach_entity_load_avg() set_task_rq() attach_task_cfs_rq() attach_entity_load_avg() if (queued) enqueue_task(); ... enqueue_entity() enqueue_entity_load_avg() migrated = !sa->last_update_time (true) if (migrated) attach_entity_load_avg() Similar as with scenario 1, if @p is a new task, it will have !load_update_time and we'll end up in attach_entity_load_avg() _twice_. Furthermore, notice how we do a detach_entity_load_avg() on something that wasn't attached to begin with. As stated above; the problem is that the new task isn't yet attached to the load tracking and thereby violates the invariant assumption. This patch remedies this by ensuring a new task is indeed properly attached to the load tracking on creation, through post_init_entity_util_avg(). Of course, this isn't entirely as straightforward as one might think, since the task is hashed before we call wake_up_new_task() and thus can be poked at. 
We avoid this by adding TASK_NEW and teaching cpu_cgroup_can_attach() to refuse such tasks. .:: BACKPORT Complicated by the fact that much of the lines changed by the original of this commit were then changed by: `df217913e7` sched/fair: Factorize attach/detach entity <Vincent Guittot> and then `d31b1a66cb` sched/fair: Factorize PELT update <Vincent Guittot> , which have both already been backported here. Reported-by: Yuyang Du <yuyang.du@intel.com> Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `7dc603c902`) Change-Id: Ibc59eb52310a62709d49a744bd5a24e8b97c4ae8 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Vincent Guittot	73f39e489e	BACKPORT: sched/cgroup: Fix cpu_cgroup_fork() handling A new fair task is detached and attached from/to task_group with: cgroup_post_fork() ss->fork(child) := cpu_cgroup_fork() sched_move_task() task_move_group_fair() Which is wrong, because at this point in fork() the task isn't fully initialized and it cannot 'move' to another group, because its not attached to any group as yet. In fact, cpu_cgroup_fork() needs a small part of sched_move_task() so we can just call this small part directly instead sched_move_task(). And the task doesn't really migrate because it is not yet attached so we need the following sequence: do_fork() sched_fork() __set_task_cpu() cgroup_post_fork() set_task_rq() # set task group and runqueue wake_up_new_task() select_task_rq() can select a new cpu __set_task_cpu post_init_entity_util_avg attach_task_cfs_rq() activate_task enqueue_task This patch makes that happen. BACKPORT: Difference from original commit: - Removed use of DEQUEUE_MOVE (which isn't defined in 4.4) in dequeue_task flags - Replaced "struct rq_flags rf" with "unsigned long flags". Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> [ Added TASK_SET_GROUP to set depth properly. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `ea86cb4b76`) Change-Id: I8126fd923288acf961218431ffd29d6bf6fd8d72 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Peter Zijlstra	b8cb77612b	UPSTREAM: sched/fair: Fix and optimize the fork() path The task_fork_fair() callback already calls __set_task_cpu() and takes rq->lock. If we move the sched_class::task_fork callback in sched_fork() under the existing p->pi_lock, right after its set_task_cpu() call, we can avoid doing two such calls and omit the IRQ disabling on the rq->lock. Change to __set_task_cpu() to skip the migration bits, this is a new task, not a migration. Similarly, make wake_up_new_task() use __set_task_cpu() for the same reason, the task hasn't actually migrated as it hasn't ever ran. This cures the problem of calling migrate_task_rq_fair(), which does remove_entity_from_load_avg() on tasks that have never been added to the load avg to begin with. This bug would result in transiently messed up load_avg values, averaged out after a few dozen milliseconds. This is probably the reason why this bug was not found for such a long time. Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `e210bffd39`) Change-Id: Icbddbaa6e8c1071859673d8685bc3f38955cf144 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Brendan Jackman	ba876f4fc6	UPSTREAM: sched/fair: Force balancing on nohz balance if local group has capacity The "goto force_balance" here is intended to mitigate the fact that avg_load calculations can result in bad placement decisions when priority is asymmetrical. The original commit that adds it: `fab476228b` ("sched: Force balancing on newidle balance if local group has capacity") explains: Under certain situations, such as a niced down task (i.e. nice = -15) in the presence of nr_cpus NICE0 tasks, the niced task lands on a sched group and kicks away other tasks because of its large weight. This leads to sub-optimal utilization of the machine. Even though the sched group has capacity, it does not pull tasks because sds.this_load >> sds.max_load, and f_b_g() returns NULL. A similar but inverted issue also affects ARM big.LITTLE (asymmetrical CPU capacity) systems - consider 8 always-running, same-priority tasks on a system with 4 "big" and 4 "little" CPUs. Suppose that 5 of them end up on the "big" CPUs (which will be represented by one sched_group in the DIE sched_domain) and 3 on the "little" (the other sched_group in DIE), leaving one CPU unused. Because the "big" group has a higher group_capacity its avg_load may not present an imbalance that would cause migrating a task to the idle "little". The force_balance case here solves the problem but currently only for CPU_NEWLY_IDLE balances, which in theory might never happen on the unused CPU. Including CPU_IDLE in the force_balance case means there's an upper bound on the time before we can attempt to solve the underutilization: after DIE's sd->balance_interval has passed the next nohz balance kick will help us out. 
Change-Id: I807ba5cba0ef1b8bbec02cbcd4755fd32af10135 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170807163900.25180-1-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked-from: commit `583ffd99d7` tip:sched/core) Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Chris Redpath	7beab85489	cpufreq/sched: Consider max cpu capacity when choosing frequencies When using schedfreq on cpus with max capacity significantly smaller than 1024, the tick update uses non-normalised capacities - this leads to selecting an incorrect OPP as we were scaling the frequency as if the max capacity achievable was 1024 rather than the max for that particular cpu or group. This could result in a cpu being stuck at the lowest OPP and unable to generate enough utilisation to climb out if the max capacity is significantly smaller than 1024. Instead, normalize the capacity to be in the range 0-1024 in the tick so that when we later select a frequency, we get the correct one. Also comments updated to be clearer about what is needed. Change-Id: Id84391c7ac015311002ada21813a353ee13bee60 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Chris Redpath	926b5b1536	BACKPORT: sched/fair: Make it possible to account fair load avg consistently While set_task_rq_fair() is introduced in mainline by commit `ad936d8658` ("sched/fair: Make it possible to account fair load avg consistently"), the function results to be introduced here by the backport of commit `09a43ace1f` ("sched/fair: Propagate load during synchronous attach/detach"). The problem (apart from the confusion introduced by the backport) is actually that set_task_rq_fair() is currently not called at all. Fix the problem by backporting again commit `ad936d8658` ("sched/fair: Make it possible to account fair load avg consistently"). Original change log: The current code accounts for the time a task was absent from the fair class (per ATTACH_AGE_LOAD). However it does not work correctly when a task got migrated or moved to another cgroup while outside of the fair class. This patch tries to address that by aging on migration. We locklessly read the 'last_update_time' stamp from both the old and new cfs_rq, ages the load upto the old time, and sets it to the new time. These timestamps should in general not be more than 1 tick apart from one another, so there is a definite bound on things. Signed-off-by: Byungchul Park <byungchul.park@lge.com> [ Changelog, a few edits and !SMP build fix ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445616981-29904-2-git-send-email-byungchul.park@lge.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked from `ad936d8658`) Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: I17294ab0ada3901d35895014715fd60952949358 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-11-20 21:15:59 +05:30
Martijn Coenen	6c70907428	ANDROID: binder: Add thread->process_todo flag. This flag determines whether the thread should currently process the work in the thread->todo worklist. The prime usecase for this is improving the performance of synchronous transactions: all synchronous transactions post a BR_TRANSACTION_COMPLETE to the calling thread, but there's no reason to return that command to userspace right away - userspace anyway needs to wait for the reply. Likewise, a synchronous transaction that contains a binder object can cause a BC_ACQUIRE/BC_INCREFS to be returned to userspace; since the caller must anyway hold a strong/weak ref for the duration of the call, postponing these commands until the reply comes in is not a problem. Note that this flag is not used to determine whether a thread can handle process work; a thread should never pick up process work when thread work is still pending. Before patch: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_sendVec_binderize/4 45959 ns 20288 ns 34351 BM_sendVec_binderize/8 45603 ns 20080 ns 34909 BM_sendVec_binderize/16 45528 ns 20113 ns 34863 BM_sendVec_binderize/32 45551 ns 20122 ns 34881 BM_sendVec_binderize/64 45701 ns 20183 ns 34864 BM_sendVec_binderize/128 45824 ns 20250 ns 34576 BM_sendVec_binderize/256 45695 ns 20171 ns 34759 BM_sendVec_binderize/512 45743 ns 20211 ns 34489 BM_sendVec_binderize/1024 46169 ns 20430 ns 34081 After patch: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_sendVec_binderize/4 42939 ns 17262 ns 40653 BM_sendVec_binderize/8 42823 ns 17243 ns 40671 BM_sendVec_binderize/16 42898 ns 17243 ns 40594 BM_sendVec_binderize/32 42838 ns 17267 ns 40527 BM_sendVec_binderize/64 42854 ns 17249 ns 40379 BM_sendVec_binderize/128 42881 ns 17288 ns 40427 
BM_sendVec_binderize/256 42917 ns 17297 ns 40429 BM_sendVec_binderize/512 43184 ns 17395 ns 40411 BM_sendVec_binderize/1024 43119 ns 17357 ns 40432 Signed-off-by: Martijn Coenen <maco@android.com> Change-Id: Ia70287066d62aba64e98ac44ff1214e37ca75693	2017-11-20 21:15:59 +05:30
Martijn Coenen	d61c22c2b5	ANDROID: binder: show high watermark of alloc->pages. Show the high watermark of the index into the alloc->pages array, to facilitate sizing the buffer on a per-process basis. Change-Id: I2b40cd16628e0ee45216c51dc9b3c5b0c862032e Signed-off-by: Martijn Coenen <maco@android.com>	2017-11-20 21:15:59 +05:30
Kevin Brodsky	6dce05a28e	UPSTREAM: arm64: compat: Remove leftover variable declaration (cherry picked from commit `82d24d114f`) Commit `a1d5ebaf8c` ("arm64: big-endian: don't treat code as data when copying sigret code") moved the 32-bit sigreturn trampoline code from the aarch32_sigret_code array to kuser32.S. The commit removed the array definition from signal32.c, but not its declaration in signal32.h. Remove the leftover declaration. Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 20045882 Bug: 63737556 Change-Id: Ic8a5f0e367f0ecd5c5ddd9e3885d0285f91cf89e	2017-11-20 21:15:59 +05:30
Chris Redpath	bfbb3a019c	ANDROID: sched/fair: Select correct capacity state for energy_diff The util returned from group_max_util is not capped at the max util present in the group, so it can be larger than the capacity stored in the array. Ensure that when this happens, we always use the last entry in the array to fetch energy from. Tested with synthetics on Juno board. Bug: 38159576 Change-Id: I89fb52fb7e68fa3e682e308acc232596672d03f7 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Dmitry Shmidt	b215e140de	Revert "UPSTREAM: efi/libstub/arm64: Set -fpie when building the EFI stub" It breaks boot with UEFI bootloader. This reverts commit `2f2860a504`.	2017-11-20 21:15:59 +05:30
Leo Yan	b097a6d8b1	cpufreq: schedutil: clamp util to CPU maximum capacity The code gets the CPU util by accumulating the utilization of the different scheduling classes, and clamps util to the CPU maximum capacity when the total is larger than the CPU capacity. So we get the correct util value when using the PELT signal, but with the WALT signal the clamp is missed. WALT doesn't accumulate the utilization of the different classes, but it does need to apply the boost margin to the WALT signal, so the CPU util value can be larger than the CPU capacity; so this patch always clamps util to the CPU maximum capacity. Change-Id: I05481ddbf20246bb9be15b6bd21b6ec039015ea8 Signed-off-by: Leo Yan <leo.yan@linaro.org>	2017-11-20 21:15:59 +05:30
Sherry Yang	32e281fba3	FROMLIST: android: binder: Change binder_shrinker to static (from https://patchwork.kernel.org/patch/9990321/) binder_shrinker struct is not used anywhere outside of binder_alloc.c and should be static. Bug: 63926541 Change-Id: I7a13d4ddbaaf3721cddfe1d860e34c7be80dd082 Acked-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Sherry Yang <sherryy@android.com>	2017-11-20 21:15:59 +05:30
Sherry Yang	bcda4a7869	FROMLIST: android: binder: Fix null ptr dereference in debug msg (from https://patchwork.kernel.org/patch/9990323/) Don't access next->data in kernel debug message when the next buffer is null. Bug: 36007193 Change-Id: Ib8240d7e9a7087a2256e88c0ae84b9df0f2d0224 Acked-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Sherry Yang <sherryy@android.com>	2017-11-20 21:15:59 +05:30
Chris Redpath	79afe6efa8	cpufreq/sched: Use cpu max freq rather than policy max When we convert capacity into frequency, we used policy->max to get the max freq of the cpu. Since this can be changed by userspace policy or thermal events, we are potentially asking for a lower frequency than the utilization demands. Change over to using cpuinfo.max which is the max freq supported by that cpu rather than the currently-chosen max. Frequency granted still honours the max policy. Tested by setting a userspace policy and observing the relevant vars in a trace. In this instance, we ask for around 1ghz instead of 620MHz. freq_new=1013512 unfixed_freq_new=624487 capacity=546 cpuinfo_max=1900800 policy_max=1171200 Change-Id: I8c5694db42243c6fb78bb9be9046b06ac81295e7 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Ke Wang	6f8b7ac222	trace: sched: Fix util_avg_walt in sched_load_avg_cpu trace cumulative_runnable_avg was introduced in commit `ee4cebd75e` ("sched: EAS/WALT: use cr_avg instead of prev_runnable_sum") in cpu_util() for task placement, where it replaces prev_runnable_sum. Fix util_avg_walt in the sched_load_avg_cpu trace, which still uses prev_runnable_sum for cpu_util(). Also fix a potential overflow, since cumulative_runnable_avg is a u64. Change-Id: I1220477bf2ff32a6e34a34b6280b15a8178203a8 Signed-off-by: Ke Wang <ke.wang@spreadtrum.com>	2017-11-20 21:15:59 +05:30
Dietmar Eggemann	dd1a6f1887	sched/fair: remove erroneous RCU_LOCKDEP_WARN from start_cpu() Fixes: https://bugs.linaro.org/show_bug.cgi?id=3075 Change-Id: I62d714fc4b9366a9b2535649aa92d1edc840cf94 Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-20 21:15:59 +05:30
Joonwoo Park	cd46b9e102	sched: EAS/WALT: finish accounting prior to task_tick In order to set rq->misfit_task in time, call update_task_ravg() prior to task_tick. This reduces upmigration delay by 1 scheduler window. Change-Id: I7cc80badd423f2e7684125fbfd853b0a3610f0e8 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>	2017-11-20 21:15:59 +05:30
Joonwoo Park	4ffc773858	cpufreq: sched: update capacity request upon tick always At present, sched_freq_tick() skips the capacity update when the current frequency is fmax. This can cause an incorrect frequency drop when a CPU-bound task goes to sleep, for example: 1) Task (A) enqueues onto CPU 0 and executes for a long time. 2) A new task (B) with low task demand enqueues onto CPU 1 and executes long enough to become a CPU-bound task. 3) Both CPU 0 and CPU 1 get a scheduler tick but skip sched_freq_tick() since the current frequency is fmax. 4) Task (A) sleeps and lowers CPU 0's capacity request. 5) Because task (B) voted for CPU capacity at step 2 with low demand and skipped further requests afterwards, the cluster frequency for both CPU 0 and CPU 1 drops to match the capacity voted by CPU 1 at step 2, even though task (B) on CPU 1 requires max capacity. Fix this by not skipping CPU capacity voting in the tick path. Change-Id: Ieb46af1ac96ffce7a5532c58c7f07bf1ada06b86 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>	2017-11-20 21:15:59 +05:30
Joonwoo Park	2067342dbc	sched/fair: prevent meaningless active migration At present need_active_balance() determines whether an active upmigration is needed by using capacity_of(). A CPU's capacity may be reduced by RT pressure, and therefore distinguishing capability differences with capacity_of() may lead to suboptimal active migrations to less capable CPUs. Use capacity_orig_of to distinguish differently capable CPUs in addition to capacity_of(), thus avoiding placing tasks on less capable CPUs due to instantaneous RT pressure. Change-Id: I3e1435246a8edc3ad618ef98a34866cfbd8c16a5 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> [markivx: Reworked the commit text a bit] Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>	2017-11-20 21:15:59 +05:30
Vikram Mulukutla	1765eacee8	sched: walt: Leverage existing helper APIs to apply invariance There's no need for a separate hierarchy of notifiers, APIs and variables in walt.c for the purpose of applying frequency and IPC invariance. Let's just use capacity_curr_of and get rid of a lot of the infrastructure relating to capacity, load_scale_factor etc. Change-Id: Ia220e2c896373fa535db05bff60f9aa33aefc978 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>	2017-11-20 21:15:59 +05:30
Greg Hackmann	5cf2d0c2e8	ANDROID: HACK: arm64: use -mno-implicit-float instead of -mgeneral-regs-only LLVM bug 30792 causes clang's AArch64 backend to crash compiling arch/arm64/crypto/aes-ce-cipher.c. Replacing -mgeneral-regs-only with -mno-implicit-float is the suggested workaround. Drop this patch once the clang bug has been fixed. Change-Id: I7c7bb9315a281970698120a6d2a9fcd126aad65e Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>	2017-11-20 21:15:59 +05:30
Greg Hackmann	d97498d6a5	ANDROID: Kbuild, LLVMLinux: allow overriding clang target triple Android has an unusual setup where the kernel needs to target [arch]-linux-gnu to avoid Android userspace-specific flags and optimizations, but AOSP doesn't ship a matching binutils. Add a new variable CLANG_TRIPLE which can override the "-target" triple used to compile the kernel, while using a different CROSS_COMPILE to pick the binutils/gcc installation. For Android you'd do something like: export CLANG_TRIPLE=aarch64-linux-gnu- export CROSS_COMPILE=aarch64-linux-android- If you don't need something like this, leave CLANG_TRIPLE unset and it will default to CROSS_COMPILE. Change-Id: Ib544c37f4ee4ed005437471b2984486a3e7c0da7 Signed-off-by: Greg Hackmann <ghackmann@google.com>	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	a072e7463e	CHROMIUM: arm64: Disable asm-operand-width warning for clang clang raises 'asm-operand-widths' warnings in inline assembly code when the size of an operand is < 64 bits and the operand width is unspecified. Most warnings are raised in macros, i.e. the datatype of the operand may vary. Most of these warnings are fixed in upstream, however we consider it isn't worth the effort/risk to backport all the necessary changes. On future CrOS kernels >= v4.13 the warning should be re-enabled. Change-Id: Ia331bc83d44b8c1499450aefb45c576cd29ebf55 Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Greg Hackmann <ghackmann@google.com>	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	8d6114ac42	CHROMIUM: kbuild: clang: Disable the 'duplicate-decl-specifier' warning clang generates plenty of these warnings in different parts of the code. They are mostly caused by container_of() and other macros which declare a "const <type> *" variable for their internal use which triggers a "duplicate 'const' specifier" warning if the <type> is already const qualified. Change-Id: I85ffb201003d3a04fe8b8ff94478344250d2db68 Wording-mostly-from: Michael Davidson <md@google.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Greg Hackmann <ghackmann@google.com>	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	607f86551c	UPSTREAM: x86/build: Use cc-option to validate stack alignment parameter With the following commit: `8f91869766` ("x86/build: Fix stack alignment for CLang") cc-option is only used to determine the name of the stack alignment option supported by the compiler, but not to verify that the actual parameter <option>=N is valid in combination with the other CFLAGS. This causes problems (as reported by the kbuild robot) with older GCC versions which only support stack alignment on a boundary of 16 bytes or higher. Also use (__)cc_option to add the stack alignment option to CFLAGS to make sure only valid options are added. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bernhard.Rosenkranzer@linaro.org Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michael Davidson <md@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hines <srhines@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dianders@chromium.org Fixes: `8f91869766` ("x86/build: Fix stack alignment for CLang") Link: http://lkml.kernel.org/r/20170817182047.176752-1-mka@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `9e8730b178`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: Ia2c932ede0096fe399131e958e0aaf4835039294	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	f2b2d0a7b9	UPSTREAM: x86/build: Fix stack alignment for CLang Commit: `d77698df39` ("x86/build: Specify stack alignment for clang") intended to use the same stack alignment for clang as with gcc. The two compilers use different options to configure the stack alignment (gcc: -mpreferred-stack-boundary=n, clang: -mstack-alignment=n). The above commit assumes that the clang option uses the same parameter type as gcc, i.e. that the alignment is specified as 2^n. However, clang interprets the value of this option literally to use an alignment of n; in consequence the stack remains misaligned. Change the values used with -mstack-alignment to be the actual alignment instead of a power of two. cc-option isn't used here with the typical pattern of KBUILD_CFLAGS += $(call cc-option ...). The reason is that older gcc versions don't support the -mpreferred-stack-boundary option; since cc-option doesn't verify whether the alternative option is valid, it would incorrectly select the clang option -mstack-alignment. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bernhard.Rosenkranzer@linaro.org Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michael Davidson <md@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hines <srhines@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dianders@chromium.org Link: http://lkml.kernel.org/r/20170817004740.170588-1-mka@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `8f91869766`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: I7991bfed754f5ac10ac8b383c20ec89d56b2afc0	2017-11-20 21:15:59 +05:30
Ard Biesheuvel	033851b8b7	UPSTREAM: efi/libstub/arm64: Set -fpie when building the EFI stub Clang may emit absolute symbol references when building in non-PIC mode, even when using the default 'small' code model, which is already mostly position independent to begin with, due to its use of adrp/add pairs that have a relative range of +/- 4 GB. The remedy is to pass the -fpie flag, which can be done safely now that the code has been updated to avoid GOT indirections (which may be emitted due to the compiler assuming that the PIC/PIE code may end up in a shared library that is subject to ELF symbol preemption) Passing -fpie when building code that needs to execute at an a priori unknown offset is arguably an improvement in any case, and given that the recent visibility changes allow the PIC build to pass with GCC as well, let's add -fpie for all arm64 builds rather than only for Clang. Tested-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170818194947.19347-5-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `91ee5b21ee`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: I0a011945239d39a2d1eb04c20bf1b9ceb7d2b91d	2017-11-20 21:15:59 +05:30
Ard Biesheuvel	5f42fb6a5d	BACKPORT: efi/libstub/arm64: Force 'hidden' visibility for section markers To prevent the compiler from emitting absolute references to the section markers when running in PIC mode, override the visibility to 'hidden' for all contents of asm/sections.h Tested-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170818194947.19347-4-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `0426a4e68f`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: Ia438c3f0aa6abdbd9057dfe1db732a25aa98ef40	2017-11-20 21:15:59 +05:30
David Rientjes	038f98fe7a	UPSTREAM: compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabled The motivation for commit `abb2ea7dfd` ("compiler, clang: suppress warning for unused static inline functions") was to suppress clang's warnings about unused static inline functions. For configs without CONFIG_OPTIMIZE_INLINING enabled, such as any non-x86 architecture, `inline' in the kernel implies that __attribute__((always_inline)) is used. Some code depends on that behavior, see https://lkml.org/lkml/2017/6/13/918: net/built-in.o: In function `__xchg_mb': arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99' arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99 The full fix would be to identify these breakages and annotate the functions with __always_inline instead of `inline'. But since we are late in the 4.12-rc cycle, simply carry forward the forced inlining behavior and work toward moving arm64, and other architectures, toward CONFIG_OPTIMIZE_INLINING behavior. (cherry picked from commit `9a04dbcfb3`) Change-Id: I13891c2f1e588d8c7febe5d2d57134abb31d6ecd Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706261552200.1075@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Hackmann <ghackmann@google.com>	2017-11-20 21:15:59 +05:30
Michael Davidson	d945b109a7	UPSTREAM: x86/boot: #undef memcpy() et al in string.c undef memcpy() and friends in boot/string.c so that the functions defined here will have the correct names, otherwise we end up trying to redefine __builtin_memcpy() etc. Surprisingly, GCC allows this (and, helpfully, discards the __builtin_ prefix from the function name when compiling it), but clang does not. Adding these #undef's appears to preserve what I assume was the original intent of the code. (cherry picked from commit `18d5e6c34a`) Change-Id: I616a6a8ece533166367d987597e8c405c96441a2 Signed-off-by: Michael Davidson <md@google.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bernhard.Rosenkranzer@linaro.org Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724235155.79255-1-mka@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Hackmann <ghackmann@google.com>	2017-11-20 21:15:59 +05:30
Ard Biesheuvel	6fe93571ba	UPSTREAM: crypto: arm64/sha - avoid non-standard inline asm tricks Replace the inline asm which exports struct offsets as ELF symbols with proper const variables exposing the same values. This works around an issue with Clang which does not interpret the "i" (or "I") constraints in the same way as GCC. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> (cherry picked from commit `f4857f4c2e`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: I1f882de15bd447d6fc41858dfc0cbfd3f6e2466c	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	7bb77b25c0	UPSTREAM: kbuild: clang: Disable 'address-of-packed-member' warning clang generates plenty of these warnings in different parts of the code, to an extent that the warnings are little more than noise. Disable the 'address-of-packed-member' warning. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> (cherry picked from commit `bfb38988c5`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: I35ecf1b35a908d41ee791a8a651e3cfb4edd081b	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	1386dde510	UPSTREAM: x86/build: Specify stack alignment for clang For gcc stack alignment is configured with -mpreferred-stack-boundary=N, clang has the option -mstack-alignment=N for that purpose. Use the same alignment as with gcc. If the alignment is not specified clang assumes an alignment of 16 bytes, as required by the standard ABI. However as mentioned in `d9b0cde91c` ("x86-64, gcc: Use -mpreferred-stack-boundary=3 if supported") the standard kernel entry on x86-64 leaves the stack on an 8-byte boundary, as a consequence clang will keep the stack misaligned. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> (cherry picked commit `d77698df39`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: I4283d10c6fe31cf194b35adc5371732b89eb3ae3	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	3a2fa912d4	UPSTREAM: x86/build: Use __cc-option for boot code compiler options cc-option is used to enable compiler options for the boot code if they are available. The macro uses KBUILD_CFLAGS and KBUILD_CPPFLAGS for the check, however these flags aren't used to build the boot code, in consequence cc-option can yield wrong results. For example -mpreferred-stack-boundary=2 is never set with a 64-bit compiler, since the setting is only valid for 16 and 32-bit binaries. This is also the case for 32-bit kernel builds, because the option -m32 is added to KBUILD_CFLAGS after the assignment of REALMODE_CFLAGS. Use __cc-option instead of cc-option for the boot mode options. The macro receives the compiler options as parameter instead of using KBUILD_C*FLAGS, for the boot code we pass REALMODE_CFLAGS. Also use separate statements for the __cc-option checks instead of performing them in the initial assignment of REALMODE_CFLAGS since the variable is an input of the macro. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> (cherry picked commit `032a2c4f65`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: I7756f875771edb00238eb770be912f713407681a	2017-11-20 21:15:59 +05:30
Matthias Kaehlcke	e6ef089b2d	BACKPORT: kbuild: Add __cc-option macro cc-option uses KBUILD_CFLAGS and KBUILD_CPPFLAGS when it determines whether an option is supported or not. This is fine for options used to build the kernel itself, however some components like the x86 boot code use a different set of flags. Add the new macro __cc-option which is a more generic version of cc-option with additional parameters. One parameter is the compiler with which the check should be performed, the other the compiler options to be used instead of KBUILD_C*FLAGS. Refactor cc-option and hostcc-option to use __cc-option and move hostcc-option to scripts/Kbuild.include. Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal Marek <mmarek@suse.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> (cherry picked from commit `9f3f1fd299`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Conflicts: scripts/Kbuild.include Change-Id: I4c8288b9c74bd6b9199307a0e04b78a27e28361d	2017-11-20 21:15:59 +05:30
Ville Syrjälä	8931a2e629	UPSTREAM: x86/hweight: Don't clobber %rdi The caller expects %rdi to remain intact, push+pop it make that happen. Fixes the following kind of explosions on my core2duo machine when trying to reboot or shut down: general protection fault: 0000 [#1] PREEMPT SMP Modules linked in: i915 i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm netconsole configfs binfmt_misc iTCO_wdt psmouse pcspkr snd_hda_codec_idt e100 coretemp hwmon snd_hda_codec_generic i2c_i801 mii i2c_smbus lpc_ich mfd_core snd_hda_intel uhci_hcd snd_hda_codec snd_hwdep snd_hda_core ehci_pci 8250 ehci_hcd snd_pcm 8250_base usbcore evdev serial_core usb_common parport_pc parport snd_timer snd soundcore CPU: 0 PID: 3070 Comm: reboot Not tainted 4.8.0-rc1-perf-dirty #69 Hardware name: /D946GZIS, BIOS TS94610J.86A.0087.2007.1107.1049 11/07/2007 task: ffff88012a0b4080 task.stack: ffff880123850000 RIP: 0010:[<ffffffff81003c92>] [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0 RSP: 0018:ffff880123853b60 EFLAGS: 00010087 RAX: 0000000000000001 RBX: ffff88012fc0a3c0 RCX: 000000000000001e RDX: 0000000000000000 RSI: 0000000040000000 RDI: ffff88012b014800 RBP: ffff880123853b88 R08: ffffffffffffffff R09: 0000000000000000 R10: ffffea0004a012c0 R11: ffffea0004acedc0 R12: ffffffff80000001 R13: ffff88012b0149c0 R14: ffff88012b014800 R15: 0000000000000018 FS: 00007f8b155cd700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8b155f5000 CR3: 000000012a2d7000 CR4: 00000000000006f0 Stack: ffff88012fc0a3c0 ffff88012b014800 0000000000000004 0000000000000001 ffff88012fc1b750 ffff880123853bb0 ffffffff81003d59 ffff88012b014800 ffff88012fc0a3c0 ffff88012b014800 ffff880123853bd8 ffffffff81003e13 Call Trace: [<ffffffff81003d59>] x86_pmu_stop+0x59/0xd0 [<ffffffff81003e13>] x86_pmu_del+0x43/0x140 [<ffffffff8111705d>] event_sched_out.isra.105+0xbd/0x260 [<ffffffff8111738d>] 
__perf_remove_from_context+0x2d/0xb0 [<ffffffff8111745d>] __perf_event_exit_context+0x4d/0x70 [<ffffffff810c8826>] generic_exec_single+0xb6/0x140 [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0 [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0 [<ffffffff810c898f>] smp_call_function_single+0xdf/0x140 [<ffffffff81113d27>] perf_event_exit_cpu_context+0x87/0xc0 [<ffffffff81113d73>] perf_reboot+0x13/0x40 [<ffffffff8107578a>] notifier_call_chain+0x4a/0x70 [<ffffffff81075ad7>] __blocking_notifier_call_chain+0x47/0x60 [<ffffffff81075b06>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff81076a1d>] kernel_restart_prepare+0x1d/0x40 [<ffffffff81076ae2>] kernel_restart+0x12/0x60 [<ffffffff81076d56>] SYSC_reboot+0xf6/0x1b0 [<ffffffff811a823c>] ? mntput_no_expire+0x2c/0x1b0 [<ffffffff811a83e4>] ? mntput+0x24/0x40 [<ffffffff811894fc>] ? __fput+0x16c/0x1e0 [<ffffffff811895ae>] ? ____fput+0xe/0x10 [<ffffffff81072fc3>] ? task_work_run+0x83/0xa0 [<ffffffff81001623>] ? exit_to_usermode_loop+0x53/0xc0 [<ffffffff8100105a>] ? trace_hardirqs_on_thunk+0x1a/0x1c [<ffffffff81076e6e>] SyS_reboot+0xe/0x10 [<ffffffff814c4ba5>] entry_SYSCALL_64_fastpath+0x18/0xa3 Code: 7c 4c 8d af c0 01 00 00 49 89 fe eb 10 48 09 c2 4c 89 e0 49 0f b1 55 00 4c 39 e0 74 35 4d 8b a6 c0 01 00 00 41 8b 8e 60 01 00 00 <0f> 33 8b 35 6e 02 8c 00 48 c1 e2 20 85 f6 7e d2 48 89 d3 89 cf RIP [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0 RSP <ffff880123853b60> ---[ end trace 7ec95181faf211be ]--- note: reboot[3070] exited with preempt_count 2 Cc: Borislav Petkov <bp@suse.de> Cc: H. 
Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Fixes: `f5967101e9` ("x86/hweight: Get rid of the special calling convention") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit `65ea11ec6a`) Signed-off-by: Greg Hackmann <ghackmann@google.com> Change-Id: Ib004aa044ba9fc73cfff97fe78c8607008ca3846	2017-11-20 21:15:59 +05:30
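
Commit 79afe6efa8 ("cpufreq/sched: Use cpu max freq rather than policy max") in the list above quotes trace values for the capacity-to-frequency conversion. The following is a minimal sketch of that arithmetic, not the kernel code: the linear scaling formula, the 1024 scale constant, and the function names are assumptions made for illustration. Under those assumptions the quoted inputs (capacity=546, cpuinfo_max=1900800, policy_max=1171200) give the same numbers as the trace's freq_new and unfixed_freq_new fields.

```python
# Hypothetical illustration only; names and formula are assumptions,
# not taken from the patch itself.
SCHED_CAPACITY_SCALE = 1024  # assumed fixed-point scale of capacity values


def capacity_to_freq(capacity: int, max_freq_khz: int) -> int:
    """Scale a capacity request linearly to a frequency in kHz (integer math)."""
    return capacity * max_freq_khz // SCHED_CAPACITY_SCALE


def requested_freq(capacity: int, cpuinfo_max_khz: int, policy_max_khz: int) -> int:
    """Convert against the hardware maximum, then honour the policy cap."""
    return min(capacity_to_freq(capacity, cpuinfo_max_khz), policy_max_khz)


if __name__ == "__main__":
    capacity, cpuinfo_max, policy_max = 546, 1900800, 1171200
    # Converting against policy->max under-requests (the old behaviour).
    print("against policy max :", capacity_to_freq(capacity, policy_max))
    # Converting against cpuinfo.max asks for what the utilization demands.
    print("against cpuinfo max:", capacity_to_freq(capacity, cpuinfo_max))
    # The granted frequency still cannot exceed the policy maximum.
    print("after policy cap   :", requested_freq(capacity, cpuinfo_max, policy_max))
```

The point of the sketch is the choice of reference frequency: scaling against a lowered policy maximum shrinks the request in proportion to the cap, whereas scaling against the hardware maximum and clamping afterwards only limits the request when it actually exceeds the cap.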


570535 Commits