linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-10 12:57:06 +09:00

Author	SHA1	Message	Date
Peter Zijlstra	c8035d9f08	sched: Fix incorrect sanity check commit `11854247e2` upstream We moved to migrate on wakeup, which means that sleeping tasks could still be present on offline cpus. Amend the check to only test running tasks. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:03 -07:00
Peter Zijlstra	20f95b412b	sched: Fix fork vs hotplug vs cpuset namespaces commit `fabf318e5e` upstream There are a number of issues: 1) TASK_WAKING vs cgroup_clone (cpusets) copy_process(): sched_fork() child->state = TASK_WAKING; /* waiting for wake_up_new_task() */ if (current->nsproxy != p->nsproxy) ns_cgroup_clone() cgroup_clone() mutex_lock(inode->i_mutex) mutex_lock(cgroup_mutex) cgroup_attach_task() ss->can_attach() ss->attach() [ -> cpuset_attach() ] cpuset_attach_task() set_cpus_allowed_ptr(); while (child->state == TASK_WAKING) cpu_relax(); will deadlock the system. 2) cgroup_clone (cpusets) vs copy_process So even if the above would work we still have: copy_process(): if (current->nsproxy != p->nsproxy) ns_cgroup_clone() cgroup_clone() mutex_lock(inode->i_mutex) mutex_lock(cgroup_mutex) cgroup_attach_task() ss->can_attach() ss->attach() [ -> cpuset_attach() ] cpuset_attach_task() set_cpus_allowed_ptr(); ... p->cpus_allowed = current->cpus_allowed over-writing the modified cpus_allowed. 3) fork() vs hotplug if we unplug the child's cpu after the sanity check when the child gets attached to the task_list but before wake_up_new_task() shit will meet with fan. Solve all these issues by moving fork cpu selection into wake_up_new_task(). Reported-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1264106190.4283.1314.camel@laptop> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:02 -07:00
Peter Zijlstra	05e067a973	sched: Fix hotplug hang commit `70f1120527` upstream The hot-unplug kstopmachine usage does a wakeup after deactivating the cpu, hence we cannot use cpu_active() here but must rely on the good olde online. Reported-by: Sachin Sant <sachinp@in.ibm.com> Reported-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Jens Axboe <jens.axboe@oracle.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> LKML-Reference: <1261326987.4314.24.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:02 -07:00
Peter Zijlstra	54a2695e9f	sched: Remove the cfs_rq dependency from set_task_cpu() commit `88ec22d3ed` upstream In order to remove the cfs_rq dependency from set_task_cpu() we need to ensure the task is cfs_rq invariant for all callsites. The simple approach is to substract cfs_rq->min_vruntime from se->vruntime on dequeue, and add cfs_rq->min_vruntime on enqueue. However, this has the downside of breaking FAIR_SLEEPERS since we loose the old vruntime as we only maintain the relative position. To solve this, we observe that we only migrate runnable tasks, we do this using deactivate_task(.sleep=0) and activate_task(.wakeup=0), therefore we can restrain the min_vruntime invariance to that state. The only other case is wakeup balancing, since we want to maintain the old vruntime we cannot make it relative on dequeue, but since we don't migrate inactive tasks, we can do so right before we activate it again. This is where we need the new pre-wakeup hook, we need to call this while still holding the old rq->lock. We could fold it into ->select_task_rq(), but since that has multiple callsites and would obfuscate the locking requirements, that seems like a fudge. This leaves the fork() case, simply make sure that ->task_fork() leaves the ->vruntime in a relative state. This covers all cases where set_task_cpu() gets called, and ensures it sees a relative vruntime. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170518.191697025@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:02 -07:00
Peter Zijlstra	668530636f	sched: Add pre and post wakeup hooks commit `efbbd05a59` upstream As will be apparent in the next patch, we need a pre wakeup hook for sched_fair task migration, hence rename the post wakeup hook and one pre wakeup. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170518.114746117@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:01 -07:00
Peter Zijlstra	ca6d844e58	sched: Fix select_task_rq() vs hotplug issues commit `5da9a0fb67` upstream Since select_task_rq() is now responsible for guaranteeing ->cpus_allowed and cpu_active_mask, we need to verify this. select_task_rq_rt() can blindly return smp_processor_id()/task_cpu() without checking the valid masks, select_task_rq_fair() can do the same in the rare case that all SD_flags are disabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.961475466@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:01 -07:00
Peter Zijlstra	31a9936c6f	sched: Fix sched_exec() balancing commit `3802290628` upstream sched: Fix sched_exec() balancing Since we access ->cpus_allowed without holding rq->lock we need a retry loop to validate the result, this comes for near free when we merge sched_migrate_task() into sched_exec() since that already does the needed check. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.884743662@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:01 -07:00
Peter Zijlstra	9f2243e581	sched: Fix broken assertion commit `077614ee1e` upstream There's a preemption race in the set_task_cpu() debug check in that when we get preempted after setting task->state we'd still be on the rq proper, but fail the test. Check for preempted tasks, since those are always on the RQ. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20091217121830.137155561@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:01 -07:00
Ingo Molnar	07ad010640	sched: Make warning less noisy commit `416eb39556` upstream Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.807938893@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:01 -07:00
Peter Zijlstra	854c984012	sched: Ensure set_task_cpu() is never called on blocked tasks commit `e2912009fb` upstream In order to clean up the set_task_cpu() rq dependencies we need to ensure it is never called on blocked tasks because such usage does not pair with consistent rq->lock usage. This puts the migration burden on ttwu(). Furthermore we need to close a race against changing ->cpus_allowed, since select_task_rq() runs with only preemption disabled. For sched_fork() this is safe because the child isn't in the tasklist yet, for wakeup we fix this by synchronizing set_cpus_allowed_ptr() against TASK_WAKING, which leaves sched_exec to be a problem This also closes a hole in (`6ad4c1888` sched: Fix balance vs hotplug race) where ->select_task_rq() doesn't validate the result against the sched_domain/root_domain. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.807938893@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:00 -07:00
Peter Zijlstra	9afee77b0c	sched: Use TASK_WAKING for fork wakups commit `06b83b5fbe` upstream For later convenience use TASK_WAKING for fresh tasks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.732561278@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:00 -07:00
Xiaotian Feng	4ec349b128	sched: Fix set_cpu_active() in cpu_down() commit `9ee349ad6d` upstream Sachin found cpu hotplug test failures on powerpc, which made the kernel hang on his POWER box. The problem is that we fail to re-activate a cpu when a hot-unplug fails. Fix this by moving the de-activation into _cpu_down after doing the initial checks. Remove the synchronize_sched() calls and rely on those implied by rebuilding the sched domains using the new mask. Reported-by: Sachin Sant <sachinp@in.ibm.com> Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Tested-by: Sachin Sant <sachinp@in.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.500272612@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:00 -07:00
Thomas Gleixner	62d4f15528	sched: Use rcu in sched_get_rr_param() commit `1a551ae715` upstream read_lock(&tasklist_lock) does not protect sys_sched_get_rr_param() against a concurrent update of the policy or scheduler parameters as do_sched_scheduler() does not take the tasklist_lock. The access to task->sched_class->get_rr_interval is protected by task_rq_lock(task). Use rcu_read_lock() to protect find_task_by_vpid() and prevent the task struct from going away. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.862897167@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:00 -07:00
Thomas Gleixner	b507f4cade	sched: Use rcu in sched_get/set_affinity() commit `23f5d14251` upstream tasklist_lock is held read locked to protect the find_task_by_vpid() call and to prevent the task going away. sched_setaffinity acquires a task struct ref and drops tasklist lock right away. The access to the cpus_allowed mask is protected by rq->lock. rcu_read_lock() provides the same protection here. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.789059966@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:18:00 -07:00
Thomas Gleixner	d99be10f68	sched: Use rcu in sys_sched_getscheduler/sys_sched_getparam() commit `5fe85be081` upstream read_lock(&tasklist_lock) does not protect sys_sched_getscheduler and sys_sched_getparam() against a concurrent update of the policy or scheduler parameters as do_sched_setscheduler() does not take the tasklist_lock. The accessed integers can be retrieved w/o locking and are snapshots anyway. Using rcu_read_lock() to protect find_task_by_vpid() and prevent the task struct from going away is not changing the above situation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.753790977@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:59 -07:00
Rafael J.Wysocki	ad6899b37b	sched: Make wakeup side and atomic variants of completion API irq safe commit `7539a3b3d1` upstream Alan Stern noticed that all the wakeup side (and atomic) variants of the completion APIs should be irq safe, but the newly introduced completion_done() and try_wait_for_completion() aren't. The use of the irq unsafe variants in IRQ contexts can cause crashes/hangs. Fix the problem by making them use spin_lock_irqsave() and spin_lock_irqrestore(). Reported-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: pm list <linux-pm@lists.linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Chinner <david@fromorbit.com> Cc: Lachlan McIlroy <lachlan@sgi.com> LKML-Reference: <200912130007.30541.rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:59 -07:00
Ingo Molnar	7a3ca629e8	sched: Remove forced2_migrations stats commit `b9889ed1dd` upstream This build warning: kernel/sched.c: In function 'set_task_cpu': kernel/sched.c:2070: warning: unused variable 'old_rq' Made me realize that the forced2_migrations stat looks pretty pointless (and a misnomer) - remove it. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:59 -07:00
Peter Zijlstra	f70cad0fdf	sched: Sanitize fork() handling commit `cd29fe6f26` upstream Currently we try to do task placement in wake_up_new_task() after we do the load-balance pass in sched_fork(). This yields complicated semantics in that we have to deal with tasks on different RQs and the set_task_cpu() calls in copy_process() and sched_fork() Rename ->task_new() to ->task_fork() and call it from sched_fork() before the balancing, this gives the policy a clear point to place the task. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:59 -07:00
Peter Zijlstra	7b8a5273c8	sched: Clean up ttwu() rq locking commit `ab19cb2331` upstream Since set_task_clock() doesn't rely on rq->clock anymore we can simplyfy the mess in ttwu(). Optimize things a bit by not fiddling with the IRQ state there. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:58 -07:00
Peter Zijlstra	55eedcb291	sched: Remove rq->clock coupling from set_task_cpu() commit `5afcdab706` upstream set_task_cpu() should be rq invariant and only touch task state, it currently fails to do so, which opens up a few races, since not all callers hold both rq->locks. Remove the relyance on rq->clock, as any site calling set_task_cpu() should also do a remote clock update, which should ensure the observed time between these two cpus is monotonic, as per kernel/sched_clock.c:sched_clock_remote(). Therefore we can simply remove the clock_offset bits and be happy. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:58 -07:00
Hiroshi Shimamoto	9c6abb9b98	sched: Remove unused cpu_nr_migrations() commit `9824a2b728` upstream cpu_nr_migrations() is not used, remove it. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4AF12A66.6020609@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:58 -07:00
Peter Zijlstra	a5f9d12672	sched: Consolidate select_task_rq() callers commit `970b13bacb` upstream sched: Consolidate select_task_rq() callers Small cleanup. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> [ v2: build fix ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:57 -07:00
Thomas Gleixner	34d1f22e76	sched: Protect sched_rr_get_param() access to task->sched_class commit `dba091b9e3` upstream sched_rr_get_param calls task->sched_class->get_rr_interval(task) without protection against a concurrent sched_setscheduler() call which modifies task->sched_class. Serialize the access with task_rq_lock(task) and hand the rq pointer into get_rr_interval() as it's needed at least in the sched_fair implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <alpine.LFD.2.00.0912090930120.3089@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:57 -07:00
Thomas Gleixner	7272e504cb	sched: Protect task->cpus_allowed access in sched_getaffinity() commit `3160568371` upstream sched_getaffinity() is not protected against a concurrent modification of the tasks affinity. Serialize the access with task_rq_lock(task). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091208202026.769251187@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:57 -07:00
Roland McGrath	6bd6aa7dda	x86-64, compat: Retruncate rax after ia32 syscall entry tracing commit `eefdca043e` upstream. In commit `d4d6715`, we reopened an old hole for a 64-bit ptracer touching a 32-bit tracee in system call entry. A %rax value set via ptrace at the entry tracing stop gets used whole as a 32-bit syscall number, while we only check the low 32 bits for validity. Fix it by truncating %rax back to 32 bits after syscall_trace_enter, in addition to testing the full 64 bits as has already been added. Reported-by: Ben Hawkes <hawkes@sota.gen.nz> Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:57 -07:00
H. Peter Anvin	337b16213b	compat: Make compat_alloc_user_space() incorporate the access_ok() commit `c41d68a513` upstream. compat_alloc_user_space() expects the caller to independently call access_ok() to verify the returned area. A missing call could introduce problems on some architectures. This patch incorporates the access_ok() check into compat_alloc_user_space() and also adds a sanity check on the length. The existing compat_alloc_user_space() implementations are renamed arch_compat_alloc_user_space() and are used as part of the implementation of the new global function. This patch assumes NULL will cause __get_user()/__put_user() to either fail or access userspace on all architectures. This should be followed by checking the return value of compat_access_user_space() for NULL in the callers, at which time the access_ok() in the callers can also be removed. Reported-by: Ben Hawkes <hawkes@sota.gen.nz> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: James Bottomley <jejb@parisc-linux.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:57 -07:00
H. Peter Anvin	7764ec137f	x86-64, compat: Test %rax for the syscall number, not %eax commit `36d001c70d` upstream. On 64 bits, we always, by necessity, jump through the system call table via %rax. For 32-bit system calls, in theory the system call number is stored in %eax, and the code was testing %eax for a valid system call number. At one point we loaded the stored value back from the stack to enforce zero-extension, but that was removed in checkin `d4d6715016`. An actual 32-bit process will not be able to introduce a non-zero-extended number, but it can happen via ptrace. Instead of re-introducing the zero-extension, test what we are actually going to use, i.e. %rax. This only adds a handful of REX prefixes to the code. Reported-by: Ben Hawkes <hawkes@sota.gen.nz> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:56 -07:00
Peter Zijlstra	f5f6687a47	x86, tsc: Fix a preemption leak in restore_sched_clock_state() commit `55496c896b` upstream. Doh, a real life genuine preemption leak.. This caused a suspend failure. Reported-bisected-and-tested-by-the-invaluable: Jeff Chua <jeff.chua.linux@gmail.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Nico Schottelius <nico-linux-20100709@schottelius.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Florian Pritz <flo@xssn.at> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Len Brown <lenb@kernel.org> sleep states LKML-Reference: <1284150773.402.122.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:56 -07:00
Johannes Berg	444ba62c39	wireless extensions: fix kernel heap content leak commit `42da2f948d` upstream. Wireless extensions have an unfortunate, undocumented requirement which requires drivers to always fill iwp->length when returning a successful status. When a driver doesn't do this, it leads to a kernel heap content leak when userspace offers a larger buffer than would have been necessary. Arguably, this is a driver bug, as it should, if it returns 0, fill iwp->length, even if it separately indicated that the buffer contents was not valid. However, we can also at least avoid the memory content leak if the driver doesn't do this by setting the iwp length to max_tokens, which then reflects how big the buffer is that the driver may fill, regardless of how big the userspace buffer is. To illustrate the point, this patch also fixes a corresponding cfg80211 bug (since this requirement isn't documented nor was ever pointed out by anyone during code review, I don't trust all drivers nor all cfg80211 handlers to implement it correctly). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:56 -07:00
John W. Linville	34cee821ec	ath5k: check return value of ieee80211_get_tx_rate commit `d8e1ba76d6` upstream. This avoids a NULL pointer dereference as reported here: https://bugzilla.redhat.com/show_bug.cgi?id=625889 When the WARN condition is hit in ieee80211_get_tx_rate, it will return NULL. So, we need to check the return value and avoid dereferencing it in that case. Signed-off-by: John W. Linville <linville@tuxdriver.com> Acked-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:55 -07:00
Christian Lamparter	8fdd6243c3	p54: fix tx feedback status flag check commit `f880c2050f` upstream. Michael reported that p54* never really entered power save mode, even tough it was enabled. It turned out that upon a power save mode change the firmware will set a special flag onto the last outgoing frame tx status (which in this case is almost always the designated PSM nullfunc frame). This flag confused the driver; It erroneously reported transmission failures to the stack, which then generated the next nullfunc. and so on... Reported-by: Michael Buesch <mb@bu3sch.de> Tested-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:55 -07:00
Frederic Weisbecker	a4bc94d437	perf: Initialize callchains roots's childen hits commit `5225c45899` upstream. Each histogram entry has a callchain root that stores the callchain samples. However we forgot to initialize the tracking of children hits of these roots, which then got random values on their creation. The root children hits is multiplied by the minimum percentage of hits provided by the user, and the result becomes the minimum hits expected from children branches. If the random value due to the uninitialization is big enough, then this minimum number of hits can be huge and eventually filter every children branches. The end result was invisible callchains. All we need to fix this is to initialize the children hits of the root. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:55 -07:00
KAMEZAWA Hiroyuki	05e0120a46	memory hotplug: fix next block calculation in is_removable commit `0dcc48c15f` upstream. next_active_pageblock() is for finding next _used_ freeblock. It skips several blocks when it finds there are a chunk of free pages lager than pageblock. But it has 2 bugs. 1. We have no lock. page_order(page) - pageblock_order can be minus. 2. pageblocks_stride += is wrong. it should skip page_order(p) of pages. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:55 -07:00
Dmitry Torokhov	e3327857e9	Input: i8042 - fix device removal on unload commit `af045b8666` upstream. We need to call platform_device_unregister(i8042_platform_device) before calling platform_driver_unregister() because i8042_remove() resets i8042_platform_device to NULL. This leaves the platform device instance behind and prevents driver reload. Fixes https://bugzilla.kernel.org/show_bug.cgi?id=16613 Reported-by: Seryodkin Victor <vvscore@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:54 -07:00
Jan Sembera	0ca6988df7	binfmt_misc: fix binfmt_misc priority commit `ee3aebdd8f` upstream. Commit `74641f584d` ("alpha: binfmt_aout fix") (May 2009) introduced a regression - binfmt_misc is now consulted after binfmt_elf, which will unfortunately break ia32el. ia32 ELF binaries on ia64 used to be matched using binfmt_misc and executed using wrapper. As 32bit binaries are now matched by binfmt_elf before bindmt_misc kicks in, the wrapper is ignored. The fix increases precedence of binfmt_misc to the original state. Signed-off-by: Jan Sembera <jsembera@suse.cz> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:54 -07:00
Jerome Marchand	0776127797	kernel/groups.c: fix integer overflow in groups_search commit `1c24de60e5` upstream. gid_t is a unsigned int. If group_info contains a gid greater than MAX_INT, groups_search() function may look on the wrong side of the search tree. This solves some unfair "permission denied" problems. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:54 -07:00
Gary King	a0c42c23e8	bounce: call flush_dcache_page() after bounce_copy_vec() commit `ac8456d6f9` upstream. I have been seeing problems on Tegra 2 (ARMv7 SMP) systems with HIGHMEM enabled on 2.6.35 (plus some patches targetted at 2.6.36 to perform cache maintenance lazily), and the root cause appears to be that the mm bouncing code is calling flush_dcache_page before it copies the bounce buffer into the bio. The bounced page needs to be flushed after data is copied into it, to ensure that architecture implementations can synchronize instruction and data caches if necessary. Signed-off-by: Gary King <gking@nvidia.com> Cc: Tejun Heo <tj@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:54 -07:00
Guennadi Liakhovetski	61e3fe7a31	mmc: fix the use of kunmap_atomic() in tmio_mmc.h commit `5600efb1bc` upstream. kunmap_atomic() takes the cookie, returned by the kmap_atomic() as its argument and not the page address, used as an argument to kmap_atomic(). This patch fixes the compile error: In file included from drivers/mmc/host/tmio_mmc.c:37: drivers/mmc/host/tmio_mmc.h: In function 'tmio_mmc_kunmap_atomic': drivers/mmc/host/tmio_mmc.h:192: error: negative width in bit-field '<anonymous>' Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: Eric Miao <eric.y.miao@gmail.com> Tested-by: Magnus Damm <damm@opensource.se> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:53 -07:00
Yusuke Goda	389e860f1b	tmio_mmc: don't clear unhandled pending interrupts commit `b78d6c5f51` upstream. Previously, it was possible for ack_mmc_irqs() to clear pending interrupt bits in the CTL_STATUS register, even though the interrupt handler had not been called. This was because of a race that existed when doing a read-modify-write sequence on CTL_STATUS. After the read step in this sequence, if an interrupt occurred (causing one of the bits in CTL_STATUS to be set) the write step would inadvertently clear it. Observed with the TMIO_STAT_RXRDY bit together with CMD53 on AR6002 and BCM4318 SDIO cards in polled mode. This patch eliminates this race by only writing to CTL_STATUS and clearing the interrupts that were passed as an argument to ack_mmc_irqs()." [matt@console-pimps.org: rewrote changelog] Signed-off-by: Yusuke Goda <yusuke.goda.sx@renesas.com> Acked-by: Magnus Damm <damm@opensource.se> Tested-by: Arnd Hannemann <arnd@arndnet.de> Acked-by: Ian Molton <ian@mnementh.co.uk> Cc: Matt Fleming <matt@console-pimps.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:53 -07:00
Peter Oberparleiter	4e318f43d2	gcov: fix null-pointer dereference for certain module types commit `85a0fdfd0f` upstream. The gcov-kernel infrastructure expects that each object file is loaded only once. This may not be true, e.g. when loading multiple kernel modules which are linked to the same object file. As a result, loading such kernel modules will result in incorrect gcov results while unloading will cause a null-pointer dereference. This patch fixes these problems by changing the gcov-kernel infrastructure so that multiple profiling data sets can be associated with one debugfs entry. It applies to 2.6.36-rc1. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Reported-by: Werner Spies <werner.spies@thalesgroup.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:53 -07:00
Dan Carpenter	f4ee150ede	irda: off by one commit `cf9b94f88b` upstream. This is an off by one. We would go past the end when we NUL terminate the "value" string at end of the function. The "value" buffer is allocated in irlan_client_parse_response() or irlan_provider_parse_command(). CC: stable@kernel.org Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-20 13:17:52 -07:00
Chris Wright	97c63b87be	tracing: t_start: reset FTRACE_ITER_HASH in case of seek/pread commit `df09162550` upstream. Be sure to avoid entering t_show() with FTRACE_ITER_HASH set without having properly started the iterator to iterate the hash. This case is degenerate and, as discovered by Robert Swiecki, can cause t_hash_show() to misuse a pointer. This causes a NULL ptr deref with possible security implications. Tracked as CVE-2010-3079. Cc: Robert Swiecki <swiecki@google.com> Cc: Eugene Teo <eugene@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:52 -07:00
Steven Rostedt	577fd36b74	tracing: Do not allow llseek to set_ftrace_filter commit `9c55cb12c1` upstream. Reading the file set_ftrace_filter does three things. 1) shows whether or not filters are set for the function tracer 2) shows what functions are set for the function tracer 3) shows what triggers are set on any functions 3 is independent from 1 and 2. The way this file currently works is that it is a state machine, and as you read it, it may change state. But this assumption breaks when you use lseek() on the file. The state machine gets out of sync and the t_show() may use the wrong pointer and cause a kernel oops. Luckily, this will only kill the app that does the lseek, but the app dies while holding a mutex. This prevents anyone else from using the set_ftrace_filter file (or any other function tracing file for that matter). A real fix for this is to rewrite the code, but that is too much for a -rc release or stable. This patch simply disables llseek on the set_ftrace_filter() file for now, and we can do the proper fix for the next major release. Reported-by: Robert Swiecki <swiecki@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Tavis Ormandy <taviso@google.com> Cc: Eugene Teo <eugene@redhat.com> Cc: vendor-sec@lst.de Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:52 -07:00
Li Zefan	d17babf50a	tracing: Fix a race in function profile commit `3aaba20f26` upstream. While we are reading trace_stat/functionX and someone just disabled function_profile at that time, we can trigger this: divide error: 0000 [#1] PREEMPT SMP ... EIP is at function_stat_show+0x90/0x230 ... This fix just takes the ftrace_profile_lock and checks if rec->counter is 0. If it's 0, we know the profile buffer has been reset. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4C723644.4040708@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:51 -07:00
Tejun Heo	45665602d9	libata: skip EH autopsy and recovery during suspend commit `e2f3d75fc0` upstream. For some mysterious reason, certain hardware reacts badly to usual EH actions while the system is going for suspend. As the devices won't be needed until the system is resumed, ask EH to skip usual autopsy and recovery and proceed directly to suspend. Signed-off-by: Tejun Heo <tj@kernel.org> Tested-by: Stephan Diestelhorst <stephan.diestelhorst@amd.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:51 -07:00
Alan Stern	3df1ba0d07	HID: fix suspend crash by moving initializations earlier commit `fde4e2f732` upstream. Although the usbhid driver allocates its usbhid structure in the probe routine, several critical fields in that structure don't get initialized until usbhid_start(). However if report descriptor parsing fails then usbhid_start() is never called. This leads to problems during system suspend -- the system will freeze. This patch (as1378) fixes the bug by moving the initialization statements up into usbhid_probe(). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Bruno Prémont <bonbons@linux-vserver.org> Tested-By: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:51 -07:00
Jiri Kosina	c7260ac8c7	HID: usbhid: initialize interface pointers early enough commit `57ab12e418` upstream. Move the initialization of USB interface pointers from _start() over to _probe() callback, which is where it belongs. This fixes case where interface is NULL when parsing of report descriptor fails. LKML-Reference: <20100213135720.603e5f64@neptune.home> Reported-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:51 -07:00
Robert Richter	a9dd3bc18f	oprofile, x86: fix init_sysfs() function stub commit `269f45c250` upstream. The use of the return value of init_sysfs() with commit `10f0412` oprofile, x86: fix init_sysfs error handling discovered the following build error for !CONFIG_PM: .../linux/arch/x86/oprofile/nmi_int.c: In function ‘op_nmi_init’: .../linux/arch/x86/oprofile/nmi_int.c:784: error: expected expression before ‘do’ make[2]: * [arch/x86/oprofile/nmi_int.o] Error 1 make[1]: * [arch/x86/oprofile] Error 2 This patch fixes this. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:51 -07:00
Robert Richter	b250d8d2d7	oprofile, x86: fix init_sysfs error handling commit `10f0412f57` upstream. On failure init_sysfs() might not properly free resources. The error code of the function is not checked. And, when reinitializing the exit function might be called twice. This patch fixes all this. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:50 -07:00
Robert Richter	f4db115e77	oprofile: fix crash when accessing freed task structs commit `750d857c68` upstream. This patch fixes a crash during shutdown reported below. The crash is caused by accessing already freed task structs. The fix changes the order for registering and unregistering notifier callbacks. All notifiers must be initialized before buffers start working. To stop buffer synchronization we cancel all workqueues, unregister the notifier callback and then flush all buffers. After all of this we finally can free all tasks listed. This should avoid accessing freed tasks. On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote: > So the initial observation is a spinlock bad magic followed by a crash > in the spinlock debug code: > > [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136 > [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03 > > Backtrace looks like: > > spin_bug+0x74/0xd4 > ._raw_spin_lock+0x48/0x184 > ._spin_lock+0x10/0x24 > .get_task_mm+0x28/0x8c > .sync_buffer+0x1b4/0x598 > .wq_sync_buffer+0xa0/0xdc > .worker_thread+0x1d8/0x2a8 > .kthread+0xa8/0xb4 > .kernel_thread+0x54/0x70 > > So we are accessing a freed task struct in the work queue when > processing the samples. Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-09-20 13:17:50 -07:00

1 2 3 4 5 ...

170968 Commits