60704 Commits

Author SHA1 Message Date
Catalin Marinas
92fd7a8988 ARM: 7150/1: Allow kernel unaligned accesses on ARMv6+ processors
commit 8428e84d42 upstream.

Recent gcc versions generate unaligned accesses by default on ARMv6 and
later processors. This patch ensures that the SCTLR.A bit is always
cleared on such processors to avoid kernel traping before
alignment_init() is called.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: John Linn <John.Linn@xilinx.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-26 09:08:33 -08:00
Cornelia Huck
57a3b9c22b KVM: s390: Fix RUNNING flag misinterpretation
commit 9e6dabeffd upstream.

CPUSTAT_RUNNING was implemented signifying that a vcpu is not stopped.
This is not, however, what the architecture says: RUNNING should be
set when the host is acting on the behalf of the guest operating
system.

CPUSTAT_RUNNING has been changed to be set in kvm_arch_vcpu_load()
and to be unset in kvm_arch_vcpu_put().

For signifying stopped state of a vcpu, a host-controlled bit has
been used and is set/unset basically on the reverse as the old
CPUSTAT_RUNNING bit (including pushing it down into stop handling
proper in handle_stop()).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-26 09:08:33 -08:00
Anton Blanchard
3549966c02 powerpc: Copy down exception vectors after feature fixups
commit d715e433b7 upstream.

kdump fails because we try to execute an HV only instruction. Feature
fixups are being applied after we copy the exception vectors down to 0
so they miss out on any updates.

We have always had this issue but it only became critical in v3.0
when we added CFAR support (breaks POWER5) and v3.1 when we added
POWERNV (breaks everyone).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21 14:35:28 -08:00
Michael Neuling
fed5954739 powerpc: Add hvcall.h include to book3s_hv.c
commit de1d9248ea upstream.

If you build with KVM and UP it fails with the following due to a
missing include.

/arch/powerpc/kvm/book3s_hv.c: In function 'do_h_register_vpa':
arch/powerpc/kvm/book3s_hv.c:156:10: error: 'H_PARAMETER' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c:156:10: note: each undeclared identifier is reported only once for each function it appears in
arch/powerpc/kvm/book3s_hv.c:192:12: error: 'H_RESOURCE' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c:222:9: error: 'H_SUCCESS' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c: In function 'kvmppc_pseries_do_hcall':
arch/powerpc/kvm/book3s_hv.c:228:30: error: 'H_SUCCESS' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c:232:7: error: 'H_CEDE' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c:234:7: error: 'H_PROD' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c:238:10: error: 'H_PARAMETER' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c:250:7: error: 'H_CONFER' undeclared (first use in this function)
arch/powerpc/kvm/book3s_hv.c:252:7: error: 'H_REGISTER_VPA' undeclared (first use in this function)
make[2]: *** [arch/powerpc/kvm/book3s_hv.o] Error 1

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21 14:35:27 -08:00
Geoff Levand
9c24bb2008 powerpc/ps3: Fix lost SMP IPIs
commit 72f3bea075 upstream.

Fixes the PS3 bootup hang introduced in 3.0-rc1 by:

  commit 317f394160
  sched: Move the second half of ttwu() to the remote cpu

Move the PS3's LV1 EOI call lv1_end_of_interrupt_ext() from ps3_chip_eoi()
to ps3_get_irq() for IPI messages.

If lv1_send_event_locally() is called between a previous call to
lv1_send_event_locally() and the coresponding call to
lv1_end_of_interrupt_ext() the second event will not be delivered to the
target cpu.

The PS3's SMP IPIs are implemented using lv1_send_event_locally(), so if two
IPI messages of the same type are sent to the same target in a relatively
short period of time the second IPI event can become lost when
lv1_end_of_interrupt_ext() is called from ps3_chip_eoi().

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21 14:35:26 -08:00
Zhenzhong Duan
14fa9e2f09 xen:pvhvm: enable PVHVM VCPU placement when using more than 32 CPUs.
commit 90d4f5534d upstream.

PVHVM running with more than 32 vcpus and pv_irq/pv_time enabled
need VCPU placement to work, or else it will softlockup.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21 14:35:25 -08:00
Nobuhiro Iwamatsu
aa85ae0189 sh: Fix cached/uncaced address calculation in 29bit mode
commit dfd3b596fb upstream.

In the case of 29bit mode, CAC/UNCAC_ADDR does not return a right address.
This revises this problem by using P1SEGADDR and P2SEGADDR in 29bit mode.

Reported-by: Yutaro Ebihara <ebiharaml@si-linux.co.jp>
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21 14:35:19 -08:00
Jochen Friedrich
cf80307d47 ARM: at91: Fix USBA gadget registration
commit dd0b382549 upstream.

Since 193ab2a607, various AT91 boards don't
register USBA adapters anymore due to depending on a now non-existing
symbol. Fix the symbol name.

Signed-off-by: Jochen Friedrich <jochen@scram.de>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21 14:35:18 -08:00
Anton Blanchard
3493ce844f powerpc: Fix deadlock in icswx code
commit 8bdafa39a4 upstream.

The icswx code introduced an A-B B-A deadlock:

     CPU0                    CPU1
     ----                    ----
lock(&anon_vma->mutex);
                             lock(&mm->mmap_sem);
                             lock(&anon_vma->mutex);
lock(&mm->mmap_sem);

Instead of using the mmap_sem to keep mm_users constant, take the
page table spinlock.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:42 -08:00
Thadeu Lima de Souza Cascardo
85e71ae343 powerpc/eeh: Fix /proc/ppc64/eeh creation
commit 8feaa43494 upstream.

Since commit 188917e183, /proc/ppc64 is a
symlink to /proc/powerpc/. That means that creating /proc/ppc64/eeh will
end up with a unaccessible file, that is not listed under /proc/powerpc/
and, then, not listed under /proc/ppc64/.

Creating /proc/powerpc/eeh fixes that problem and maintain the
compatibility intended with the ppc64 symlink.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:42 -08:00
Anton Blanchard
488d4d2d0a powerpc/pseries: Avoid spurious error during hotplug CPU add
commit 9c740025c5 upstream.

During hotplug CPU add we get the following error:

Unexpected Error (0) returned from configure-connector

ibm,configure-connector returns 0 for configuration complete, so
catch this and avoid the error.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:42 -08:00
Anton Blanchard
7bd46f2287 powerpc: Fix oops when echoing bad values to /sys/devices/system/memory/probe
commit a11940978b upstream.

If we echo an address the hypervisor doesn't like to
/sys/devices/system/memory/probe we oops the box:

# echo 0x10000000000 > /sys/devices/system/memory/probe

kernel BUG at arch/powerpc/mm/hash_utils_64.c:541!

The backtrace is:

create_section_mapping
arch_add_memory
add_memory
memory_probe_store
sysdev_class_store
sysfs_write_file
vfs_write
SyS_write

In create_section_mapping we BUG if htab_bolt_mapping returned
an error. A better approach is to return an error which will
propagate back to userspace.

Rerunning the test with this patch applied:

# echo 0x10000000000 > /sys/devices/system/memory/probe
-bash: echo: write error: Invalid argument

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:40 -08:00
Anton Blanchard
9d99c1d2bd powerpc/numa: Remove double of_node_put in hot_add_node_scn_to_nid
commit 6083184269 upstream.

During memory hotplug testing, I got the following warning:

ERROR: Bad of_node_put() on /memory@0

of_node_release
kref_put
of_node_put
of_find_node_by_type
hot_add_node_scn_to_nid
hot_add_scn_to_nid
memory_add_physaddr_to_nid
...

of_find_node_by_type() loop does the of_node_put for us so we only
need the handle the case where we terminate the loop early.

As suggested by Stephen Rothwell we can do the of_node_put
unconditionally outside of the loop since of_node_put handles a
NULL argument fine.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:39 -08:00
Andrea Arcangeli
58f18f91c6 thp: share get_huge_page_tail()
commit b35a35b556 upstream.

This avoids duplicating the function in every arch gup_fast.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:01 -08:00
Andrea Arcangeli
e852a85448 sparc: gup_pte_range() support THP based tail recounting
commit e0d85a366c upstream.

Up to this point the code assumed old refcounting for hugepages (pre-thp).
 This updates the code directly to the thp mapcount tail page refcounting.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:00 -08:00
Andrea Arcangeli
05ed1d8991 s390: gup_huge_pmd() return 0 if pte changes
commit 0693bc9ce2 upstream.

s390 didn't return 0 in that case, if it's rolling back the *nr pointer it
should also return zero to avoid adding pages to the array at the wrong
offset.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:00 -08:00
Andrea Arcangeli
9310644836 s390: gup_huge_pmd() support THP tail recounting
commit 220a2eb228 upstream.

Up to this point the code assumed old refcounting for hugepages (pre-thp).
This updates the code directly to the thp mapcount tail page refcounting.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:44:00 -08:00
Andrea Arcangeli
ddb693af6f powerpc: gup_huge_pmd() return 0 if pte changes
commit cf592bf768 upstream.

powerpc didn't return 0 in that case, if it's rolling back the *nr pointer
it should also return zero to avoid adding pages to the array at the wrong
offset.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:59 -08:00
Andrea Arcangeli
956b251c2f powerpc: gup_hugepte() support THP based tail recounting
commit 3526741f09 upstream.

Up to this point the code assumed old refcounting for hugepages (pre-thp).
This updates the code directly to the thp mapcount tail page refcounting.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:59 -08:00
Andrea Arcangeli
9aa7c2a5e4 powerpc: gup_hugepte() avoid freeing the head page too many times
commit 8596468487 upstream.

We only taken "refs" pins on the head page not "*nr" pins.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:59 -08:00
Andrea Arcangeli
80a76e1db5 powerpc: get_hugepte() don't put_page() the wrong page
commit 405e44f2e3 upstream.

"page" may have changed to point to the next hugepage after the loop
completed, The references have been taken on the head page, so the
put_page must happen there too.

This is a longstanding issue pre-thp inclusion.

It's totally unclear how these page_cache_add_speculative and
pte_val(pte) != pte_val(*ptep) checks are necessary across all the
powerpc gup_fast code, when x86 doesn't need any of that: there's no way
the page can be freed with irq disabled so we're guaranteed the
atomic_inc will happen on a page with page_count > 0 (so not needing the
speculative check).

The pte check is also meaningless on x86: no need to rollback on x86 if
the pte changed, because the pte can still change a CPU tick after the
check succeeded and it won't be rolled back in that case.  The important
thing is we got a reference on a valid page that was mapped there a CPU
tick ago.  So not knowing the soft tlb refill code of ppc64 in great
detail I'm not removing the "speculative" page_count increase and the
pte checks across all the code, but unless there's a strong reason for
it they should be later cleaned up too.

If a pte can change from huge to non-huge (like it could happen with
THP) passing a pte_t *ptep to gup_hugepte() would also require to repeat
the is_hugepd in gup_hugepte(), but that shouldn't happen with hugetlbfs
only so I'm not altering that.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:58 -08:00
Andrea Arcangeli
5db7016cf2 powerpc: remove superfluous PageTail checks on the pte gup_fast
commit 2839bdc1bf upstream.

This part of gup_fast doesn't seem capable of handling hugetlbfs ptes,
those should be handled by gup_hugepd only, so these checks are
superfluous.

Plus if this wasn't a noop, it would have oopsed because, the insistence
of using the speculative refcounting would trigger a VM_BUG_ON if a tail
page was encountered in the page_cache_get_speculative().

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:58 -08:00
Andrea Arcangeli
1e565a292a mm: thp: tail page refcounting fix
commit 70b50f94f1 upstream.

Michel while working on the working set estimation code, noticed that
calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
wasn't safe, if the pfn ended up being a tail page of a transparent
hugepage under splitting by __split_huge_page_refcount().

He then found the problem could also theoretically materialize with
page_cache_get_speculative() during the speculative radix tree lookups
that uses get_page_unless_zero() in SMP if the radix tree page is freed
and reallocated and get_user_pages is called on it before
page_cache_get_speculative has a chance to call get_page_unless_zero().

So the best way to fix the problem is to keep page_tail->_count zero at
all times.  This will guarantee that get_page_unless_zero() can never
succeed on any tail page.  page_tail->_mapcount is guaranteed zero and
is unused for all tail pages of a compound page, so we can simply
account the tail page references there and transfer them to
tail_page->_count in __split_huge_page_refcount() (in addition to the
head_page->_mapcount).

While debugging this s/_count/_mapcount/ change I also noticed get_page is
called by direct-io.c on pages returned by get_user_pages.  That wasn't
entirely safe because the two atomic_inc in get_page weren't atomic.  As
opposed to other get_user_page users like secondary-MMU page fault to
establish the shadow pagetables would never call any superflous get_page
after get_user_page returns.  It's safer to make get_page universally safe
for tail pages and to use get_page_foll() within follow_page (inside
get_user_pages()).  get_page_foll() is safe to do the refcounting for tail
pages without taking any locks because it is run within PT lock protected
critical sections (PT lock for pte and page_table_lock for
pmd_trans_huge).

The standard get_page() as invoked by direct-io instead will now take
the compound_lock but still only for tail pages.  The direct-io paths
are usually I/O bound and the compound_lock is per THP so very
finegrined, so there's no risk of scalability issues with it.  A simple
direct-io benchmarks with all lockdep prove locking and spinlock
debugging infrastructure enabled shows identical performance and no
overhead.  So it's worth it.  Ideally direct-io should stop calling
get_page() on pages returned by get_user_pages().  The spinlock in
get_page() is already optimized away for no-THP builds but doing
get_page() on tail pages returned by GUP is generally a rare operation
and usually only run in I/O paths.

This new refcounting on page_tail->_mapcount in addition to avoiding new
RCU critical sections will also allow the working set estimation code to
work without any further complexity associated to the tail page
refcounting with THP.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:41 -08:00
Dave Jones
76c125d265 um: Fix kmalloc argument order in um/vdso/vma.c
commit 0d65ede0a6 upstream.

kmalloc size is 1st arg, not second.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:36 -08:00
Richard Weinberger
02e66bee63 um: fix ubd cow size
commit 8535639810 upstream.

ubd_file_size() cannot use ubd_dev->cow.file because at this time
ubd_dev->cow.file is not initialized.
Therefore, ubd_file_size() will always report a wrong disk size when
COW files are used.
Reading from /dev/ubd* would crash the kernel.

We have to read the correct disk size from the COW file's backing
file.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:35 -08:00
Linus Walleij
f9f92901e2 ARM: mach-ux500: unlock I&D l2x0 caches before init
commit 1bf6d2c1bb upstream.

Apparently U8500 U-Boot versions may leave the l2x0 locked down
before executing the kernel. Make sure we unlock it before we
initialize the l2x0. This fixes a performance problem reported
by Jan Rinze.

The l2x0 core has been modified to unlock the l2x0 by default,
but it will not touch the locking registers if the l2x0 was
already enabled, as on the ux500, so we need this quirk to
make sure it is properly turned off.

Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Adrian Bunk <adrian.bunk@movial.com>
Reported-by: Jan Rinze <janrinze@gmail.com>
Tested-by: Robert Marklund <robert.marklund@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:31 -08:00
Axel Lin
859967a0ff ARM: pxa/cm-x300: properly set bt_reset pin
commit 1a64200ec5 upstream.

Fix below build warning and properly set bt_reset pin.

  CC      arch/arm/mach-pxa/cm-x300.o
arch/arm/mach-pxa/cm-x300.c: In function 'cm_x300_init_wi2wi':
arch/arm/mach-pxa/cm-x300.c:779: warning: unused variable 'wlan_en'
arch/arm/mach-pxa/cm-x300.c:795: warning: 'bt_reset' may be used uninitialized in this function

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:31 -08:00
Paul Fertser
a54059bb11 plat-mxc: iomux-v3.h: implicitly enable pull-up/down when that's desired
commit 6571534b60 upstream.

To configure pads during the initialisation a set of special constants
is used, e.g.
#define MX25_PAD_FEC_MDIO__FEC_MDIO IOMUX_PAD(0x3c4, 0x1cc, 0x10, 0, 0, PAD_CTL_HYS | PAD_CTL_PUS_22K_UP)

The problem is that no pull-up/down is getting activated unless both
PAD_CTL_PUE (pull-up enable) and PAD_CTL_PKE (pull/keeper module
enable) set. This is clearly stated in the i.MX25 datasheet and is
confirmed by the measurements on hardware. This leads to some rather
hard to understand bugs such as misdetecting an absent ethernet PHY (a
real bug i had), unstable data transfer etc. This might affect mx25,
mx35, mx50, mx51 and mx53 SoCs.

It's reasonable to expect that if the pullup value is specified, the
intention was to have it actually active, so we implicitly add the
needed bits.

Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:30 -08:00
Martin Schwidefsky
4dc17f0c42 memory leak with RCU_TABLE_FREE
commit e73b7fffe4 upstream.

The rcu page table free code uses a couple of bits in the page table
pointer passed to tlb_remove_table to discern the different page table
types. __tlb_remove_table extracts the type with an incorrect mask which
leads to memory leaks. The correct mask is ((FRAG_MASK << 4) | FRAG_MASK).

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:22 -08:00
Martin Schwidefsky
98ef836b07 user per registers vs. ptrace single stepping
commit a45aff5285 upstream.

git commit 5e9a2692 "[S390] ptrace cleanup" introduced a regression
for the case when both a user PER set (e.g. a storage alteration trace) and
PTRACE_SINGLESTEP are active. The new code will overrule the user PER set
with a instruction-fetch PER set over the whole address space for ptrace
single stepping. The inferior process will be stopped after each instruction
with an instruction fetch event. Any other events that may have occurred
concurrently are not reported (e.g. storage alteration event) because the
control bits for them are not set. The solution is to merge the PER control
bits of the user PER set with the PER_EVENT_IFETCH control bit for
PTRACE_SINGLESTEP.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:22 -08:00
Carsten Otte
715afafe87 KVM: s390: check cpu_id prior to using it
commit 4d47555a80 upstream.

We use the cpu id provided by userspace as array index here. Thus we
clearly need to check it first. Ooops.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:21 -08:00
Jan Beulich
d6a615f1c4 apic, i386/bigsmp: Fix false warnings regarding logical APIC ID mismatches
commit 838312be46 upstream.

These warnings (generally one per CPU) are a result of
initializing x86_cpu_to_logical_apicid while apic_default is
still in use, but the check in setup_local_APIC() being done
when apic_bigsmp was already used as an override in
default_setup_apic_routing():

 Overriding APIC driver with bigsmp
 Enabling APIC mode:  Physflat.  Using 5 I/O APICs
 ------------[ cut here ]------------
 WARNING: at .../arch/x86/kernel/apic/apic.c:1239
 ...
 CPU 1 irqstacks, hard=f1c9a000 soft=f1c9c000
 Booting Node   0, Processors  #1
 smpboot cpu 1: start_ip = 9e000
 Initializing CPU#1
 ------------[ cut here ]------------
 WARNING: at .../arch/x86/kernel/apic/apic.c:1239
 setup_local_APIC+0x137/0x46b() Hardware name: ...
 CPU1 logical APIC ID: 2 != 8
 ...

Fix this (for the time being, i.e. until
x86_32_early_logical_apicid() will get removed again, as Tejun
says ought to be possible) by overriding the previously stored
values at the point where the APIC driver gets overridden.

v2: Move this and the pre-existing override logic into
    arch/x86/kernel/apic/bigsmp_32.c.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/4E835D16020000780005844C@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:43:11 -08:00
Russell King
6ed5bebf24 ARM: smp: fix clipping of number of CPUs
commit a06f916b7a upstream.

Rather than clipping the number of CPUs using the compile-time NR_CPUS
constant, use the runtime nr_cpu_ids value instead.  This allows the
nr_cpus command line option to work as expected.

Reported-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:42:46 -08:00
Josh Stone
2b810d9963 x86: Fix compilation bug in kprobes' twobyte_is_boostable
commit 315eb8a2a1 upstream.

When compiling an i386_defconfig kernel with gcc-4.6.1-9.fc15.i686, I
noticed a warning about the asm operand for test_bit in kprobes'
can_boost.  I discovered that this caused only the first long of
twobyte_is_boostable[] to be output.

Jakub filed and fixed gcc PR50571 to correct the warning and this output
issue.  But to solve it for less current gcc, we can make kprobes'
twobyte_is_boostable[] non-const, and it won't be optimized out.

Before:

    CC      arch/x86/kernel/kprobes.o
  In file included from include/linux/bitops.h:22:0,
                   from include/linux/kernel.h:17,
                   from [...]/arch/x86/include/asm/percpu.h:44,
                   from [...]/arch/x86/include/asm/current.h:5,
                   from [...]/arch/x86/include/asm/processor.h:15,
                   from [...]/arch/x86/include/asm/atomic.h:6,
                   from include/linux/atomic.h:4,
                   from include/linux/mutex.h:18,
                   from include/linux/notifier.h:13,
                   from include/linux/kprobes.h:34,
                   from arch/x86/kernel/kprobes.c:43:
  [...]/arch/x86/include/asm/bitops.h: In function ‘can_boost.part.1’:
  [...]/arch/x86/include/asm/bitops.h:319:2: warning: use of memory input
        without lvalue in asm operand 1 is deprecated [enabled by default]

  $ objdump -rd arch/x86/kernel/kprobes.o | grep -A1 -w bt
       551:	0f a3 05 00 00 00 00 	bt     %eax,0x0
                          554: R_386_32	.rodata.cst4

  $ objdump -s -j .rodata.cst4 -j .data arch/x86/kernel/kprobes.o

  arch/x86/kernel/kprobes.o:     file format elf32-i386

  Contents of section .data:
   0000 48000000 00000000 00000000 00000000  H...............
  Contents of section .rodata.cst4:
   0000 4c030000                             L...

Only a single long of twobyte_is_boostable[] is in the object file.

After, without the const on twobyte_is_boostable:

  $ objdump -rd arch/x86/kernel/kprobes.o | grep -A1 -w bt
       551:	0f a3 05 20 00 00 00 	bt     %eax,0x20
                          554: R_386_32	.data

  $ objdump -s -j .rodata.cst4 -j .data arch/x86/kernel/kprobes.o

  arch/x86/kernel/kprobes.o:     file format elf32-i386

  Contents of section .data:
   0000 48000000 00000000 00000000 00000000  H...............
   0010 00000000 00000000 00000000 00000000  ................
   0020 4c030000 0f000200 ffff0000 ffcff0c0  L...............
   0030 0000ffff 3bbbfff8 03ff2ebb 26bb2e77  ....;.......&..w

Now all 32 bytes are output into .data instead.

Signed-off-by: Josh Stone <jistone@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:42:33 -08:00
Jack Steiner
f584ab5b90 x86: uv2: Workaround for UV2 Hub bug (system global address format)
commit 6a469e4665 upstream.

This is a workaround for a UV2 hub bug that affects the format of system
global addresses.

The GRU API for UV2 was inadvertently broken by a hardware change.  The
format of the physical address used for TLB dropins and for addresses used
with instructions running in unmapped mode has changed.  This change was
not documented and became apparent only when diags failed running on
system simulators.

For UV1, TLB and GRU instruction physical addresses are identical to
socket physical addresses (although high NASID bits must be OR'ed into the
address).

For UV2, socket physical addresses need to be converted.  The NODE portion
of the physical address needs to be shifted so that the low bit is in bit
39 or bit 40, depending on an MMR value.

It is not yet clear if this bug will be fixed in a silicon respin.  If it
is fixed, the hub revision will be incremented & the workaround disabled.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11 09:42:33 -08:00
Takashi Iwai
8548c84da2 x86: Fix S4 regression
Commit 4b239f458 ("x86-64, mm: Put early page table high") causes a S4
regression since 2.6.39, namely the machine reboots occasionally at S4
resume.  It doesn't happen always, overall rate is about 1/20.  But,
like other bugs, once when this happens, it continues to happen.

This patch fixes the problem by essentially reverting the memory
assignment in the older way.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yinghai.lu@oracle.com>
[ We'll hopefully find the real fix, but that's too late for 3.1 now ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-24 06:55:20 +02:00
Linus Torvalds
1bf1aacedc Merge branch 'samsung-fixes-4' of git://github.com/kgene/linux-samsung
* 'samsung-fixes-4' of git://github.com/kgene/linux-samsung:
  ARM: S3C24XX: Fix s3c24xx build errors if !CONFIG_PM
  ARM: S5P: fix offset calculation on gpio-interrupt
2011-10-23 10:44:40 +03:00
Domenico Andreoli
fb630b9fc9 ARM: S3C24XX: Fix s3c24xx build errors if !CONFIG_PM
v2:
- register_syscore_ops(&s3c24xx_irq_syscore_ops) does not need to be
  conditionally compiled out, it is already optimized out on !CONFIG_PM
- fix also s3c2412 and s3c2416 affected by the same build issue

v1:
s3c2440.c fails to build if !CONFIG_PM because in such case
s3c2410_pm_syscore_ops is not defined. Same error should happen also
in s3c2410.c and s3c2442.c

Signed-off-by: Domenico Andreoli <cavokz@gmail.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2011-10-22 04:00:53 +09:00
Marek Szyprowski
1052cff317 ARM: S5P: fix offset calculation on gpio-interrupt
Offsets of the irq controller registers were calculated
correctly only for first GPIO bank. This patch fixes
calculation of the register offsets for all GPIO banks.

Reported-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2011-10-21 18:05:02 +09:00
Linus Torvalds
fd11e153b8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Add alignment flag to PCI expansion resources
  sparc: Avoid calling sigprocmask()
  sparc: Use set_current_blocked()
  sparc32,leon: SRMMU MMU Table probe fix
2011-10-20 22:16:28 +03:00
Kjetil Oftedal
aad4564498 sparc: Add alignment flag to PCI expansion resources
Currently no type of alignment is specified for PCI expansion roms while 
parsing the openfirmware tree. This causes calls to pci_map_rom() to fail.
IORESOURCE_SIZEALIGN is the default alignment used for rom resouces in 
pci/probe.c, and has been verified to work with various cards on a ultra 10.

Signed-off-By: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 16:20:50 -07:00
Linus Torvalds
8bc03e8f3a Merge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm
* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
  ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS
  ARM: 7122/1: localtimer: add header linux/errno.h explicitly
  ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
  ARM: 7113/1: mm: Align bank start to MAX_ORDER_NR_PAGES
2011-10-16 13:08:27 -07:00
Zoltan Devai
f8be12d153 ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS
This is unneeded and causes an abort on the SPMP8000 platform.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Zoltan Devai <zoss@devai.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-15 11:04:22 +01:00
Shawn Guo
bb1ac3ec95 ARM: 7122/1: localtimer: add header linux/errno.h explicitly
Per the text in  Documentation/SubmitChecklist as below, we should
explicitly have header linux/errno.h in localtimer.h for ENXIO
reference.

1: If you use a facility then #include the file that defines/declares
   that facility.  Don't depend on other header files pulling in ones
   that you use.

Otherwise, we may run into some compiling error like the following one,
if any file includes localtimer.h without CONFIG_LOCAL_TIMERS defined.

  arch/arm/include/asm/localtimer.h: In function ‘local_timer_setup’:
  arch/arm/include/asm/localtimer.h:53:10: error: ‘ENXIO’ undeclared (first use in this function)

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-15 11:04:22 +01:00
Will Deacon
29a541f6c1 ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
Using COHERENT_LINE_{MISS,HIT} for cache misses and references
respectively is completely wrong. Instead, use the L1D events which
are a better and more useful approximation despite ignoring instruction
traffic.

Reported-by: Alasdair Grant <alasdair.grant@arm.com>
Reported-by: Matt Horsnell <matt.horsnell@arm.com>
Reported-by: Michael Williams <michael.williams@arm.com>
Cc: stable@kernel.org
Cc: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-15 11:04:22 +01:00
Linus Torvalds
95bc156c62 Merge branch 'stable' of git://github.com/cmetcalf-tilera/linux-tile
* 'stable' of git://github.com/cmetcalf-tilera/linux-tile:
  tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files
2011-10-14 16:59:11 +12:00
Linus Torvalds
2ad53110d6 Merge branch 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip
* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86: Default to vsyscall=native for now
2011-10-14 16:54:56 +12:00
Mika Westerberg
153b19a3b9 x86, mrst: use a temporary variable for SFI irq
SFI tables reside in RAM and should not be modified once they are
written.  Current code went to set pentry->irq to zero which causes
subsequent reads to fail with invalid SFI table checksum.  This will
break kexec as the second kernel fails to validate SFI tables.

To fix this we use temporary variable for irq number.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-14 16:53:27 +12:00
Chris Metcalf
d52104b29a tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files
The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h>
for atomic support routines in assembly.  To make this more explicit,
I've turned those includes into includes of <asm/atomic_32.h>, which
should hopefully make it clear that they shouldn't be bombed into
<linux/atomic.h> in any cleanups.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-10-13 08:25:01 -04:00
David S. Miller
27f20dca01 sparc: Avoid calling sigprocmask()
Use set_current_blocked() instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-12 12:27:35 -07:00