Commit Graph

242369 Commits

Author SHA1 Message Date
Tarun Kanti DebBarma
8985b63d07 OMAP2+: hwmod: fix incorrect computation of autoidle_mask
Autoidle is a single bit, TIOCP_CFG[0], setting on OMAP1/2/3/4 platforms.
In _set_module_autoidle() I am seeing 0x3 value where the mask is computed.
This should be 0x1.

v2:
(1) Modified the subject.
(2) Modified the description with further specific information.

Baseline:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

Tested Info:
Boot tested on OMAP 1/2/3/4.

Signed-off-by: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Acked-by: Rajendra Nayak <rnayak@ti.com>
Acked-by: Benoit Cousson <b-cousson@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
2011-03-10 03:23:55 -07:00
J. Bruce Fields
d891eedbc3 fs/dcache: allow d_obtain_alias() to return unhashed dentries
Without this patch, inodes are not promptly freed on last close of an
unlinked file by an nfs client:

	client$ mount -tnfs4 server:/export/ /mnt/
	client$ tail -f /mnt/FOO
	...
	server$ df -i /export
	server$ rm /export/FOO
	(^C the tail -f)
	server$ df -i /export
	server$ echo 2 >/proc/sys/vm/drop_caches
	server$ df -i /export

the df's will show that the inode is not freed on the filesystem until
the last step, when it could have been freed after killing the client's
tail -f. On-disk data won't be deallocated either, leading to possible
spurious ENOSPC.

This occurs because when the client does the close, it arrives in a
compound with a putfh and a close, processed like:

	- putfh: look up the filehandle.  The only alias found for the
	  inode will be DCACHE_UNHASHED alias referenced by the filp
	  this, so it creates a new DCACHE_DISCONECTED dentry and
	  returns that instead.
	- close: closes the existing filp, which is destroyed
	  immediately by dput() since it's DCACHE_UNHASHED.
	- end of the compound: release the reference
	  to the current filehandle, and dput() the new
	  DCACHE_DISCONECTED dentry, which gets put on the
	  unused list instead of being destroyed immediately.

Nick Piggin suggested fixing this by allowing d_obtain_alias to return
the unhashed dentry that is referenced by the filp, instead of making it
create a new dentry.

Leave __d_find_alias() alone to avoid changing behavior of other
callers.

Also nfsd doesn't need all the checks of __d_find_alias(); any dentry,
hashed or unhashed, disconnected or not, should work.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 05:18:54 -05:00
Benoit Cousson
478f478bc1 OMAP3: hwmod data: Remove masters port links for interconnects.
Master ports from interconnect are generating some annoying circular
references that become tricky to handle if we have to dynamically
remove some IP on some variant platforms.
Since they are not used for the moment, and since we can still build
that relation using the reverse relation (slave port from the IP
toward master port of the interconnect), let remove them for the
moment like it is done on OMAP4.

Signed-off-by: Benoit Cousson <b-cousson@ti.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Sanjeev Premi <premi@ti.com>
2011-03-10 11:18:50 +01:00
Shawn Guo
0590a79031 ARM: mxs/mx28evk: add framebuffer device
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2011-03-10 11:18:35 +01:00
Shawn Guo
12b90f8a2c ARM: mx28: set proper parent for lcdif clock
Most likely, the LCD panel on mx28 platform will require a pixel
clock higher than ref_xtal_clk (24 MHz), so the patch initializes
the parent of lcdif clock as ref_pix_clk.

Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2011-03-10 11:17:29 +01:00
Linus Walleij
29772c4e28 ARM: 6764/1: pl011: factor out FIFO to TTY code
This piece of code was just slightly different between the DMA
and IRQ paths, in DMA mode we surely shouldn't read more than
256 character either, so factor this out in its own function and
use for both DMA and PIO mode.

Tested on Ux500 and U300.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-03-10 10:07:24 +00:00
Linus Walleij
ead76f329f ARM: 6763/1: pl011: add optional RX DMA to PL011 v2
This adds an optional RX DMA codepath for the devices that
support this by using the apropriate burst sizes instead of
pulling single bytes.

Includes portions of code written by Russell King during
a PL08x hacking session.

This has been tested on U300 and Ux500.

Tested-by: Jerzy Kasenberg <jerzy.kasenberg@tieto.com>
Tested-by: Grzegorz Sygieda <grzegorz.sygieda@tieto.com>
Tested-by: Marcin Mielczarczyk <marcin.mielczarczyk@tieto.com>
Signed-off-by: Per Forlin <per.friden@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-03-10 10:07:22 +00:00
Stepan Moskovchenko
6fa85e5ce3 ARM: 6796/1: Footbridge: Fix I/O mappings for NOMMU mode
Use the correct I/O address definitions for Footbridge
peripherals when the kernel is compiled without MMU
support.

Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-03-10 10:04:30 +00:00
Benoit Cousson
b9ccf8afe2 OMAP3: hwmod data: Fix incorrect SmartReflex -> L4 CORE interconnect links
Commit d344272671 ("OMAP3: PM: Adding
smartreflex hwmod data") added data that claims that the L4 CORE has
two slave interfaces that originate from the SmartReflex modules,
omap3_l4_core__sr1 and omap3_l4_core__sr2.  But as those two data
structure records show, it's L4 CORE that has a master port towards
SR1 and SR2.
Move the incorrect data from slaves list to master list.

Based on a path by Paul Walmsley <paul@pwsan.com>

    https://patchwork.kernel.org/patch/623171/

That is based on a patch by Benoît Cousson <b-cousson@ti.com>:

    https://patchwork.kernel.org/patch/590561/

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Benoît Cousson <b-cousson@ti.com>
Cc: Sanjeev Premi <premi@ti.com>
Cc: Thara Gopinath <thara@ti.com>
2011-03-10 11:04:00 +01:00
Stephen Boyd
7d85d61f6a ARM: 6797/1: hw_breakpoint: Fix newlines in WARNings
These warnings are missing newlines and spaces causing confusing
looking output when they trigger.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-03-10 10:03:45 +00:00
Matt Carlson
c5908939b2 tg3: Remove 5750 PCI code
The 5750 ASIC rev was never released as a PCI device.  It only exists as
a PCIe device.  This patch removes the code that supports the former
configuration.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 01:56:14 -08:00
Matt Carlson
e256f8a351 tg3: Move tg3_init_link_config to tg3_phy_probe
This patch moves the function that initializes the link configuration
closer to the place where the rest of the phy code is initialized.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 01:56:13 -08:00
Matt Carlson
683644b747 tg3: Refine VAux decision process
In the near future, the VAux switching decision process is going to get
more complicated.  This patch refines and consolidates the existing
algorithm in anticipation of the new scheme.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 01:56:13 -08:00
Matt Carlson
4143470c10 tg3: cleanup pci device table vars
Commit 895950c2a6, entitled
"tg3: Use DEFINE_PCI_DEVICE_TABLE" moved two pci device tables into the
global address space, but didn't declare them static and didn't prefix
them with "tg3_".  This patch fixes those problems.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 01:56:12 -08:00
Matt Carlson
d4894f3ea7 tg3: Add code to verify RODATA checksum of VPD
This patch adds code to verify the checksum stored in the "RV" info
keyword of the RODATA VPD section.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 01:56:12 -08:00
Matt Carlson
01c3a3920f tg3: Fix NVRAM selftest
The tg3 NVRAM selftest actually fails when validating the checksum of
the legacy NVRAM format.  However, the test still reported success
because the last update of the return code was a success from the NVRAM
reads.  This patch fixes the code so that the error return code defaults
to a failure status.  Then the patch fixes the reason why the checsum
validation failed.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 01:56:11 -08:00
Matt Carlson
bb18bb942a tg3: Add missed 5719 workaround change
Commit 2866d956fe, entitled
"tg3: Expand 5719 workaround" extended a 5719 A0 workaround to all
revisions of the chip.  There was a change that should have been a
part of that patch that was missed.  This patch adds the missing
piece.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 01:56:11 -08:00
Michal Simek
caa66ce905 microblaze: Fix circular headers dependency when ftrace is enabled.
Remove compilation failure when ftrace in enabled.

Error log:
  CC      kernel/trace/power-traces.o
In file included from arch/microblaze/include/asm/irq.h:15,
                 from include/linux/irq.h:27,
                 from include/asm-generic/hardirq.h:12,
                 from arch/microblaze/include/asm/hardirq.h:15,
                 from include/linux/hardirq.h:7,
                 from include/linux/ftrace_event.h:7,
                 from include/trace/ftrace.h:19,
                 from include/trace/define_trace.h:96,
                 from include/trace/events/power.h:240,
                 from kernel/trace/power-traces.c:14:
include/linux/interrupt.h: In function '__raise_softirq_irqoff':
include/linux/interrupt.h:413: error: implicit declaration of function 'trace_softirq_raise'
In file included from include/trace/ftrace.h:554,
                 from include/trace/define_trace.h:96,
                 from include/trace/events/power.h:240,
                 from kernel/trace/power-traces.c:14:
include/trace/events/irq.h: In function 'ftrace_test_probe_irq_handler_entry':
include/trace/events/irq.h:37: error: implicit declaration of function 'check_trace_callback_type_irq_handler_entry'
include/trace/events/irq.h: In function 'ftrace_test_probe_irq_handler_exit':
include/trace/events/irq.h:67: error: implicit declaration of function 'check_trace_callback_type_irq_handler_exit'
include/trace/events/irq.h: In function 'ftrace_test_probe_softirq_entry':
include/trace/events/irq.h:112: error: implicit declaration of function 'check_trace_callback_type_softirq_entry'
include/trace/events/irq.h: In function 'ftrace_test_probe_softirq_exit':
include/trace/events/irq.h:126: error: implicit declaration of function 'check_trace_callback_type_softirq_exit'
include/trace/events/irq.h: In function 'ftrace_test_probe_softirq_raise':
include/trace/events/irq.h:140: error: implicit declaration of function 'check_trace_callback_type_softirq_raise'
make[5]: *** [kernel/trace/power-traces.o] Error 1
make[4]: *** [kernel/trace] Error 2
make[3]: *** [kernel] Error 2

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Michal Simek <monstr@monstr.eu>
2011-03-10 10:39:51 +01:00
Marco Stornelli
1ca551c6ca Check for immutable/append flag in fallocate path
In the fallocate path the kernel doesn't check for the immutable/append
flag. It's possible to have a race condition in this scenario: an
application open a file in read/write and it does something, meanwhile
root set the immutable flag on the file, the application at that point
can call fallocate with success. In addition, we don't allow to do any
unreserve operation on an append only file but only the reserve one.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 04:22:15 -05:00
Stephen Rothwell
991ac30d8b sysctl: the include of rcupdate.h is only needed in the kernel
Fixes this built error:

include/linux/sysctl.h:28: included file 'linux/rcupdate.h' is not exported

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 04:19:56 -05:00
Al Viro
9177ada99d fat: fix d_revalidate oopsen on NFS exports
can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:45:49 -05:00
Al Viro
8ce84eeb5b jfs: fix d_revalidate oopsen on NFS exports
can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:45:28 -05:00
Al Viro
4714e63731 ocfs2: fix d_revalidate oopsen on NFS exports
can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:45:07 -05:00
Al Viro
53fe924161 gfs2: fix d_revalidate oopsen on NFS exports
can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:44:48 -05:00
Al Viro
529c5f958f fuse: fix d_revalidate oopsen on NFS exports
can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:44:31 -05:00
Al Viro
0eb980e317 ceph: fix d_revalidate oopsen on NFS exports
can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:44:05 -05:00
Al Viro
c78f4cc5e7 reiserfs xattr ->d_revalidate() shouldn't care about RCU
... it returns an error unconditionally

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:42:01 -05:00
Andrea Arcangeli
a79e53d856 x86/mm: Fix pgd_lock deadlock
It's forbidden to take the page_table_lock with the irq disabled
or if there's contention the IPIs (for tlb flushes) sent with
the page_table_lock held will never run leading to a deadlock.

Nobody takes the pgd_lock from irq context so the _irqsave can be
removed.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <201102162345.p1GNjMjm021738@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-10 09:41:57 +01:00
Al Viro
ae50adcb0a /proc/self is never going to be invalidated...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-10 03:41:53 -05:00
Andrey Vagin
f86268549f x86/mm: Handle mm_fault_error() in kernel space
mm_fault_error() should not execute oom-killer, if page fault
occurs in kernel space.  E.g. in copy_from_user()/copy_to_user().

This would happen if we find ourselves in OOM on a
copy_to_user(), or a copy_from_user() which faults.

Without this patch, the kernels hangs up in copy_from_user(),
because OOM killer sends SIG_KILL to current process, but it
can't handle a signal while in syscall, then the kernel returns
to copy_from_user(), reexcute current command and provokes
page_fault again.

With this patch the kernel return -EFAULT from copy_from_user().

The code, which checks that page fault occurred in kernel space,
has been copied from do_sigbus().

This situation is handled by the same way on powerpc, xtensa,
tile, ...

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <201103092322.p29NMNPH001682@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-10 09:41:40 +01:00
Stephen Hemminger
a252bebe22 tcp: mark tcp_congestion_ops read_mostly
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 00:40:17 -08:00
Jiro SEKIBA
4d3cf1bc55 nilfs2: move NILFS_SUPER_MAGIC to linux/magic.h
move NILFS_SUPER_MAGIC macro to linux/magic.h from linux/nilfs2_fs.h
in the same manner as other filesystem magic number defined in the file.

Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-10 17:29:40 +09:00
Jens Axboe
4c63f5646e Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/core
Conflicts:
	block/blk-core.c
	block/blk-flush.c
	drivers/md/raid1.c
	drivers/md/raid10.c
	drivers/md/raid5.c
	fs/nilfs2/btnode.c
	fs/nilfs2/mdt.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:58:35 +01:00
Vivek Goyal
69d60eb96a blk-throttle: Use blk_plug in throttle dispatch
Use plug in throttle dispatch also as we are dispatching a bunch of
bios in throttle context and some of them might merge.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe
721a9602e6 block: kill off REQ_UNPLUG
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe
cf15900e12 aio: remove request submission batching
This should be useless now that we have on-stack plugging. So lets just
kill it.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Shaohua Li
9f5b942546 fs: make aio plug
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe
2ed1a6bcf9 fs: make mpage read/write_pages() plug
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:26 +01:00
Jens Axboe
5b417b1873 read-ahead: use plugging
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:26 +01:00
Jens Axboe
55602dd66f fs: make generic file read/write functions plug
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:26 +01:00
Jens Axboe
7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
Jens Axboe
73c1010119 block: initial patch for on-stack per-task plugging
This patch adds support for creating a queuing context outside
of the queue itself. This enables us to batch up pieces of IO
before grabbing the block device queue lock and submitting them to
the IO scheduler.

The context is created on the stack of the process and assigned in
the task structure, so that we can auto-unplug it if we hit a schedule
event.

The current queue plugging happens implicitly if IO is submitted to
an empty device, yet callers have to remember to unplug that IO when
they are going to wait for it. This is an ugly API and has caused bugs
in the past. Additionally, it requires hacks in the vm (->sync_page()
callback) to handle that logic. By switching to an explicit plugging
scheme we make the API a lot nicer and can get rid of the ->sync_page()
hack in the vm.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:45:54 +01:00
Jens Axboe
a488e74976 scsi: convert to blk_delay_queue()
It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit. A
default delay of 3 msec is defined, to match the previous
behaviour.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:45:54 +01:00
Jens Axboe
0a41e90bb7 ide-cd: convert to blk_delay_queue() for a short pause
It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: David S. Miller <davem@davemloft.net>
2011-03-10 08:45:54 +01:00
Jens Axboe
3cca6dc1c8 block: add API for delaying work/request_fn a little bit
Currently we use plugging for that, but as plugging is going away,
we need an alternative mechanism.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:45:54 +01:00
Konrad Rzeszutek Wilk
bbd5a762b4 xen/hvc: Disable probe_irq_on/off from poking the hvc-console IRQ line.
This fixes a particular nasty racing problem found when using
Xen hypervisor with the console (hvc) output being routed to the
serial port and the serial port receiving data when
probe_irq_off(probe_irq_on) is running.

Specifically the bug manifests itself with:

[    4.470693] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[    4.470693] IP: [<ffffffff810a8c65>] handle_IRQ_event+0xe/0xc9
..snip..
[    4.470693] Call Trace:
[    4.470693]  <IRQ>
[    4.470693]  [<ffffffff810aa645>] handle_percpu_irq+0x3c/0x69
[    4.470693]  [<ffffffff8123cda7>] __xen_evtchn_do_upcall+0xfd/0x195
[    4.470693]  [<ffffffff810308cf>] ? xen_restore_fl_direct_end+0x0/0x1
[    4.470693]  [<ffffffff8123d873>] xen_evtchn_do_upcall+0x32/0x47
[    4.470693]  [<ffffffff81034dfe>] xen_do_hypervisor_callback+0x1e/0x30
[    4.470693]  <EOI>
[    4.470693]  [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1000
[    4.470693]  [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1000
[    4.470693]  [<ffffffff810301c5>] ? xen_force_evtchn_callback+0xd/0xf
[    4.470693]  [<ffffffff810308e2>] ? check_events+0x12/0x20
[    4.470693]  [<ffffffff81030889>] ? xen_irq_enable_direct_end+0x0/0x7
[    4.470693]  [<ffffffff810ab0a0>] ? probe_irq_on+0x8f/0x1d7
[    4.470693]  [<ffffffff812b105e>] ? serial8250_config_port+0x7b7/0x9e6
[    4.470693]  [<ffffffff812ad66c>] ? uart_add_one_port+0x11b/0x305

The bug is trigged by three actors working together:
 A). serial_8250_config_port calling
	probe_irq_off(probe_irq_on())
     wherein all of the IRQ handlers are being started and shut off.
     The functions utilize the sleep functions so the minimum time
     they are run is 120 msec.
 B). Xen hypervisor receiving on the serial line any character and
     setting the bits in the event channel - during this 120 msec timeframe.
 C). The hvc API makes a call to 'request_irq' (and hence setting desc->action
     to a valid value), much much later - when user space opens
     /dev/console (hvc_open). To make the console usable during bootup,
     the Xen HVC implementation sets the IRQ chip (and correspondingly
     the event channel) much earlier. The IRQ chip handler that is used
     is the handle_percpu_irq (aaca49642b)

Back to the issue. When A) is being called it ends up calling the
xen_percpu_chip's chip->startup twice and chip->shutdown once. Those
are set to the default_startup and mask_irq (events.c) respectivly.
If (and this seems to depend on what serial concentrator you use), B)
gets data from the serial port it sets in the event channel a pending bit.
When A) calls chip->startup(), the masking of the pending bit, and
unmasking of the event channel mask, and also setting of the upcall_pending
flag is done (since there is data present on the event channel).
If before the 120 msec has elapsed, any IRQ handler (Xen IRQ has one
IRQ handler, which checks the event channels bitmap to figure which one
to call) is called we end up calling the handle_percpu_irq. The
handle_percpu_irq calls desc->action (which is NULL) and we blow up.

Caveats: I could only reproduce this on 2.6.32 pvops. I am not sure
why this is not showing up on 2.6.38 kernel.

The probe_irq_on/off has code to disable poking specific IRQ lines. This is
done by using the set_irq_noprobe() and then we do not have to
worry about the handle_percpu_irq being called before the IRQ action
handler has been installed.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-10 00:57:59 -05:00
David S. Miller
cc7e17ea04 ipv4: Optimize flow initialization in fib_validate_source().
Like in commit 44713b67db
("ipv4: Optimize flow initialization in output route lookup."
we can optimize the on-stack flow setup to only initialize
the members which are actually used.

Otherwise we bzero the entire structure, then initialize
explicitly the first half of it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 20:57:50 -08:00
David S. Miller
67e28ffd86 ipv4: Optimize flow initialization in input route lookup.
Like in commit 44713b67db
("ipv4: Optimize flow initialization in output route lookup."
we can optimize the on-stack flow setup to only initialize
the members which are actually used.

Otherwise we bzero the entire structure, then initialize
explicitly the first half of it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 20:42:07 -08:00
David S. Miller
7343ff31eb ipv6: Don't create clones of host routes.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=29252
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=30462

In commit d80bc0fd26 ("ipv6: Always
clone offlink routes.") we forced the kernel to always clone offlink
routes.

The reason we do that is to make sure we never bind an inetpeer to a
prefixed route.

The logic turned on here has existed in the tree for many years,
but was always off due to a protecting CPP define.  So perhaps
it's no surprise that there is a logic bug here.

The problem is that we canot clone a route that is already a
host route (ie. has DST_HOST set).  Because if we do, an identical
entry already exists in the routing tree and therefore the
ip6_rt_ins() call is going to fail.

This sets off a series of failures and high cpu usage, because when
ip6_rt_ins() fails we loop retrying this operation a few times in
order to handle a race between two threads trying to clone and insert
the same host route at the same time.

Fix this by simply using the route as-is when DST_HOST is set.

Reported-by: slash@ac.auone-net.jp
Reported-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 19:55:25 -08:00
J. Bruce Fields
352b5d13c0 svcrpc: fix bad argument in unix_domain_find
"After merging the nfsd tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

net/sunrpc/svcauth_unix.c: In function 'unix_domain_find':
net/sunrpc/svcauth_unix.c:58: warning: passing argument 1 of
+'svcauth_unix_domain_release' from incompatible pointer type
net/sunrpc/svcauth_unix.c:41: note: expected 'struct auth_domain *' but
argument
+is of type 'struct unix_domain *'

Introduced by commit 8b3e07ac90 ("svcrpc: fix rare race on unix_domain
creation")."

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-09 22:40:30 -05:00