Commit Graph

180921 Commits

Author SHA1 Message Date
Neil Horman
6afa5c708e coredump: suppress uid comparison test if core output files are pipes
commit 76595f79d7 upstream.

Modify uid check in do_coredump so as to not apply it in the case of
pipes.

This just got noticed in testing.  The end of do_coredump validates the
uid of the inode for the created file against the uid of the crashing
process to ensure that no one can pre-create a core file with different
ownership and grab the information contained in the core when they
shouldn' tbe able to.  This causes failures when using pipes for a core
dumps if the crashing process is not root, which is the uid of the pipe
when it is created.

The fix is simple.  Since the check for matching uid's isn't relevant for
pipes (a process can't create a pipe that the uermodehelper code will open
anyway), we can just just skip it in the event ispipe is non-zero

Reverts a pipe-affecting change which was accidentally made in

: commit c46f739dd3
: Author:     Ingo Molnar <mingo@elte.hu>
: AuthorDate: Wed Nov 28 13:59:18 2007 +0100
: Commit:     Linus Torvalds <torvalds@woody.linux-foundation.org>
: CommitDate: Wed Nov 28 10:58:01 2007 -0800
:
:     vfs: coredumping fix

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: maximilian attems <max@stro.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:18 -07:00
Steven Rostedt
a94b326792 tracing: Do not record user stack trace from NMI context
commit b6345879cc upstream.

A bug was found with Li Zefan's ftrace_stress_test that caused applications
to segfault during the test.

Placing a tracing_off() in the segfault code, and examining several
traces, I found that the following was always the case. The lock tracer
was enabled (lockdep being required) and userstack was enabled. Testing
this out, I just enabled the two, but that was not good enough. I needed
to run something else that could trigger it. Running a load like hackbench
did not work, but executing a new program would. The following would
trigger the segfault within seconds:

  # echo 1 > /debug/tracing/options/userstacktrace
  # echo 1 > /debug/tracing/events/lock/enable
  # while :; do ls > /dev/null ; done

Enabling the function graph tracer and looking at what was happening
I finally noticed that all cashes happened just after an NMI.

 1)               |    copy_user_handle_tail() {
 1)               |      bad_area_nosemaphore() {
 1)               |        __bad_area_nosemaphore() {
 1)               |          no_context() {
 1)               |            fixup_exception() {
 1)   0.319 us    |              search_exception_tables();
 1)   0.873 us    |            }
[...]
 1)   0.314 us    |  __rcu_read_unlock();
 1)   0.325 us    |    native_apic_mem_write();
 1)   0.943 us    |  }
 1)   0.304 us    |  rcu_nmi_exit();
[...]
 1)   0.479 us    |  find_vma();
 1)               |  bad_area() {
 1)               |    __bad_area() {

After capturing several traces of failures, all of them happened
after an NMI. Curious about this, I added a trace_printk() to the NMI
handler to read the regs->ip to see where the NMI happened. In which I
found out it was here:

ffffffff8135b660 <page_fault>:
ffffffff8135b660:       48 83 ec 78             sub    $0x78,%rsp
ffffffff8135b664:       e8 97 01 00 00          callq  ffffffff8135b800 <error_entry>

What was happening is that the NMI would happen at the place that a page
fault occurred. It would call rcu_read_lock() which was traced by
the lock events, and the user_stack_trace would run. This would trigger
a page fault inside the NMI. I do not see where the CR2 register is
saved or restored in NMI handling. This means that it would corrupt
the page fault handling that the NMI interrupted.

The reason the while loop of ls helped trigger the bug, was that
each execution of ls would cause lots of pages to be faulted in, and
increase the chances of the race happening.

The simple solution is to not allow user stack traces in NMI context.
After this patch, I ran the above "ls" test for a couple of hours
without any issues. Without this patch, the bug would trigger in less
than a minute.

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:17 -07:00
Steven Rostedt
8c3492dc6b tracing: Disable buffer switching when starting or stopping trace
commit a2f8071428 upstream.

When the trace iterator is read, tracing_start() and tracing_stop()
is called to stop tracing while the iterator is processing the trace
output.

These functions disable both the standard buffer and the max latency
buffer. But if the wakeup tracer is running, it can switch these
buffers between the two disables:

  buffer = global_trace.buffer;
  if (buffer)
      ring_buffer_record_disable(buffer);

      <<<--------- swap happens here

  buffer = max_tr.buffer;
  if (buffer)
      ring_buffer_record_disable(buffer);

What happens is that we disabled the same buffer twice. On tracing_start()
we can enable the same buffer twice. All ring_buffer_record_disable()
must be matched with a ring_buffer_record_enable() or the buffer
can be disable permanently, or enable prematurely, and cause a bug
where a reset happens while a trace is commiting.

This patch protects these two by taking the ftrace_max_lock to prevent
a switch from occurring.

Found with Li Zefan's ftrace_stress_test.

Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:17 -07:00
Steven Rostedt
70e27dedd1 tracing: Use same local variable when resetting the ring buffer
commit 283740c619 upstream.

In the ftrace code that resets the ring buffer it references the
buffer with a local variable, but then uses the tr->buffer as the
parameter to reset. If the wakeup tracer is running, which can
switch the tr->buffer with the max saved buffer, this can break
the requirement of disabling the buffer before the reset.

   buffer = tr->buffer;
   ring_buffer_record_disable(buffer);
   synchronize_sched();
   __tracing_reset(tr->buffer, cpu);

If the tr->buffer is swapped, then the reset is not happening to the
buffer that was disabled. This will cause the ring buffer to fail.

Found with Li Zefan's ftrace_stress_test.

Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:17 -07:00
Lai Jiangshan
fe030f50c5 tracing: Fix warning in s_next of trace file ops
commit ac91d85456 upstream.

This warning in s_next() can be triggered by lseek():
 [<c018b3f7>] ? s_next+0x77/0x80
 [<c013e3c1>] warn_slowpath_common+0x81/0xa0
 [<c018b3f7>] ? s_next+0x77/0x80
 [<c013e3fa>] warn_slowpath_null+0x1a/0x20
 [<c018b3f7>] s_next+0x77/0x80
 [<c01efa77>] traverse+0x117/0x200
 [<c01eff13>] seq_lseek+0xa3/0x120
 [<c01efe70>] ? seq_lseek+0x0/0x120
 [<c01d7081>] vfs_llseek+0x41/0x50
 [<c01d8116>] sys_llseek+0x66/0xa0
 [<c0102bd0>] sysenter_do_call+0x12/0x26

The iterator "leftover" variable is zeroed in the opening of the trace
file. But lseek can call s_start() which will call s_next() without
reseting the "leftover" variable back to zero, which might trigger
the WARN_ON_ONCE(iter->leftover) that is in s_next().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B8CE06A.9090207@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:16 -07:00
Steven Rostedt
a86a83770f function-graph: Init curr_ret_stack with ret_stack
commit ea14eb7140 upstream.

If the graph tracer is active, and a task is forked but the allocating of
the processes graph stack fails, it can cause crash later on.

This is due to the temporary stack being NULL, but the curr_ret_stack
variable is copied from the parent. If it is not -1, then in
ftrace_graph_probe_sched_switch() the following:

	for (index = next->curr_ret_stack; index >= 0; index--)
		next->ret_stack[index].calltime += timestamp;

Will cause a kernel OOPS.

Found with Li Zefan's ftrace_stress_test.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:16 -07:00
Frederic Weisbecker
65bddd3610 hw-breakpoints: Remove stub unthrottle callback
commit 1e259e0a99 upstream.

We support event unthrottling in breakpoint events. It means
that if we have more than sysctl_perf_event_sample_rate/HZ,
perf will throttle, ignoring subsequent events until the next
tick.

So if ptrace exceeds this max rate, it will omit events, which
breaks the ptrace determinism that is supposed to report every
triggered breakpoints. This is likely to happen if we set
sysctl_perf_event_sample_rate to 1.

This patch removes support for unthrottling in breakpoint
events to break throttling and restore ptrace determinism.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:15 -07:00
Frederic Weisbecker
1dc877c1e8 x86/stacktrace: Don't dereference bad frame pointers
commit 29044ad150 upstream.

Callers of a stacktrace might pass bad frame pointers. Those
are usually checked for safety in stack walking helpers before
any dereferencing, but this is not the case when we need to go
through one more frame pointer that backlinks the irq stack to
the previous one, as we don't have any reliable address boudaries
to compare this frame pointer against.

This raises crashes when we record callchains for ftrace events
with perf because we don't use the right helpers to capture
registers there. We get wrong frame pointers as we call
task_pt_regs() even on kernel threads, which is a wrong thing
as it gives us the initial state of any kernel threads freshly
created. This is even not what we want for user tasks. What we want
is a hot snapshot of registers when the ftrace event triggers, not
the state before a task entered the kernel.

This requires more thoughts to do it correctly though.
So first put a guardian to ensure the given frame pointer
can be dereferenced to avoid crashes. We'll think about how to fix
the callers in a subsequent patch.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:15 -07:00
Suresh Siddha
429f074ef0 x86_64, cpa: Don't work hard in preserving kernel 2M mappings when using 4K already
commit 281ff33b7c upstream.

We currently enforce the !RW mapping for the kernel mapping that maps
holes between different text, rodata and data sections. However, kernel
identity mappings will have different RWX permissions to the pages mapping to
text and to the pages padding (which are freed) the text, rodata sections.
Hence kernel identity mappings will be broken to smaller pages. For 64-bit,
kernel text and kernel identity mappings are different, so we can enable
protection checks that come with CONFIG_DEBUG_RODATA, as well as retain 2MB
large page mappings for kernel text.

Konrad reported a boot failure with the Linux Xen paravirt guest because of
this. In this paravirt guest case, the kernel text mapping and the kernel
identity mapping share the same page-table pages. Thus forcing the !RW mapping
for some of the kernel mappings also cause the kernel identity mappings to be
read-only resulting in the boot failure. Linux Xen paravirt guest also
uses 4k mappings and don't use 2M mapping.

Fix this issue and retain large page performance advantage for native kernels
by not working hard and not enforcing !RW for the kernel text mapping,
if the current mapping is already using small page mapping.

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1266522700.2909.34.camel@sbs-t61.sc.intel.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:15 -07:00
Lai Jiangshan
01a8e5d5cc ring-buffer: Move disabled check into preempt disable section
commit 52fbe9cde7 upstream.

The ring buffer resizing and resetting relies on a schedule RCU
action. The buffers are disabled, a synchronize_sched() is called
and then the resize or reset takes place.

But this only works if the disabling of the buffers are within the
preempt disabled section, otherwise a window exists that the buffers
can be written to while a reset or resize takes place.

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B949E43.2010906@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:14 -07:00
Bob Copeland
cd8c2b3ae7 ath5k: fix setup for CAB queue
commit a951ae2176 upstream.

The beacon sent gating doesn't seem to work with any combination
of flags.  Thus, buffered frames tend to stay buffered forever,
using up tx descriptors.

Instead, use the DBA gating and hold transmission of the buffered
frames until 80% of the beacon interval has elapsed using the ready
time.  This fixes the following error in AP mode:

   ath5k phy0: no further txbuf available, dropping packet

Add a comment to acknowledge that this isn't the best solution.

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Acked-by: Nick Kossifidis <mickflemm@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:14 -07:00
Bob Copeland
d50ef74730 ath5k: dont use external sleep clock in AP mode
commit 5d6ce628f9 upstream.

When using the external sleep clock in AP mode, the
TSF increments too quickly, causing beacon interval
to be much lower than it is supposed to be, resulting
in lots of beacon-not-ready interrupts.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=14802.

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Acked-by: Nick Kossifidis <mickflemm@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:13 -07:00
Bruno Randolf
b55c7626ed ath5k: fix I/Q calibration (for real)
commit 86415d43ef upstream.

I/Q calibration was completely broken, resulting in a high number of CRC errors
on received packets. before i could see around 10% to 20% CRC errors, with this
patch they are between 0% and 3%.

1.) the removal of the mask in commit "ath5k: Fix I/Q calibration
(f1cf2dbd0f)" resulted in no mask beeing used
when writing the I/Q values into the register. additional errors in the
calculation of the values (see 2.) resulted too high numbers, exceeding the
masks, so wrong values like 0xfffffffe were written. to be safe we should
always use the bitmask when writing parts of a register.

2.) using a (s32) cast for q_coff is a wrong conversion to signed, since we
convert to a signed value later by substracting 128. this resulted in too low
numbers for Q many times, which were limited to -16 by the boundary check later
on.

3.) checked everything against the HAL sources and took over comments and minor
optimizations from there.

4.) we can't use ENABLE_BITS when we want to write a number (the number can
contain zeros). also always write the correction values first and set ENABLE
bit last, like the HAL does.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Acked-by: Nick Kossifidis <mickflemm@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:13 -07:00
Jean Delvare
266623b31c i2c-i801: Don't use the block buffer for I2C block writes
commit c074c39d62 upstream.

Experience has shown that the block buffer can only be used for SMBus
(not I2C) block transactions, even though the datasheet doesn't
mention this limitation.

Reported-by: Felix Rubinstein <felixru@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Oleg Ryjkov <oryjkov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:12 -07:00
Jean Delvare
a9eaacfdf1 i2c-powermac: Be less verbose in the absence of real errors.
commit 8e4b980c28 upstream.

Be less verbose in the absence of real errors. We don't have to report
failed probes to the users, it's only confusing them.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Andrey Gusev <ronne@list.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:12 -07:00
Christoph Fritz
ad39fbecc3 Input: i8042 - add ALDI/MEDION netbook E1222 to qurik reset table
commit 31968ecf58 upstream.

ALDI/MEDION netbook E1222 needs to be in the reset quirk list for
its touchpad's proper function.

Reported-by: Michael Fischer <mifi@gmx.de>
Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:12 -07:00
Thomas Bächler
73c6d0219f Input: alps - add support for the touchpad on Toshiba Tecra A11-11L
commit eb8bff85c5 upstream.

Signed-off-by: Thomas Bächler <thomas@archlinux.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:11 -07:00
john stultz
9e6776b665 timekeeping: Prevent oops when GENERIC_TIME=n
commit ad6759fbf3 upstream.

Aaro Koskinen reported an issue in kernel.org bugzilla #15366, where
on non-GENERIC_TIME systems, accessing
/sys/devices/system/clocksource/clocksource0/current_clocksource
results in an oops.

It seems the timekeeper/clocksource rework missed initializing the
curr_clocksource value in the !GENERIC_TIME case.

Thanks to Aaro for reporting and diagnosing the issue as well as
testing the fix!

Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
LKML-Reference: <1267475683.4216.61.camel@localhost.localdomain>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:11 -07:00
Takashi Iwai
9ba9ca73e6 ALSA: hda - Fix input source elements of secondary ADCs on Realtek
commit 5311114d48 upstream.

Since alc_auto_create_input_ctls() doesn't set the elements for the
secondary ADCs, "Input Source" elemtns for these also get empty, resulting
in buggy outputs of alsactl like:
	control.14 {
		comment.access 'read write'
		comment.type ENUMERATED
		comment.count 1
		iface MIXER
		name 'Input Source'
		index 1
		value 0
	}

This patch fixes alc_mux_enum_*() (and others) to fall back to the
first entry if the secondary input mux is empty.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:10 -07:00
Ralf Gerbig
f46a0cd9a4 ALSA: hda - Sound MSI fallout on a Asus mobo NVIDIA MCP55
commit ecd216260f upstream.

without the following patch audio ssttuutteerrs on
ASUS M2N32-SLI PREMIUM ACPI BIOS Revision 1304
the sound device is:
00:0e.1 Audio device: nVidia Corporation MCP55 High Definition Audio (rev a2)
worked with 2.6.32

Signed-off-by: Ralf Gerbig <rge@quengel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:10 -07:00
Louis Rilling
58a60dad3f tg3: Fix tg3_poll_controller() passing wrong pointer to tg3_interrupt()
commit fe234f0e5c upstream.

Commit 09943a1819
	Author: Matt Carlson <mcarlson@broadcom.com>
	Date:   Fri Aug 28 14:01:57 2009 +0000

	tg3: Convert ISR parameter to tnapi

forgot to update tg3_poll_controller(), leading to intermittent crashes with
netpoll.

Fix this.

Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:10 -07:00
Sujith
5421e40def mac80211: Fix HT rate control configuration
commit 4fa0043731 upstream.

Handling HT configuration changes involved setting the channel
with the new HT parameters and then issuing a rate_update()
notification to the driver.

This behavior changed after the off-channel changes. Now, the channel
is not updated with the new HT params in enable_ht() - instead, it
is now done when the scan work terminates. This results in the driver
depending on stale information, defaulting to non-HT mode always.

Fix this by passing the new channel type to the driver.

Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:09 -07:00
Russell King
6c6a223088 ARM: Fix decompressor's kernel size estimation for ROM=y
commit 98e12b5a6e upstream.

Commit 2552fc2 changed the way the decompressor decides if it is safe
to decompress the kernel directly to its final location.  Unfortunately,
it took the top of the compressed data as being the stack pointer,
which it is for ROM=n cases.  However, for ROM=y, the stack pointer
is not relevant, and results in the wrong answer.

Fix this by explicitly storing the end of the biggybacked data in the
decompressor, and use that to calculate the compressed image size.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:09 -07:00
Russell King
d47325e5f1 decompress: fix new decompressor for PIC
commit 5ceaa2f39b upstream.

The ARM kernel decompressor wants to be able to relocate r/w data
independently from the rest of the image, and we do this by ensuring that
r/w data has global visibility.  Define STATIC_RW_DATA to be empty to
achieve this.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Alain Knaff <alain@knaff.lu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:08 -07:00
Julia Lawall
997c7813ce drivers/scsi/ses.c: eliminate double free
commit 9b3a6549b2 upstream.

The few lines below the kfree of hdr_buf may go to the label err_free
which will also free hdr_buf.  The most straightforward solution seems to
be to just move the kfree of hdr_buf after these gotos.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier E;
expression E1;
iterator I;
statement S;
@@

*kfree(E);
... when != E = E1
    when != I(E,...) S
    when != &E
*kfree(E);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01 16:01:08 -07:00
Greg Kroah-Hartman
dbdafe5ccf Linux 2.6.33.1 v2.6.33.1 2010-03-15 09:09:39 -07:00
Ian Campbell
4f92d68b54 x86, mm: Allow highmem user page tables to be disabled at boot time
commit 1431559200 upstream.

Distros generally (I looked at Debian, RHEL5 and SLES11) seem to
enable CONFIG_HIGHPTE for any x86 configuration which has highmem
enabled. This means that the overhead applies even to machines which
have a fairly modest amount of high memory and which therefore do not
really benefit from allocating PTEs in high memory but still pay the
price of the additional mapping operations.

Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but
no actual highptes being allocated there was a reduction in system
time used from 59.737s to 55.9s.

With CONFIG_HIGHPTE=y and highmem PTEs being allocated:
  Average Optimal load -j 4 Run (std deviation):
  Elapsed Time 175.396 (0.238914)
  User Time 515.983 (5.85019)
  System Time 59.737 (1.26727)
  Percent CPU 263.8 (71.6796)
  Context Switches 39989.7 (4672.64)
  Sleeps 42617.7 (246.307)

With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated:
  Average Optimal load -j 4 Run (std deviation):
  Elapsed Time 174.278 (0.831968)
  User Time 515.659 (6.07012)
  System Time 55.9 (1.07799)
  Percent CPU 263.8 (71.266)
  Context Switches 39929.6 (4485.13)
  Sleeps 42583.7 (373.039)

This patch allows the user to control the allocation of PTEs in
highmem from the command line ("userpte=nohigh") but retains the
status-quo as the default.

It is possible that some simple heuristic could be developed which
allows auto-tuning of this option however I don't have a sufficiently
large machine available to me to perform any particularly meaningful
experiments. We could probably handwave up an argument for a threshold
at 16G of total RAM.

Assuming 768M of lowmem we have 196608 potential lowmem PTE
pages. Each page can map 2M of RAM in a PAE-enabled configuration,
meaning a maximum of 384G of RAM could potentially be mapped using
lowmem PTEs.

Even allowing generous factor of 10 to account for other required
lowmem allocations, generous slop to account for page sharing (which
reduces the total amount of RAM mappable by a given number of PT
pages) and other innacuracies in the estimations it would seem that
even a 32G machine would not have a particularly pressing need for
highmem PTEs. I think 32G could be considered to be at the upper bound
of what might be sensible on a 32 bit machine (although I think in
practice 64G is still supported).

It's seems questionable if HIGHPTE is even a win for any amount of RAM
you would sensibly run a 32 bit kernel on rather than going 64 bit.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:14 -07:00
Thomas Gleixner
371f1d1775 sched: Don't use possibly stale sched_class
commit 83ab0aa0d5 upstream.

setscheduler() saves task->sched_class outside of the rq->lock held
region for a check after the setscheduler changes have become
effective. That might result in checking a stale value.

rtmutex_setprio() has the same problem, though it is protected by
p->pi_lock against setscheduler(), but for correctness sake (and to
avoid bad examples) it needs to be fixed as well.

Retrieve task->sched_class inside of the rq->lock held region.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:14 -07:00
Suresh Siddha
51c80d13a1 sched: Fix SMT scheduler regression in find_busiest_queue()
commit 9000f05c6d upstream.

Fix a SMT scheduler performance regression that is leading to a scenario
where SMT threads in one core are completely idle while both the SMT threads
in another core (on the same socket) are busy.

This is caused by this commit (with the problematic code highlighted)

   commit bdb94aa5db
   Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
   Date:   Tue Sep 1 10:34:38 2009 +0200

   sched: Try to deal with low capacity

   @@ -4203,15 +4223,18 @@ find_busiest_queue()
   ...
	for_each_cpu(i, sched_group_cpus(group)) {
   +	unsigned long power = power_of(i);

   ...

   -	wl = weighted_cpuload(i);
   +	wl = weighted_cpuload(i) * SCHED_LOAD_SCALE;
   +	wl /= power;

   -	if (rq->nr_running == 1 && wl > imbalance)
   +	if (capacity && rq->nr_running == 1 && wl > imbalance)
		continue;

On a SMT system, power of the HT logical cpu will be 589 and
the scheduler load imbalance (for scenarios like the one mentioned above)
can be approximately 1024 (SCHED_LOAD_SCALE). The above change of scaling
the weighted load with the power will result in "wl > imbalance" and
ultimately resulting in find_busiest_queue() return NULL, causing
load_balance() to think that the load is well balanced. But infact
one of the tasks can be moved to the idle core for optimal performance.

We don't need to use the weighted load (wl) scaled by the cpu power to
compare with  imabalance. In that condition, we already know there is only a
single task "rq->nr_running == 1" and the comparison between imbalance,
wl is to make sure that we select the correct priority thread which matches
imbalance. So we really need to compare the imabalnce with the original
weighted load of the cpu and not the scaled load.

But in other conditions where we want the most hammered(busiest) cpu, we can
use scaled load to ensure that we consider the cpu power in addition to the
actual load on that cpu, so that we can move the load away from the
guy that is getting most hammered with respect to the actual capacity,
as compared with the rest of the cpu's in that busiest group.

Fix it.

Reported-by: Ma Ling <ling.ma@intel.com>
Initial-Analysis-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1266023662.2808.118.camel@sbs-t61.sc.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:13 -07:00
Vaidyanathan Srinivasan
39226dabf8 sched: Fix sched_mv_power_savings for !SMT
commit 28f5318167 upstream.

Fix for sched_mc_powersavigs for pre-Nehalem platforms.
Child sched domain should clear SD_PREFER_SIBLING if parent will have
SD_POWERSAVINGS_BALANCE because they are contradicting.

Sets the flags correctly based on sched_mc_power_savings.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100208100555.GD2931@dirshya.in.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:12 -07:00
Gleb Natapov
065bc5cd99 KVM: x86 emulator: Check CPL level during privilege instruction emulation
commit e92805ac12 upstream.

Add CPL checking in case emulator is tricked into emulating
privilege instruction from userspace.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:11 -07:00
Gleb Natapov
5832b65a5c KVM: x86 emulator: Add group9 instruction decoding
commit 60a29d4ea4 upstream.

Use groups mechanism to decode 0F C7 instructions.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:11 -07:00
Gleb Natapov
8bc4cf28e8 KVM: x86 emulator: Forbid modifying CS segment register by mov instruction
commit 8b9f44140b upstream.

Inject #UD if guest attempts to do so. This is in accordance to Intel
SDM.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:11 -07:00
Gleb Natapov
1c731613a9 KVM: x86 emulator: Add group8 instruction decoding
commit 2db2c2eb62 upstream.

Use groups mechanism to decode 0F BA instructions.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:10 -07:00
Sheng Yang
36725f4d96 KVM: VMX: Trap and invalid MWAIT/MONITOR instruction
commit 59708670b6 upstream.

We don't support these instructions, but guest can execute them even if the
feature('monitor') haven't been exposed in CPUID. So we would trap and inject
a #UD if guest try this way.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:09 -07:00
Mike Snitzer
6c46bff05f dm ioctl: only issue uevent on resume if state changed
commit 0f3649a9e3 upstream.

Only issue a uevent on a resume if the state of the device changed,
i.e. if it was suspended and/or its table was replaced.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:08 -07:00
Mikulas Patocka
f64089821b dm: free dm_io before bio_endio not after
commit a97f925a32 upstream.

Free the dm_io structure before calling bio_endio() instead of after it,
to ensure that the io_pool containing it is not referenced after it is
freed.

This partially fixes a problem described here
  https://www.redhat.com/archives/dm-devel/2010-February/msg00109.html

thread 1:
bio_endio(bio, io_error);
/* scheduling happens */
					thread 2:
					close the device
					remove the device
thread 1:
free_io(md, io);

Thread 2, when removing the device, sees non-empty md->io_pool (because the
io hasn't been freed by thread 1 yet) and may crash with BUG in mempool_free.
Thread 1 may also crash, when freeing into a nonexisting mempool.

To fix this we must make sure that bio_endio() is the last call and
the md structure is not accessed afterwards.

There is another bio_endio in process_barrier, but it is called from the thread
and the thread is destroyed prior to freeing the mempools, so this call is
not affected by the bug.

A similar bug exists with module unloads - the module may be unloaded
immediately after bio_endio - but that is more difficult to fix.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:08 -07:00
Trond Myklebust
53165f58af NFS: Fix an allocation-under-spinlock bug
commit ebed9203b6 upstream.

sunrpc_cache_update() will always call detail->update() from inside the
detail->hash_lock, so it cannot allocate memory.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:07 -07:00
James Hogan
1f2ba616b4 rtc-coh901331: fix braces in resume code
commit 5a98c04d78 upstream.

The else part of the if statement is indented but does not have braces
around it. It clearly should since it uses clk_enable and clk_disable
which are supposed to balance.

Signed-off-by: James Hogan <james@albanarts.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:05 -07:00
Joe Perches
c54f376268 scripts/get_maintainer.pl: fix possible infinite loop
commit 3c840c18bc upstream.

If MAINTAINERS section entries are misformatted, it was possible to have
an infinite loop.

Correct the defect by always moving the index to the end of section + 1

Also, exit check for exclude as soon as possible.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:05 -07:00
Lars-Peter Clausen
a8fe08331a s3cmci: initialize default platform data no_wprotect and no_detect with 1
commit c212808a1b upstream.

If no platform_data was givin to the device it's going to use it's default
platform data struct which has all fields initialized to zero.  As a
result the driver is going to try to request gpio0 both as write protect
and card detect pin.  Which of course will fail and makes the driver
unusable

Previously to the introduction of no_wprotect and no_detect the behavior
was to assume that if no platform data was given there is no write protect
or card detect pin.  This patch restores that behavior.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:04 -07:00
Lars-Peter Clausen
b4705ec3af s3cmci: s3cmci_card_present: Use no_detect to decide whether there is a card detect pin
commit dc2ed55280 upstream.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:04 -07:00
Trond Myklebust
daaeb8a821 SUNRPC: Handle EINVAL error returns from the TCP connect operation
commit 9fcfe0c83c upstream.

This can, for instance, happen if the user specifies a link local IPv6
address.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:03 -07:00
Neil Brown
f774f57a79 sunrpc: remove unnecessary svc_xprt_put
commit ab1b18f70a upstream.

The 'struct svc_deferred_req's on the xpt_deferred queue do not
own a reference to the owning xprt.  This is seen in svc_revisit
which is where things are added to this queue.  dr->xprt is set to
NULL and the reference to the xprt it put.

So when this list is cleaned up in svc_delete_xprt, we mustn't
put the reference.

Also, replace the 'for' with a 'while' which is arguably
simpler and more likely to compile efficiently.

Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:01 -07:00
Alex Deucher
293ad92768 drm/radeon/kms/atom: fix shr/shl ops
commit 6a8a2d702b upstream.

The whole attribute table is valid for
shr/shl ops.

Fixes fdo bug 26668

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:01 -07:00
Maarten Maathuis
07406de2b6 drm/ttm: handle OOM in ttm_tt_swapout
commit 290e55056e upstream.

- Without this change I get a general protection fault.
- Also use PTR_ERR where applicable.

Signed-off-by: Maarten Maathuis <madman2003@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:00 -07:00
Zhao Yakui
190ef1c7ed drm/i915: Use a dmi quirk to skip a broken SDVO TV output.
commit 6070a4a928 upstream.

This IBM system has a multi-function SDVO card that reports both VGA
and TV, but the system has no TV connector.  The TV connector always
reported as connected, which would lead to poor modesetting choices.

https://bugs.freedesktop.org/show_bug.cgi?id=25787

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Tested-by: Vance <liangghv@sg.ibm.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:07:00 -07:00
Jan Dumon
011ee01431 USB: unusual_devs: Add support for multiple Option 3G sticks
commit 46216e4fbe upstream.

Enable the SD-Card interface on multiple Option 3G sticks.
The unusual_devs.h entry is necessary because the device descriptor is
vendor-specific. That prevents usb-storage from binding to it as an interface
driver.

Signed-off-by: Jan Dumon <j.dumon@option.com>
Signed-off-by: Phil Dibowitz <phil@ipom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:06:59 -07:00
Alan Cox
40d6d89c48 USB: cp210x: Add 81E8 (Zephyr Bioharness)
commit bd07c551aa upstream.

As reported in
http://bugzilla.kernel.org/show_bug.cgi?id=10980

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:06:59 -07:00
Daniel Sangorrin
6d73d95b3a USB: serial: ftdi: add CONTEC vendor and product id
commit 46b72d78cb upstream.

This is a patch to ftdi_sio_ids.h and ftdi_sio.c that adds
identifiers for CONTEC USB serial converter. I tested it
with the device COM-1(USB)H

Signed-off-by: Daniel Sangorrin <daniel.sangorrin@gmail.com>
Cc: Andreas Mohr <andi@lisas.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 09:06:58 -07:00