commit 8c3ba8d049 upstream.
When the SMP kernel decides to crash_kexec() the local APICs may have
pending interrupts in their vector tables.
The setup routine for the local APIC has a deficient mechanism for
clearing these interrupts, it only handles interrupts that has already
been dispatched to the local core for servicing (the ISR register) safely,
it doesn't consider lower prioritized queued interrupts stored in the IRR
register.
If you have more than one pending interrupt within the same 32 bit word in
the LAPIC vector table registers you may find yourself entering the IO
APIC setup with pending interrupts left in the LAPIC. This is a situation
for wich the IO APIC setup is not prepared. Depending of what/which
interrupt vector/vectors are stuck in the APIC tables your system may show
various degrees of malfunctioning. That was the reason why the
check_timer() failed in our system, the timer interrupts was blocked by
pending interrupts from the old kernel when routed trough the IO APIC.
Additional comment from Jiri Bohac:
==============
If this should go into stable release,
I'd add some kind of limit on the number of iterations, just to be safe from
hard to debug lock-ups:
+if (loops++ > MAX_LOOPS) {
+ printk("LAPIC pending clean-up")
+ break;
+}
while (queued);
with MAX_LOOPS something like 1E9 this would leave plenty of time for the
pending IRQs to be cleared and would and still cause at most a second of delay
if the loop were to lock-up for whatever reason.
[trenn@suse.de:
V2: Use tsc if avail to bail out after 1 sec due to possible virtual
apic_read calls which may take rather long (suggested by: Avi Kivity
<avi@redhat.com>) If no tsc is available bail out quickly after
cpu_khz, if we broke out too early and still have irqs pending (which
should never happen?) we still get a WARN_ON...
V3: - Fixed indentation -> checkpatch clean
- max_loops must be signed
V4: - Fix typo, mixed up tsc and ntsc in first rdtscll() call
V5: Adjust WARN_ON() condition to also catch error in cpu_has_tsc case]
Cc: <jbohac@novell.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
LKML-Reference: <201005241913.o4OJDGWM010865@imap1.linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit c686ecf504 upstream.
Commit e32e78c5ee
(powerpc: fix build with make 3.82) introduced a
typo in uImage target and broke building uImage:
make: *** No rule to make target `uImage'. Stop.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit a7c55cbee0 upstream.
Newer Intel processors identifying themselves as model 30 are not recognized by
oprofile.
<cpuinfo snippet>
model : 30
model name : Intel(R) Xeon(R) CPU X3470 @ 2.93GHz
</cpuinfo snippet>
Running oprofile on these machines gives the following:
+ opcontrol --init
+ opcontrol --list-events
oprofile: available events for CPU type "Intel Architectural Perfmon"
See Intel 64 and IA-32 Architectures Software Developer's Manual
Volume 3B (Document 253669) Chapter 18 for architectural perfmon events
This is a limited set of fallback events because oprofile doesn't know your CPU
CPU_CLK_UNHALTED: (counter: all)
Clock cycles when not halted (min count: 6000)
INST_RETIRED: (counter: all)
number of instructions retired (min count: 6000)
LLC_MISSES: (counter: all)
Last level cache demand requests from this core that missed the LLC
(min count: 6000)
Unit masks (default 0x41)
----------
0x41: No unit mask
LLC_REFS: (counter: all)
Last level cache demand requests from this core (min count: 6000)
Unit masks (default 0x4f)
----------
0x4f: No unit mask
BR_MISS_PRED_RETIRED: (counter: all)
number of mispredicted branches retired (precise) (min count: 500)
+ opcontrol --shutdown
Tested using oprofile 0.9.6.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 45c34e05c4 upstream.
Back when the patch was submitted for "Add Xeon 7500 series support to
oprofile", Robert Richter had asked for a followon patch that
converted all the CPU ID values to hex.
I have done that here for the "i386/core_i7" and "i386/atom" class
processors in the ppro_init() function and also added some comments on
where to find documentation on the Intel processors.
Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[ Upstream commits 86fa04b874
and b10f997bb0 ]
Should return 'long' instead of 'int'.
Thanks to Dimitris Michailidis and Tony Luck.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[ Upstream commit ef201bebe5 ]
As noticed by Linus, it is critical that some of the
rwsem constants be signed. Yet, hex constants are
unsigned unless explicitly casted or negated.
The most critical one is RWSEM_WAITING_BIAS.
This bug was exacerbated by commit
424acaaeb3 ("rwsem: wake queued readers
when writer blocks on active read lock")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[ Upstream commit bf8253bf5e7cfe17dd53e3f6340a45b11d9fb51c ]
SunBlade-2500 has 'parallel' device node with compatible
property "pnpALI,1533,3" so add that to the ID table.
Reported-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit d7c53c9e82 upstream.
When testing cpu hotplug code on 32-bit we kept hitting the "CPU%d:
Stuck ??" message due to multiple cores concurrently accessing the
cpu_callin_mask, among others.
Since these codepaths are not protected from concurrent access due to
the fact that there's no sane reason for making an already complex
code unnecessarily more complex - we hit the issue only when insanely
switching cores off- and online - serialize hotplugging cores on the
sysfs level and be done with it.
[ v2.1: fix !HOTPLUG_CPU build ]
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100819181029.GC17171@aftab>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 9605456919 upstream.
It's wrong for several reasons, but the most direct one is that the
fault may be for the stack accesses to set up a previous SIGBUS. When
we have a kernel exception, the kernel exception handler does all the
fixups, not some user-level signal handler.
Even apart from the nested SIGBUS issue, it's also wrong to give out
kernel fault addresses in the signal handler info block, or to send a
SIGBUS when a system call already returns EFAULT.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 8d3d50bf19 upstream.
During a EEH recover, the pci_dev structure can be null, mainly if an
eeh event is detected during cpi config operation. In this case, the
pci_dev will not be known (and will be null) the kernel will crash
with the following message:
Unable to handle kernel paging request for data at address 0x000000a0
Faulting instruction address: 0xc00000000006b8b4
Oops: Kernel access of bad area, sig: 11 [#1]
NIP [c00000000006b8b4] .eeh_event_handler+0x10c/0x1a0
LR [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
Call Trace:
[c0000003a80dff00] [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
[c0000003a80dff90] [c000000000031f1c] .kernel_thread+0x54/0x70
The bug occurs because pci_name() tries to access a null pointer.
This patch just guarantee that pci_name() is not called on Null pointers.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit fad539956c upstream.
Iranna D Ankad reported that IBM x3950 systems have boot
problems after this commit:
|
| commit b9c61b7007
|
| x86/pci: update pirq_enable_irq() to setup io apic routing
|
The problem is that with the patch, the machine freezes when
console=ttyS0,... kernel serial parameter is passed.
It seem to freeze at DVD initialization and the whole problem
seem to be DVD/pata related, but somehow exposed through the
serial parameter.
Such apic problems can expose really weird behavior:
ACPI: IOAPIC (id[0x10] address[0xfecff000] gsi_base[0])
IOAPIC[0]: apic_id 16, version 0, address 0xfecff000, GSI 0-2
ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[3])
IOAPIC[1]: apic_id 15, version 0, address 0xfec00000, GSI 3-38
ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[39])
IOAPIC[2]: apic_id 14, version 0, address 0xfec01000, GSI 39-74
ACPI: INT_SRC_OVR (bus 0 bus_irq 1 global_irq 4 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 5 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 6 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 4 global_irq 7 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 6 global_irq 9 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 7 global_irq 10 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 11 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 12 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 12 global_irq 15 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 13 global_irq 16 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 17 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 18 dfl dfl)
It turns out that the system has three io apic controllers, but
boot ioapic routing is in the second one, and that gsi_base is
not 0 - it is using a bunch of INT_SRC_OVR...
So these recent changes:
1. one set routing for first io apic controller
2. assume irq = gsi
... will break that system.
So try to remap those gsis, need to seperate boot_ioapic_idx
detection out of enable_IO_APIC() and call them early.
So introduce boot_ioapic_idx, and remap_ioapic_gsi()...
-v2: shift gsi with delta instead of gsi_base of boot_ioapic_idx
-v3: double check with find_isa_irq_apic(0, mp_INT) to get right
boot_ioapic_idx
-v4: nr_legacy_irqs
-v5: add print out for boot_ioapic_idx, and also make it could be
applied for current kernel and previous kernel
-v6: add bus_irq, in acpi_sci_ioapic_setup, so can get overwride
for sci right mapping...
-v7: looks like pnpacpi get irq instead of gsi, so need to revert
them back...
-v8: split into two patches
-v9: according to Eric, use fixed 16 for shifting instead of remap
-v10: still need to touch rsparser.c
-v11: just revert back to way Eric suggest...
anyway the ioapic in first ioapic is blocked by second...
-v12: two patches, this one will add more loop but check apic_id and irq > 16
Reported-by: Iranna D Ankad <iranna.ankad@in.ibm.com>
Bisected-by: Iranna D Ankad <iranna.ankad@in.ibm.com>
Tested-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: len.brown@intel.com
LKML-Reference: <4B8A321A.1000008@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit ea0854170c upstream.
Newly added memory can not be accessed via /dev/mem, because we do not
update the variables high_memory, max_pfn and max_low_pfn.
Add a function update_end_of_memory_vars() to update these variables for
64-bit kernels.
[akpm@linux-foundation.org: simplify comment]
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Li Haicheng <haicheng.li@intel.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit e2a465675d upstream.
It's possible that SBA IOMMU might fail to find I/O space under heavy
I/Os. SBA IOMMU panics on allocation failure but it shouldn't; drivers
can handle the failure. The majority of other IOMMU drivers don't panic
on allocation failure.
This patch fixes SBA IOMMU path to handle allocation failure properly.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Leonardo Chiquitto <lchiquitto@novell.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 0696b711e4 upstream.
Since commit 0a544198 "timekeeping: Move NTP adjusted clock multiplier
to struct timekeeper" the clock multiplier of vsyscall is updated with
the unmodified clock multiplier of the clock source and not with the
NTP adjusted multiplier of the timekeeper.
This causes user space observerable time warps:
new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf
Add a new argument "mult" to update_vsyscall() and hand in the
timekeeping internal NTP adjusted multiplier.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 3c5d92a0cf upstream.
Allow the architecture to request a normal jiffy tick when the system
goes idle and tick_nohz_stop_sched_tick is called . On s390 the hook is
used to prevent the system going fully idle if there has been an
interrupt other than a clock comparator interrupt since the last wakeup.
On s390 the HiperSockets response time for 1 connection ping-pong goes
down from 42 to 34 microseconds. The CPU cost decreases by 27%.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
LKML-Reference: <20090929122533.402715150@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Jolly <jjolly@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 868003ca7a upstream.
This is a follow up to
14cb0de (arm/imx/gpio: add spinlock protection)
and fixes the following build failure:
CC arch/arm/mach-imx/pcm970-baseboard.o
In file included from arch/arm/include/asm/gpio.h:6,
from include/linux/gpio.h:8,
from arch/arm/mach-imx/pcm970-baseboard.c:20:
arch/arm/plat-mxc/include/mach/gpio.h:40: error: expected specifier-qualifier-list before 'spinlock_t'
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
commit cdf357f1e1 upstream.
On versions of the Cortex-A9 prior to r2p0, performing TLB invalidations by
ASID match can result in the incorrect ASID being broadcast to other CPUs.
As a consequence of this, the targetted TLB entries are not invalidated
across the system.
This workaround changes the TLB flushing routines to invalidate entries
regardless of the ASID.
Tested-by: Rob Clark <rob@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 9f242dc10e upstream.
When running on VMware's platform, we have seen situations where
the AP's try to calibrate the lpj values and fail to get good calibration
runs becasue of timing issues. As a result delays don't work correctly
on all cpus.
The solutions is to set preset_lpj value based on the current tsc frequency
value. This is similar to what KVM does as well.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
LKML-Reference: <1280790637.14933.29.camel@ank32.eng.vmware.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit e32e78c5ee upstream.
Thomas Backlund reported that the powerpc build broke with make 3.82.
It failed with the following message:
arch/powerpc/Makefile:183: *** mixed implicit and normal rules. Stop.
The fix is to avoid mixing non-wildcard and wildcard targets.
Reported-by: Thomas Backlund <tmb@mandriva.org>
Tested-by: Thomas Backlund <tmb@mandriva.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 8a22b9996b upstream.
xen_sched_clock only counts unstolen time. In principle this should
be useful to the Linux scheduler so that it knows how much time a process
actually consumed. But in practice this doesn't work very well as the
scheduler expects the sched_clock time to be synchronized between
cpus. It also uses sched_clock to measure the time a task spends
sleeping, in which case "unstolen time" isn't meaningful.
So just use plain xen_clocksource_read to return wallclock nanoseconds
for sched_clock.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit d9b68e5e88 upstream.
The firmware handles '\t' internally, so stop trying to emulate it
(which, incidentally, had a bug in it.)
Fixes a really weird hang at bootup in rcu_bootup_announce, which,
as far as I can tell, is the first printk in the core kernel to use
a tab as the first character.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 2e65a2075c upstream.
Commit 3fea60261e ("Input: twl40300-keypad - fix handling of "all
ground" rows") broke compilation as I managed to use non-existent
keycodes.
Reported-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit b70f4e85bf upstream.
Typo in down_spin() meant it only read the low 32 bits of the
"serve" value, instead of the full 64 bits. This results in the
system hanging when the values in ticket/serve get larger than
32-bits. A big enough system running the right test can hit this
in a just a few hours.
Broken since 883a3acf5b
[IA64] Re-implement spinaphores using ticket lock concepts
Reported via IRC by Bjorn Helgaas
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 95e8f634d7 upstream.
In the FPU emulator code of the MIPS, the Cause bits of the FCSR register
are not currently writeable by the ctc1 instruction. In odd corner cases,
this can cause problems. For example, a case existed where a divide-by-zero
exception was generated by the FPU, and the signal handler attempted to
restore the FPU registers to their state before the exception occurred. In
this particular setup, writing the old value to the FCSR register would
cause another divide-by-zero exception to occur immediately. The solution
is to change the ctc1 instruction emulator code to allow the Cause bits of
the FCSR register to be writeable. This is the behaviour of the hardware
that the code is emulating.
This problem was found by Shane McDonald, but the credit for the fix goes
to Kevin Kissell. In Kevin's words:
I submit that the bug is indeed in that ctc_op: case of the emulator. The
Cause bits (17:12) are supposed to be writable by that instruction, but the
CTC1 emulation won't let them be updated by the instruction. I think that
actually if you just completely removed lines 387-388 [...] things would
work a good deal better. At least, it would be a more accurate emulation of
the architecturally defined FPU. If I wanted to be really, really pedantic
(which I sometimes do), I'd also protect the reserved bits that aren't
necessarily writable.
Signed-off-by: Shane McDonald <mcdonald.shane@gmail.com>
To: anemo@mba.ocn.ne.jp
To: kevink@paralogos.com
To: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/1205/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit b6dacf63e9 upstream.
The ACPI spec tells us that the firmware will reenable SCI_EN on resume.
Reality disagrees in some cases. The ACPI spec tells us that the only way
to set SCI_EN is via an SMM call.
https://bugzilla.kernel.org/show_bug.cgi?id=13745 shows us that doing so
may break machines. Tracing the ACPI calls made by Windows shows that it
unconditionally sets SCI_EN on resume with a direct register write, and
therefore the overwhelming probability is that everything is fine with
this behaviour.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 718be4aaf3 upstream.
It turns out that there is a bit in the _CST for Intel FFH C3
that tells the OS if we should be checking BM_STS or not.
Linux has been unconditionally checking BM_STS.
If the chip-set is configured to enable BM_STS,
it can retard or completely prevent entry into
deep C-states -- as illustrated by turbostat:
http://userweb.kernel.org/~lenb/acpi/utils/pmtools/turbostat/
ref: Intel Processor Vendor-Specific ACPI Interface Specification
table 4 "_CST FFH GAS Field Encoding"
Bit 1: Set to 1 if OSPM should use Bus Master avoidance for this C-state
https://bugzilla.kernel.org/show_bug.cgi?id=15886
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Always invalidate spte and flush TLBs when changing page size, to make
sure different sized translations for the same address are never cached
in a CPU's TLB.
Currently the only case where this occurs is when a non-leaf spte pointer is
overwritten by a leaf, large spte entry. This can happen after dirty
logging is disabled on a memslot, for example.
Noticed by Andrea.
KVM-Stable-Tag
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
(cherry picked from commit 3be2264be3)
This patch implements a workaround for AMD erratum 383 into
KVM. Without this erratum fix it is possible for a guest to
kill the host machine. This patch implements the suggested
workaround for hypervisors which will be published by the
next revision guide update.
[jan: fix overflow warning on i386]
[xiao: fix unused variable warning]
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
(cherry picked from commit 67ec660777)
This patch moves handling of the MC vmexits to an earlier
point in the vmexit. The handle_exit function is too late
because the vcpu might alreadry have changed its physical
cpu.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
(cherry picked from commit fe5913e4e1)
If cr0.wp=0, we have to allow the guest kernel access to a page with pte.w=0.
We do that by setting spte.w=1, since the host cr0.wp must remain set so the
host can write protect pages. Once we allow write access, we must remove
user access otherwise we mistakenly allow the user to write the page.
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
(cherry picked from commit 69325a1225)
commit ff4878089e upstream.
hpet_disable is called unconditionally on machine reboot if hpet support
is compiled in the kernel.
hpet_disable only checks if the machine is hpet capable but doesn't make
sure that hpet has been initialized.
[ tglx: Made it a one liner and removed the redundant hpet_address check ]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Venkatesh Pallipadi <venki@google.com>
LKML-Reference: <alpine.DEB.2.00.1007211726240.22235@kaball-desktop>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 2503a5ecd8 upstream.
RealView boards with certain revisions of the L220 cache controller (ARM11*
processors only) may have issues (hardware deadlock) with the recent changes to
the mb() barrier implementation (DSB followed by an L2 cache sync). The patch
redefines the RealView ARM11MPCore mandatory barriers without the outer_sync()
call.
Tested-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 3fea60261e upstream.
The Nokia RX51 board code (arch/arm/mach-omap2/board-rx51-peripherals.c)
defines a key map for the matrix keypad keyboard. The hardware seems to
use all of the 8 rows and 8 columns of the keypad, although not all
possible locations are used.
The TWL4030 supports keypads with at most 8 rows and 8 columns. Most keys
are defined with a row and column number between 0 and 7, except
KEY(0xff, 2, KEY_F9),
KEY(0xff, 4, KEY_F10),
KEY(0xff, 5, KEY_F11),
which represent keycodes that should be emitted when entire row is
connected to the ground. since the driver handles this case as if we
had an extra column in the key matrix. Unfortunately we do not allocate
enough space and end up owerwriting some random memory.
Reported-and-tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit fd19dce7ac upstream.
Found one x2apic system kexec loop test failed
when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip)
first kernel can kexec second kernel, but second kernel can not kexec third one.
it can be duplicated on another system with BIOS preenabled x2apic.
First kernel can not kexec second kernel.
It turns out, when kernel boot with pre-enabled x2apic, it will not execute
disable_local_APIC on shutdown path.
when init_apic_mappings() is called in setup_arch, it will skip setting of
apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic())
Then later, disable_local_APIC() will bail out early because !apic_phys.
So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys.
another solution could be updating init_apic_mappings() to set apic_phys even
for preenabled x2apic system. Actually even for x2apic system, that lapic
address is mapped already in early stage.
BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4C3EB22B.3000701@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit d596043d71 upstream.
The x3950 family can have as many as 256 PCI buses in a single system, so
change the limits to the maximum. Since there can only be 256 PCI buses in one
domain, we no longer need the BUG_ON check.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
LKML-Reference: <20100701004519.GQ15515@tux1.beaverton.ibm.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 550f0d9222 upstream.
Clear the floating point exception flag before returning to
user space. This is needed, else the libc trampoline handler
may hit the same SIGFPE again while building up a trampoline
to a signal handler.
Fixes debian bug #559406.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes a bug in the KVM efer-msr write path. If a
guest writes to a reserved efer bit the set_efer function
injects the #GP directly. The architecture dependent wrmsr
function does not see this, assumes success and advances the
rip. This results in a #GP in the guest with the wrong rip.
This patch fixes this by reporting efer write errors back to
the architectural wrmsr function.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
(cherry picked from commit b69e8caef5)
Wallclock writing uses an unprotected global variable to hold the version;
this can cause one guest to interfere with another if both write their
wallclock at the same time.
Acked-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
(cherry picked from commit 9ed3c444ab)
On svm, kvm_read_pdptr() may require reading guest memory, which can sleep.
Push the spinlock into mmu_alloc_roots(), and only take it after we've read
the pdptr.
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
(cherry picked from commit 8facbbff07)
Per document, for feature control MSR:
Bit 1 enables VMXON in SMX operation. If the bit is clear, execution
of VMXON in SMX operation causes a general-protection exception.
Bit 2 enables VMXON outside SMX operation. If the bit is clear, execution
of VMXON outside SMX operation causes a general-protection exception.
This patch is to enable this kind of check with SMX for VMXON in KVM.
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
(cherry picked from commit cafd66595d)