commit ebb682f522 upstream.
The per_cpu cpuid4_info shared_map can contain stale data when CPUs are added
and removed.
The stale data can lead to a NULL pointer derefernce panic on a remove of a
CPU that has had siblings previously removed.
This patch resolves the panic by verifying a cpu is actually online before
adding it to the shared_cpu_map, only examining cpus that are part of
the same lower level cache, and by updating other siblings lowest level cache
maps when a cpu is added.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20091209183336.17855.98708.sendpatchset@prarit.bos.redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 7a0fc404ae upstream.
Atom erratum AAE44/AAF40/AAG38/AAH41:
"If software clears the PS (page size) bit in a present PDE (page
directory entry), that will cause linear addresses mapped through this
PDE to use 4-KByte pages instead of using a large page after old TLB
entries are invalidated. Due to this erratum, if a code fetch uses
this PDE before the TLB entry for the large page is invalidated then
it may fetch from a different physical address than specified by
either the old large page translation or the new 4-KByte page
translation. This erratum may also cause speculative code fetches from
incorrect addresses."
[http://download.intel.com/design/processor/specupdt/319536.pdf]
Where as commit 211b3d03c7 seems to
workaround errata AAH41 (mixed 4K TLBs) it reduces the window of
opportunity for the bug to occur and does not totally remove it. This
patch disables mixed 4K/4MB page tables totally avoiding the page
splitting and not tripping this processor issue.
This is based on an original patch by Colin King.
Originally-by: Colin Ian King <colin.king@canonical.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit cb19060abf upstream.
Final stage linking can fail with
arch/x86/built-in.o: In function `store_cache_disable':
intel_cacheinfo.c:(.text+0xc509): undefined reference to `amd_get_nb_id'
arch/x86/built-in.o: In function `show_cache_disable':
intel_cacheinfo.c:(.text+0xc7d3): undefined reference to `amd_get_nb_id'
when CONFIG_CPU_SUP_AMD is not enabled because the amd_get_nb_id
helper is defined in AMD-specific code but also used in generic code
(intel_cacheinfo.c). Reorganize the L3 cache index disable code under
CONFIG_CPU_SUP_AMD since it is AMD-only anyway.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100218184210.GF20473@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit f619b3d842 upstream.
The show/store_cache_disable routines depend unnecessarily on NUMA's
cpu_to_node and the disabling of cache indices broke when !CONFIG_NUMA.
Remove that dependency by using a helper which is always correct.
While at it, enable L3 Cache Index disable on rev D1 Istanbuls which
sport the feature too.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100218184339.GG20473@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 897de50e08 upstream.
The cache_disable_[01] attribute in
/sys/devices/system/cpu/cpu?/cache/index[0-3]/
is enabled on all cache levels although only L3 supports it. Add it only
to the cache level that actually supports it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-5-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit dcf39daf3d upstream.
* Correct the masks used for writing the cache index disable indices.
* Do not turn off L3 scrubber - it is not necessary.
* Make sure wbinvd is executed on the same node where the L3 is.
* Check for out-of-bounds values written to the registers.
* Make show_cache_disable hex values unambiguous
* Check for Erratum #388
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-4-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 14be1f7454 upstream.
On UV systems, the TSC is not synchronized across blades. The
sched_clock_cpu() function is returning values that can go
backwards (I've seen as much as 8 seconds) when switching
between cpus.
As each cpu comes up, early_init_intel() will currently set the
sched_clock_stable flag true. When mark_tsc_unstable() runs, it
clears the flag, but this only occurs once (the first time a cpu
comes up whose TSC is not synchronized with cpu 0). After this,
early_init_intel() will set the flag again as the next cpu comes
up.
Only set sched_clock_stable if tsc has not been marked unstable.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100301174815.GC8224@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 557a701c16 upstream.
Easy fix for a regression introduced in 2.6.31.
On managed CPUs the cpufreq.c core will call driver->exit(cpu) on the
managed cpus and powernow_k8 will free the core's data.
Later driver->get(cpu) function might get called trying to read out the
current freq of a managed cpu and the NULL pointer check does not work on
the freed object -> better set it to NULL.
->get() is unsigned and must return 0 as invalid frequency.
Reference:
http://bugzilla.kernel.org/show_bug.cgi?id=14391
Signed-off-by: Thomas Renninger <trenn@suse.de>
Tested-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit b160091802 upstream.
CONFIG_X86_CPU_DEBUG, which provides some parsed versions of the x86
CPU configuration via debugfs, has caused boot failures on real
hardware. The value of this feature has been marginal at best, as all
this information is already available to userspace via generic
interfaces.
Causes crashes that have not been fixed + minimal utility -> remove.
See the referenced LKML thread for more information.
Reported-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <alpine.LFD.2.00.1001221755320.13231@localhost.localdomain>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 6c56ccecf0 upstream.
Commit 83ce4009 did the following change
If the TSC is constant and non-stop, also set it reliable.
But, there seems to be few systems that will end up with TSC warp across
sockets, depending on how the cpus come out of reset. Skipping TSC sync
test on such systems may result in time inconsistency later.
So, reenable TSC sync test even on constant and non-stop TSC systems.
Set, sched_clock_stable to 1 by default and reset it in
mark_tsc_unstable, if TSC sync fails.
This change still gives perf benefit mentioned in 83ce4009 for systems
where TSC is reliable.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20091217202702.GA18015@linux-os.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 485a2e1973 upstream.
Add check if APIC is not disabled since thermal
monitoring depends on it. As only apic gets disabled
we should not try to install "thermal monitor" vector,
print out that thermal monitoring is enabled and etc...
Note that "Intel Correct Machine Check Interrupts" already
has such a check.
Also I decided to not add cpu_has_apic check into
mcheck_intel_therm_init since even if it'll call apic_read on
disabled apic -- it's safe here and allow us to save a few code
bytes.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: <4B25FDC2.3020401@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Upstream commit a2202aa292.
On platforms where bios handles the thermal monitor interrupt,
APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS
can't touch it.
Unfortunately AP bringup sequence using INIT-SIPI-SIPI clear all
the LVT entries except the mask bit. Essentially this results in
all LVT entries including the thermal monitoring interrupt set to masked
(clearing the bios programmed value for APIC_LVTTHMR).
And this leads to kernel take over the thermal monitoring interrupt
on AP's but not on BSP (leaving the bios programmed value only on BSP).
As a result of this, we have seen system hangs when the thermal
monitoring interrupt is generated.
Fix this by reading the initial value of thermal LVT entry on BSP
and if bios has taken over the control, then program the same value
on all AP's and leave the thermal monitoring interrupt control
on all the logical cpu's to the bios.
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit bc09effabf upstream.
mce_timer must be passed to setup_timer() in all cases, no
matter whether it is going to be actually used. Otherwise, when
the CPU gets brought down, its call to del_timer_sync() will
never return, as the timer won't have a base associated, and
hence lock_timer_base() will loop infinitely.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: <4B1DB831.2030801@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 7d1849aff6 upstream.
The x86 lapic nmi watchdog does not recognize AMD Family 11h,
resulting in:
NMI watchdog: CPU not supported
As far as I can see from available documentation (the BKDM),
family 11h looks identical to family 10h as far as the PMU
is concerned.
Extending the check to accept family 11h results in:
Testing NMI watchdog ... OK.
I've been running with this change on a Turion X2 Ultra ZM-82
laptop for a couple of weeks now without problems.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <19223.53436.931768.278021@pilspetsen.it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Removing the SMT/HT check, since the Errata doesn't mention
Hyper-Threading.
Adding in a printk, so that the user knows why acpi-cpufreq refuses to
load. Also, once system is blacklisted, don't repeat checks to see if
blacklisted. This also causes the message to only be printed once,
rather than for each CPU.
Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
There is a typo in the longhaul detection code so only Longhaul v1 or Longhaul v3
is selected. The Longhaul v2 is not selected even for CPUs which are capable of.
Tested on PCChips Giga Pro board. Frequency changes work and the Longhaul v2
detects that the board is not capable of changing CPU voltage.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
The MCE initialization code explicitly says it doesn't handle
asymmetric configurations where different CPUs support different
numbers of MCE banks, and it prints a big warning in that case.
Therefore, printing the "mce: CPU supports <x> MCE banks"
message into the kernel log for every CPU is pure redundancy
that clutters the log significantly for systems with lots of
CPUs.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
LKML-Reference: <adaeip473qt.fsf@cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
The current bound checks for copy_from_user in the MTRR driver are
not as obvious as they could be, and gcc agrees with that.
This patch simplifies the boundary checks to the point that gcc can
now prove to itself that the copy_from_user() is never going past
its bounds.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20090926205150.30797709@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.
While looking at the interaction of decode_mce() with the other MCE
code i also noticed a few other things and made the following
cleanups/fixes:
- Fixed the mce_decode() weak alias - a weak alias is really not
good here, it should be a proper callback. A weak alias will be
overriden if a piece of code is built into the kernel - not
good, obviously.
- The patch initializes the callback on AMD family 10h and 11h.
- Added the more correct fallback printk of:
No support for human readable MCE decoding on this CPU type.
Transcribe the message and run it through 'mcelog --ascii' to decode.
On CPUs that dont have a decoder.
- Made the surrounding code more readable.
Note that the callback allows us to have a default fallback -
without having to check the CPU versions during the printout
itself. When an EDAC module registers itself, it can install the
decode-print function.
(there's no unregister needed as this is core code.)
version -v2 by Borislav Petkov:
- add K8 to the set of supported CPUs
- always build in edac_mce_amd since we use an early_initcall now
- fix checkpatch warnings
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091001141432.GA11410@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 22223c9b41, as
requested by Andi Kleen:
"Obviously kernels compiled with AMD support can still run on non AMD
systems, so messages like this can never be removed at compile time."
Requsted-by: Andi Kleen <andi@firstfloor.org>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use rdmsrl_safe() when accessing MCE registers. While in
theory we always 'know' which ones are safe to access from
the capability bits, there's a lot of hardware variations
and reality might differ from theory, as it did in this case:
http://bugzilla.kernel.org/show_bug.cgi?id=14204
[ 0.010016] mce: CPU supports 5 MCE banks
[ 0.011029] general protection fault: 0000 [#1]
[ 0.011998] last sysfs file:
[ 0.011998] Modules linked in:
[ 0.011998]
[ 0.011998] Pid: 0, comm: swapper Not tainted (2.6.31_router #1) HP Vectra
[ 0.011998] EIP: 0060:[<c100d9b9>] EFLAGS: 00010246 CPU: 0
[ 0.011998] EIP is at mce_rdmsrl+0x19/0x60
[ 0.011998] EAX: 00000000 EBX: 00000001 ECX: 00000407 EDX: 08000000
[ 0.011998] ESI: 00000000 EDI: 8c000000 EBP: 00000405 ESP: c17d5eac
So WARN_ONCE() instead of crashing the box.
( also fix a number of stylistic inconsistencies in the code. )
Note, we might still crash in wrmsrl() if we get that far, but
we shouldnt if the registers are truly inaccessible.
Reported-by: GNUtoo <GNUtoo@no-log.org>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <bug-14204-5438@http.bugzilla.kernel.org/>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Current raise_local() uses a struct mce that comes from mce_write()
as a parameter instead of the real inject-msg, so when we set
mce.finished = 0 to clear injected MCE, the real inject stays
valid.
This will cause the remaining inject-msg affect the next injection,
which is not desired.
To fix this, real inject-msg is used in raise_local instead of the
one on the stack.
This patch is based on the diagnosis and the fixes by Dean Nelson.
Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <1253601357.15717.757.camel@yhuang-dev.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If a system switches back and forth between hot and cold mode,
the MCE code will print a stream of critical kernel messages.
Extend the throttling code to properly notice this, by
only printing the first hot + cold transition and omitting
the rest up to CHECK_INTERVAL (5 minutes).
This way we'll only get a single incident of:
[ 102.356584] CPU0: Temperature above threshold, cpu clock throttled (total events = 1)
[ 102.357000] Disabling lock debugging due to kernel taint
[ 102.369223] CPU0: Temperature/speed normal
Every 5 minutes. The 'total events' count tells the number of cold/hot
transitions detected, should overheating occur after 5 minutes again:
[ 402.357580] CPU0: Temperature above threshold, cpu clock throttled (total events = 24891)
[ 402.358001] CPU0: Temperature/speed normal
[ 450.704142] Machine check events logged
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of a mess of three separate percpu variables, consolidate
the state into a single structure.
Also clean up therm_throt_process(), use cleaner and more
understandable variable names and a clearer logic.
This, without changing the logic, makes the code more
streamlined, more readable and smaller as well:
text data bss dec hex filename
1487 169 4 1660 67c therm_throt.o.before
1432 176 4 1612 64c therm_throt.o.after
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
trivial: fix typo in aic7xxx comment
trivial: fix comment typo in drivers/ata/pata_hpt37x.c
trivial: typo in kernel-parameters.txt
trivial: fix typo in tracing documentation
trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
trivial: remove unnecessary semicolons
trivial: Fix duplicated word "options" in comment
trivial: kbuild: remove extraneous blank line after declaration of usage()
trivial: improve help text for mm debug config options
trivial: doc: hpfall: accept disk device to unload as argument
trivial: doc: hpfall: reduce risk that hpfall can do harm
trivial: SubmittingPatches: Fix reference to renumbered step
trivial: fix typos "man[ae]g?ment" -> "management"
trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
trivial: fix missing printk space in amd_k7_smp_check
trivial: fix typo s/ketymap/keymap/ in comment
trivial: fix typo "to to" in multiple files
trivial: fix typos in comments s/DGBU/DBGU/
...
* 'perfcounters-rename-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Tidy up after the big rename
perf: Do the big rename: Performance Counters -> Performance Events
perf_counter: Rename 'event' to event_id/hw_event
perf_counter: Rename list_entry -> group_entry, counter_list -> group_list
Manually resolved some fairly trivial conflicts with the tracing tree in
include/trace/ftrace.h and kernel/trace/trace_syscalls.c.
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter, powerpc, sparc: Fix compilation after perf_counter_overflow() change
perf_counter: x86: Fix PMU resource leak
perf util: SVG performance improvements
perf util: Make the timechart SVG width dynamic
perf timechart: Show the duration of scheduler delays in the SVG
perf timechart: Show the name of the waker/wakee in timechart
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Print the hypervisor returned tsc_khz during boot
x86: Correct segment permission flags in 64-bit linker script
x86: cpuinit-annotate SMP boot trampolines properly
x86: Increase timeout for EHCI debug port reset completion in early printk
x86: Fix uaccess_32.h typo
x86: Trivial whitespace cleanups
x86, apic: Fix missed handling of discrete apics
x86/i386: Remove duplicated #include
x86, mtrr: Convert loop to a while based construct, avoid naked semicolon
Revert 'x86: Fix system crash when loading with "reservetop" parameter'
x86, mce: Fix compile warning in case of CONFIG_SMP=n
x86, apic: Use logical flat on intel with <= 8 logical cpus
x86: SGI UV: Map MMIO-High memory range
x86: SGI UV: Add volatile semantics to macros that access chipset registers
x86: SGI UV: Fix IPI macros
x86: apic: Convert BUG() to BUG_ON()
x86: Remove final bits of CONFIG_X86_OLD_MCE
This trivial patch fixes one missing space in printk.
I already fixed it about half a year ago or more, but the change (in
arch/x86/kernel/cpu/smpboot.c at that time) didn't made into
mainline yet.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
index 28e5f59..6c139ed 100644
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
- provide compatibility Kconfig entry for existing PERF_COUNTERS .config's
- provide courtesy copy of old perf_counter.h, for user-space projects
- small indentation fixups
- fix up MAINTAINERS
- fix small x86 printout fallout
- fix up small PowerPC comment fallout (use 'counter' as in register)
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES
for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done
FILES=$(find . -name perf_event.*)
sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (58 commits)
perf_counter: Fix perf_copy_attr() pointer arithmetic
perf utils: Use a define for the maximum length of a trace event
perf: Add timechart help text and add timechart to "perf help"
tracing, x86, cpuidle: Move the end point of a C state in the power tracer
perf utils: Be consistent about minimum text size in the svghelper
perf timechart: Add "perf timechart record"
perf: Add the timechart tool
perf: Add a SVG helper library file
tracing, perf: Convert the power tracer into an event tracer
perf: Add a sample_event type to the event_union
perf: Allow perf utilities to have "callback" options without arguments
perf: Store trace event name/id pairs in perf.data
perf: Add a timestamp to fork events
sched_clock: Make it NMI safe
perf_counter: Fix up swcounter throttling
x86, perf_counter, bts: Optimize BTS overflow handling
perf sched: Add --input=file option to builtin-sched.c
perf trace: Sample timestamp and cpu when using record flag
perf tools: Increase MAX_EVENT_LENGTH
perf tools: Fix memory leak in read_ftrace_printk()
...
On an AMD-64 system the processor frequency that is printed during
system boot, may be different than the tsc frequency that was
returned by the hypervisor, due to the value returned from
calibrate_cpu.
For debugging timekeeping or other related issues it might be
better to get the tsc_khz value returned by the hypervisor.
The patch below now prints the tsc frequency that the VMware
hypervisor returned.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
LKML-Reference: <1252095219.12518.13.camel@ank32.eng.vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>