[ Upstream commit 49c0ae80eb ]
Stephane reported that we don't set properly PERIOD sample type for
events with period term defined.
Before:
$ perf record -e cpu/cpu-cycles,period=1000/u ls
$ perf evlist -v
cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME|PERIOD, ...
After:
$ perf record -e cpu/cpu-cycles,period=1000/u ls
$ perf evlist -v
cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME, ...
Setting PERIOD sample type based on period term setup.
Committer note:
When we use -c or a period=N term in the event definition, then we don't
need to ask the kernel, for this event, via perf_event_attr.sample_type
|= PERF_SAMPLE_PERIOD, to put the event period in each sample for this
event, as we know it already, it is in perf_event_attr.sample_period.
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180201083812.11359-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 249d98e567 ]
When setting the "dwarf" unwinder for a specific event and not
specifying the max-stack, the attr.sample_max_stack ended up using an
uninitialized callchain_param.max_stack, fix it by using designated
initializers for that callchain_param variable, zeroing all non
explicitely initialized struct members.
Here is what happened:
# perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
callchain: type DWARF
callchain: stack dump size 8192
perf_event_attr:
type 2
size 112
config 0x730
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
exclude_callchain_user 1
{ wakeup_events, wakeup_watermark } 1
sample_regs_user 0xff0fff
sample_stack_user 8192
sample_max_stack 50656
sys_perf_event_open failed, error -75
Value too large for defined data type
# perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
callchain: type DWARF
callchain: stack dump size 8192
perf_event_attr:
type 2
size 112
config 0x730
sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
exclude_callchain_user 1
sample_regs_user 0xff0fff
sample_stack_user 8192
sample_max_stack 30448
sys_perf_event_open failed, error -75
Value too large for defined data type
#
Now the attr.sample_max_stack is set to zero and the above works as
expected:
# perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
0.000 probe_libc:inet_pton:(7feb7a998350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa39b6108f3f] (/usr/bin/ping)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm9iz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eabad8c685 ]
When setting up DWARF callchains on specific events, without using
'record' or 'trace' --call-graph, but instead doing it like:
perf trace -e cycles/call-graph=dwarf/
The unwind__prepare_access() call in thread__insert_map() when we
process PERF_RECORD_MMAP(2) metadata events were not being performed,
precluding us from using per-event DWARF callchains, handling them just
when we asked for all events to be DWARF, using "--call-graph dwarf".
We do it in the PERF_RECORD_MMAP because we have to look at one of the
executable maps to figure out the executable type (64-bit, 32-bit) of
the DSO laid out in that mmap. Also to look at the architecture where
the perf.data file was recorded.
All this probably should be deferred to when we process a sample for
some thread that has callchains, so that we do this processing only for
the threads with samples, not for all of them.
For now, fix using DWARF on specific events.
Before:
# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms
0.000 probe_libc:inet_pton:(7fe9597bb350))
Problem processing probe_libc:inet_pton callchain, skipping...
#
After:
# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms
0.000 probe_libc:inet_pton:(7fd4aa930350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa804e51af3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa804e51b379] (/usr/bin/ping)
#
# perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms
0.000 probe_libc:inet_pton:(7f9363b9e350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffa9e8a14e0f3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffa9e8a14e1379] (/usr/bin/ping)
#
# perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms
0.000 probe_libc:inet_pton:(7f4947e1c350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa716d88ef3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa716d88f379] (/usr/bin/ping)
#
# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms
0.000 probe_libc:inet_pton:(7fa157696350))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
getaddrinfo (/usr/lib64/libc-2.26.so)
[0xffffa9ba39c74f40] (/usr/bin/ping)
#
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91d29b288a upstream.
timestamp_insn_cnt is used to estimate the timestamp based on the number of
instructions since the last known timestamp.
If the estimate is not accurate enough decoding might not be correctly
synchronized with side-band events causing more trace errors.
However there are always timestamps following an overflow, so the
estimate is not needed and can indeed result in more errors.
Suppress the estimate by setting timestamp_insn_cnt to zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63d8e38f6a upstream.
sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.
The flag when sync_switch is enabled was global to the decoding, whereas
it is really specific to the CPU.
The trace data for different CPUs is put on different queues, so add
sync_switch to the intel_pt_queue structure and use that in preference
to the global setting in the intel_pt structure.
That fixes problems decoding one CPU's trace because sync_switch was
disabled on a different CPU's queue.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ca8000684e ]
While monitoring a multithread process with pid option, perf sometimes
may return sys_perf_event_open failure with 3(No such process) if any of
the process's threads die before we open the event. However, we want
perf continue monitoring the remaining threads and do not exit with
error.
Here, the patch enables perf_evsel::ignore_missing_thread for -p option
to ignore complete failure if any of threads die before we open the event.
But it may still return sys_perf_event_open failure with 22(Invalid) if we
monitors several event groups.
sys_perf_event_open: pid 28960 cpu 40 group_fd 118202 flags 0x8
sys_perf_event_open: pid 28961 cpu 40 group_fd 118203 flags 0x8
WARNING: Ignored open failure for pid 28962
sys_perf_event_open: pid 28962 cpu 40 group_fd [118203] flags 0x8
sys_perf_event_open failed, error -22
That is because when we ignore a missing thread, we change the thread_idx
without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
may return a wrong group_fd for the next thread and sys_perf_event_open()
return with 22.
sys_perf_event_open(){
...
if (group_fd != -1)
perf_fget_light()//to get corresponding group_leader by group_fd
...
if (group_leader)
if (group_leader->ctx->task != ctx->task)//should on the same task
goto err_context
...
}
This patch also fixes this bug by introducing perf_evsel__remove_fd() and
update_fds to allow removing fds for the missing thread.
Changes since v1:
- Change group_fd__remove() into a more genetic way without changing code logic
- Remove redundant condition
Changes since v2:
- Use a proper function name and add some comment.
- Multiline comment style fixes.
Committer testing:
Before this patch the recently added 'perf stat --per-thread' for system
wide counting would race while enumerating all threads using /proc:
[root@jouet ~]# perf stat --per-thread
failed to parse CPUs map: No such file or directory
Usage: perf stat [<options>] [<command>]
-C, --cpu <cpu> list of cpus to monitor in system-wide
-a, --all-cpus system-wide collection from all CPUs
[root@jouet ~]# perf stat --per-thread
failed to parse CPUs map: No such file or directory
Usage: perf stat [<options>] [<command>]
-C, --cpu <cpu> list of cpus to monitor in system-wide
-a, --all-cpus system-wide collection from all CPUs
[root@jouet ~]#
When, say, the kernel was being built, so lots of shortlived threads,
after this patch this doesn't happen.
Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Cheng Jian <cj.chengjian@huawei.com>
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
[ Remove one use 'evlist' alias variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b3a2716dd ]
Commit d80406453a ("perf symbols: Allow user probes on versioned
symbols") allows user to find default versioned symbols (with "@@") in
map. However, it did not enable normal versioned symbol (with "@") for
perf-probe. E.g.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc_get_state
Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so
Error: Failed to add events.
=====
This solves above issue by improving perf-probe symbol search function,
as below.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc_get_state
Added new event:
probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc_get_state -aR sleep 1
# ./perf probe -l
probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
=====
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 35a8a148d8 ]
The command 'perf annotate' parses the output of objdump and also
investigates the comments produced by objdump. For example the
output of objdump produces (on x86):
23eee: 4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15
# 234008 <stderr@@GLIBC_2.2.5+0x9a8>
and the function mov__parse() is called to investigate the complete
line. Mov__parse() breaks this line into several parts and finally
calls function comment__symbol() to parse the data after the comment
character '#'. Comment__symbol() expects a hexadecimal address followed
by a symbol in '<' and '>' brackets.
However the 2nd parameter given to function comment__symbol()
always points to the comment character '#'. The address parsing
always returns 0 because the character '#' is not a digit and
strtoull() fails without being noticed.
Fix this by advancing the second parameter to function comment__symbol()
by one byte before invocation and add an error check after strtoull()
has been called.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: 6de783b6f5 ("perf annotate: Resolve symbols using objdump comment")
Link: http://lkml.kernel.org/r/20171128075632.72182-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de19e5c3c5 upstream.
trigger_on() means that the trigger is available but not ready, however
trigger_on() was making it ready. That can segfault if the signal comes
before trigger_ready(). e.g. (USR2 signal delivery not shown)
$ perf record -e intel_pt//u -S sleep 1
perf: Segmentation fault
Obtained 16 stack frames.
/home/ahunter/bin/perf(sighandler_dump_stack+0x40) [0x4ec550]
/lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
/home/ahunter/bin/perf(perf_evsel__disable+0x26) [0x4b9dd6]
/home/ahunter/bin/perf() [0x43a45b]
/lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
/lib/x86_64-linux-gnu/libc.so.6(__xstat64+0x15) [0x7fa7641d2cc5]
/home/ahunter/bin/perf() [0x4ec6c9]
/home/ahunter/bin/perf() [0x4ec73b]
/home/ahunter/bin/perf() [0x4ec73b]
/home/ahunter/bin/perf() [0x4ec73b]
/home/ahunter/bin/perf() [0x4eca15]
/home/ahunter/bin/perf(machine__create_kernel_maps+0x257) [0x4f0b77]
/home/ahunter/bin/perf(perf_session__new+0xc0) [0x4f86f0]
/home/ahunter/bin/perf(cmd_record+0x722) [0x43c132]
/home/ahunter/bin/perf() [0x4a11ae]
/home/ahunter/bin/perf(main+0x5d4) [0x427fb4]
Note, for testing purposes, this is hard to hit unless you add some sleep()
in builtin-record.c before record__open().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Fixes: 3dcc4436fa ("perf tools: Introduce trigger class")
Link: http://lkml.kernel.org/r/1519807144-30694-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 59622fd496 ]
The Intel PMU event aliases have a implicit period= specifier to set the
default period.
Unfortunately this breaks overriding these periods with -c or -F,
because the alias terms look like they are user specified to the
internal parser, and user specified event qualifiers override the
command line options.
Track that they are coming from aliases by adding a "weak" state to the
term. Any weak terms don't override command line options.
I only did it for -c/-F for now, I think that's the only case that's
broken currently.
Before:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 2000003
After:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 1000
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171020202755.21410-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Looks like I've reached the new level of stupidity, adding missing braces.
Committer testing:
Given the following eBPF C filter, that will add a record when it
returns true, i.e. when the tv_nsec variable is > 2000ns, should be
built and installed via sys_bpf(), but fails to do so before this patch:
# cat filter.c
#include <uapi/linux/bpf.h>
#define SEC(NAME) __attribute__((section(NAME), used))
SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
int func(void *ctx, int err, long nsec)
{
return nsec > 1000;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
# perf trace -e nanosleep,filter.c usleep 1
invalid or unsupported event: 'filter.c'
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
#
And works again after it is applied, the nothing is inserted when the co
# perf trace -e *sleep,filter.c usleep 1
0.000 ( 0.066 ms): usleep/23994 nanosleep(rqtp: 0x7ffead94a0d0) = 0
# perf trace -e *sleep,filter.c usleep 2
0.000 ( 0.008 ms): usleep/24378 nanosleep(rqtp: 0x7fffa021ba50) ...
0.008 ( ): perf_bpf_probe:func:(ffffffffb410cb30) tv_nsec=2000)
0.000 ( 0.066 ms): usleep/24378 ... [continued]: nanosleep()) = 0
#
The intent of 9445464bb8 is kept:
# perf stat -e 'cpu/uops_executed.core,krava/' true
event syntax error: '..cuted.core,krava/'
\___ unknown term
valid terms: cmask,pc,event,edge,in_tx,any,ldlat,inv,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
#
# perf stat -e 'cpu/uops_executed.core,period=1/' true
Performance counter stats for 'true':
808,332 cpu/uops_executed.core,period=1/
0.002997237 seconds time elapsed
#
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb8 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-diea0ihbwpxfw6938huv3whj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo reported broken builds in some distros using a newer flex
release, 2.6.4, found in Alpine Linux 3.6 and Edge, with flex not
spotting the REJECT macro:
CC /tmp/build/perf/util/parse-events-flex.o
util/parse-events.l: In function 'parse_events_lex':
/tmp/build/perf/util/parse-events-flex.c:4734:16: error: \
'reject_used_but_not_detected' undeclared (first use in this function)
It's happening because we put the REJECT under another USER_REJECT macro
in following commit:
9445464bb8 perf tools: Unwind properly location after REJECT
Fortunately flex provides option for force it to use REJECT, adding it
to parse-events.l.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb8 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-7kdont984mw12ijk7rji6b8p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have defined YY_USER_ACTION to keep trace of the column location
during events parsing, but we need to clean it up when we call REJECT.
When REJECT is called, the lexer shrinks the text and re-runs the
matching, so we need to address it in resuming the previous location
value to keep it correct for error display, like:
Before:
$ perf stat -e 'cpu/uops_executed.core,krava/' true
event syntax error: '..38;5;9:mi=01;05;37;41:su=48;5;196;38;5;15:sg=48;5;1\
1;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;\
21;38;50
�'
\___ unknown term
After:
$ ./perf stat -e 'cpu/uops_executed.core,krava/' true
event syntax error: '..cuted.core,krava/'
\___ unknown term
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vug2hchlny30jfsfrumbym26@git.kernel.org
Link: http://lkml.kernel.org/r/20171009140944.GD28623@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf top is often crashing at very random locations on powerpc. After
investigating, I found the crash only happens when sample is of zero
length symbol. Powerpc kernel has many such symbols which does not
contain length details in vmlinux binary and thus start and end
addresses of such symbols are same.
Structure
struct sym_hist {
u64 nr_samples;
u64 period;
struct sym_hist_entry addr[0];
};
has last member 'addr[]' of size zero. 'addr[]' is an array of addresses
that belongs to one symbol (function). If function consist of 100
instructions, 'addr' points to an array of 100 'struct sym_hist_entry'
elements. For zero length symbol, it points to the *empty* array, i.e.
no members in the array and thus offset 0 is also invalid for such
array.
static int __symbol__inc_addr_samples(...)
{
...
offset = addr - sym->start;
h = annotation__histogram(notes, evidx);
h->nr_samples++;
h->addr[offset].nr_samples++;
h->period += sample->period;
h->addr[offset].period += sample->period;
...
}
Here, when 'addr' is same as 'sym->start', 'offset' becomes 0, which is
valid for normal symbols but *invalid* for zero length symbols and thus
updating h->addr[offset] causes memory corruption.
Fix this by adding one dummy element for zero length symbols.
Link: https://lkml.org/lkml/2016/10/10/148
Fixes: edee44be59 ("perf annotate: Don't throw error for zero length symbols")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1508854806-10542-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In current xyarray code, xyarray__max_x() returns max_y, and xyarray__max_y()
returns max_x.
It's confusing and for code logic it looks not correct.
Error happens when closing evsel fd. Let's see this scenario:
1. Allocate an fd (pseudo-code)
perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
{
evsel->fd = xyarray__new(ncpus, nthreads, sizeof(int));
}
xyarray__new(int xlen, int ylen, size_t entry_size)
{
size_t row_size = ylen * entry_size;
struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
xy->entry_size = entry_size;
xy->row_size = row_size;
xy->entries = xlen * ylen;
xy->max_x = xlen;
xy->max_y = ylen;
......
}
So max_x is ncpus, max_y is nthreads and row_size = nthreads * 4.
2. Use perf syscall and get the fd
int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
for (cpu = 0; cpu < cpus->nr; cpu++) {
for (thread = 0; thread < nthreads; thread++) {
int fd, group_fd;
fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu],
group_fd, flags);
FD(evsel, cpu, thread) = fd;
}
}
static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
{
return &xy->contents[x * xy->row_size + y * xy->entry_size];
}
These codes don't have issues. The issue happens in the closing of fd.
3. Close fd.
void perf_evsel__close_fd(struct perf_evsel *evsel)
{
int cpu, thread;
for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++)
for (thread = 0; thread < xyarray__max_y(evsel->fd); ++thread) {
close(FD(evsel, cpu, thread));
FD(evsel, cpu, thread) = -1;
}
}
Since xyarray__max_x() returns max_y (nthreads) and xyarry__max_y()
returns max_x (ncpus), so above code is actually to be:
for (cpu = 0; cpu < nthreads; cpu++)
for (thread = 0; thread < ncpus; ++thread) {
close(FD(evsel, cpu, thread));
FD(evsel, cpu, thread) = -1;
}
It's not correct!
This change is introduced by "475fb533fb7d" ("perf evsel: Fix buffer overflow
while freeing events")
This fix is to let xyarray__max_x() return max_x (ncpus) and
let xyarry__max_y() return max_y (nthreads)
Committer note:
This was also fixed by Ravi Bangoria, who provided the same patch,
noticing the problem with 'perf record':
<quote Ravi>
I see 'perf record -p <pid>' crashes with following log:
*** Error in `./perf': free(): invalid next size (normal): 0x000000000298b340 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7fd85c87e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f7fd85d137a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f7fd85d553c]
./perf(perf_evsel__close+0xb4)[0x4b7614]
./perf(perf_evlist__delete+0x100)[0x4ab180]
./perf(cmd_record+0x1d9)[0x43a5a9]
./perf[0x49aa2f]
./perf(main+0x631)[0x427841]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7fd8571830]
./perf(_start+0x29)[0x427a59]
</>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Fixes: d74be47673 ("perf xyarray: Save max_x, max_y")
Link: http://lkml.kernel.org/r/1508339478-26674-1-git-send-email-yao.jin@linux.intel.com
Link: http://lkml.kernel.org/r/1508327446-15302-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL
pointer deref when he ran it on a data with namespace events. It was
because the buildid_id__mark_dso_hit_ops lacks the namespace event
handler and perf_too__fill_default() didn't set it.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x
+elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x
+python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x
(gdb) where
#0 0x0000000000000000 in ?? ()
#1 0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18,
evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888,
tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287
#2 0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>,
sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340
#3 perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470,
file_offset=file_offset@entry=1136) at util/session.c:1522
#4 0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>,
data_offset=<optimized out>, session=0x0) at util/session.c:1899
#5 perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953
#6 0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>)
at builtin-buildid-list.c:83
#7 cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115
#8 0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0)
at perf.c:296
#9 0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348
#10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392
#11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536
(gdb)
Fix it by adding a stub event handler for namespace event.
Committer testing:
Further clarifying, plain using 'perf buildid-list' will not end up in a
SEGFAULT when processing a perf.data file with namespace info:
# perf record -a --namespaces sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ]
# perf buildid-list | wc -l
38
# perf buildid-list | head -5
e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux
874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko
ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko
5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko
f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so
#
It is only when one asks for checking what of those entries actually had
samples, i.e. when we use either -H or --with-hits, that we will process
all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c
neither explicitely set a perf_tool.namespaces() callback nor the
default stub was set that we end up, when processing a
PERF_RECORD_NAMESPACE record, causing a SEGFAULT:
# perf buildid-list -H
Segmentation fault (core dumped)
^C
#
Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Fixes: f3b3614a28 ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: http://lkml.kernel.org/r/20171017132900.11043-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the check wether the eBPF file exists, to consider it
as eBPF input file. This way we can differentiate eBPF events
from events that end up with same suffix as eBPF file.
Before:
$ perf stat -e 'cpu/uops_executed.core/' true
bpf: builtin compilation failed: -95, try external compiler
WARNING: unable to get correct kernel building directory.
Hint: Set correct kbuild directory using 'kbuild-dir' option in [llvm]
section of ~/.perfconfig or set it to "" to suppress kbuild
detection.
event syntax error: 'cpu/uops_executed.core/'
\___ Failed to load cpu/uops_executed.c from source: 'version' section incorrect or lost
After:
$ perf stat -e 'cpu/uops_executed.core/' true
Performance counter stats for 'true':
181,533 cpu/uops_executed.core/:u
0.002795447 seconds time elapsed
If user makes type in the eBPF file, we prioritize the event syntax
and show following warning:
$ perf stat -e 'krava.c//' true
event syntax error: 'krava.c//'
\___ Cannot find PMU `krava.c'. Missing kernel support?
Reported-and-Tested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20171013083736.15037-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, perf record is broken on arm/arm64 systems when the PMU is
specified explicitly as part of the event, e.g.
$ ./perf record -e armv8_cortex_a53/cpu_cycles/u true
In such cases, perf record fails to open events unless
perf_event_paranoid is set to -1, even if the PMU in question supports
mode exclusion. Further, even when perf_event_paranoid is toggled, no
samples are recorded.
This is an unintended side effect of commit:
e3ba76deef ("perf tools: Force uncore events to system wide monitoring)
... which assumes that if a PMU has an associated cpu_map, it is an
uncore PMU, and forces events for such PMUs to be system-wide.
This is not true for arm/arm64 systems, which can have heterogeneous
CPUs. To account for this, multiple CPU PMUs are exposed, each with a
"cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM
PMUs do not have a "cpumask" file, and only have a "cpus" file. For the
gory details as to why, see commit:
7e3fcffe95 ("perf pmu: Support alternative sysfs cpumask")
Given all of this, we can instead identify uncore PMUs by explicitly
checking for a "cpumask" file, and restore arm/arm64 PMU support back to
a working state. This patch does so, adding a new perf_pmu::is_uncore
field, and splitting the existing cpumask parsing so that it can be
reused.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by Will Deacon <will.deacon@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: 4.12+ <stable@vger.kernel.org>
Fixes: e3ba76deef ("perf tools: Force uncore events to system wide monitoring)
Link: http://lkml.kernel.org/r/1507315102-5942-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The build of kernel v4.14-rc1 for i686 fails on RHEL 6 with the error
in tools/perf:
util/syscalltbl.c:157: error: expected ';', ',' or ')' before '__maybe_unused'
mv: cannot stat `util/.syscalltbl.o.tmp': No such file or directory
Fix it by placing/moving:
#include <linux/compiler.h>
outside of #ifdef HAVE_SYSCALL_TABLE block.
Signed-off-by: Akemi Yagi <toracat@elrepo.org>
Cc: Alan Bartlett <ajb@elrepo.org>
Link: http://lkml.kernel.org/r/oq41r8$1v9$1@blaine.gmane.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With --call-graph option, perf report can display call chains using
type, min percent threshold, optional print limit and order. And the
default call-graph parameter is 'graph,0.5,caller,function,percent'.
Before this patch, 'perf report --call-graph' shows incorrect debug
messages as below:
# perf report --call-graph
Invalid callchain mode: 0.5
Invalid callchain order: 0.5
Invalid callchain sort key: 0.5
Invalid callchain config key: 0.5
Invalid callchain mode: caller
Invalid callchain mode: function
Invalid callchain order: function
Invalid callchain mode: percent
Invalid callchain order: percent
Invalid callchain sort key: percent
That is because in function __parse_callchain_report_opt(),each field of
the call-graph parameter is passed to parse_callchain_{mode,order,
sort_key,value} in turn until it meets the matching value.
For example, the order field "caller" is passed to
parse_callchain_mode() firstly and obviously it doesn't match any mode
field. Therefore parse_callchain_mode() will shows the debug message
"Invalid callchain mode: caller", which could confuse users.
The patch fixes this issue by moving the warning out of the function
parse_callchain_{mode,order,sort_key,value}.
Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/1506154694-39691-1-git-send-email-zhangmengting@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yet another fix for probing the max attr.precise_ip setting: it is not
enough settting attr.exclude_kernel for !root users, as they _can_
profile the kernel if the kernel.perf_event_paranoid sysctl is set to
-1, so check that as well.
Testing it:
As non root:
$ sysctl kernel.perf_event_paranoid
kernel.perf_event_paranoid = 2
$ perf record sleep 1
$ perf evlist -v
cycles:uppp: ..., exclude_kernel: 1, ... precise_ip: 3, ...
Now as non-root, but with kernel.perf_event_paranoid set set to the
most permissive value, -1:
$ sysctl kernel.perf_event_paranoid
kernel.perf_event_paranoid = -1
$ perf record sleep 1
$ perf evlist -v
cycles:ppp: ..., exclude_kernel: 0, ... precise_ip: 3, ...
$
I.e. non-root, default kernel.perf_event_paranoid: :uppp modifier = not allowed to sample the kernel,
non-root, most permissible kernel.perf_event_paranoid: :ppp = allowed to sample the kernel.
In both cases, use the highest available precision: attr.precise_ip = 3.
Reported-and-Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: d37a369790 ("perf evsel: Fix attr.exclude_kernel setting for default cycles:p")
Link: http://lkml.kernel.org/n/tip-nj2qkf75xsd6pw6hhjzfqqdx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Peter reported that when he explicitely asked for multiple events with
the same name on the command line it got coalesced into just one line,
i.e.:
# perf stat -e cycles -e cycles -e cycles usleep 1
Performance counter stats for 'usleep 1':
3,269,652 cycles
0.000884123 seconds time elapsed
#
And while there is the --no-merges option to disable that auto-merging,
this is a blunt change in behaviour for such explicit request, so change
the code so that this auto merging is done only when handling the multi
PMU aliases with the same name that introduced this coalescing,
restoring the previous behaviour for the explicit case:
# perf stat -e cycles -e cycles -e cycles usleep 1
Performance counter stats for 'usleep 1':
1,472,837 cycles
1,472,837 cycles
1,472,837 cycles
0.001764870 seconds time elapsed
#
Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 430daf2dc7 ("perf stat: Collapse identically named events")
Link: http://lkml.kernel.org/r/20170831184122.GK4831@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With two new methods, one to find the first match, returning its syscall
id and its index in whatever internal database it keeps the syscall
into, then one to find the next match, if any.
Implemented only on arches where we actually read the syscall table from
the kernel sources, i.e. x86-64 for now, all the others use the libaudit
method for which this returns -1, i.e. just stubs were added, with the
actual implementation using whatever libaudit functions for matching
that may be available.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i0sj4rxk1a63pfe9gl8z8irs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The branch history code has a loop detection function. With this, we can
get the number of iterations by calculating the removed loops.
While it would be nice for knowing the average cycles of iterations.
This patch adds up the cycles in branch entries of removed loops and
save the result to the next branch entry (e.g. branch entry A).
Finally it will display the iteration number and average cycles at the
"from" of branch entry A.
For example:
perf record -g -j any,save_type ./div
perf report --branch-history --no-children --stdio
--22.63%--main div.c:42 (RET CROSS_2M)
compute_flag div.c:28 (cycles:2 iter:173115 avg_cycles:2)
|
--10.73%--compute_flag div.c:27 (RET CROSS_2M)
rand rand.c:28 (cycles:1)
rand rand.c:28 (RET CROSS_2M)
__random random.c:298 (cycles:1)
__random random.c:297 (COND_BWD CROSS_2M)
__random random.c:295 (cycles:1)
__random random.c:295 (COND_BWD CROSS_2M)
__random random.c:295 (cycles:1)
__random random.c:295 (RET CROSS_2M)
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1502111115-18305-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On x86, the plt header size is as same as the plt entry size, and can be
identified from shdr's sh_entsize of the plt.
But we can't assume that the sh_entsize of the plt shdr is always the
plt entry size in all architecture, and the plt header size may be not
as same as the plt entry size in some architecure.
On ARM, the plt header size is 20 bytes and the plt entry size is 12
bytes (don't consider the FOUR_WORD_PLT case) that refer to the binutils
implementation. The plt section is as follows:
Disassembly of section .plt:
000004a0 <__cxa_finalize@plt-0x14>:
4a0: e52de004 push {lr} ; (str lr, [sp, #-4]!)
4a4: e59fe004 ldr lr, [pc, #4] ; 4b0 <_init+0x1c>
4a8: e08fe00e add lr, pc, lr
4ac: e5bef008 ldr pc, [lr, #8]!
4b0: 00008424 .word 0x00008424
000004b4 <__cxa_finalize@plt>:
4b4: e28fc600 add ip, pc, #0, 12
4b8: e28cca08 add ip, ip, #8, 20 ; 0x8000
4bc: e5bcf424 ldr pc, [ip, #1060]! ; 0x424
000004c0 <printf@plt>:
4c0: e28fc600 add ip, pc, #0, 12
4c4: e28cca08 add ip, ip, #8, 20 ; 0x8000
4c8: e5bcf41c ldr pc, [ip, #1052]! ; 0x41c
On AARCH64, the plt header size is 32 bytes and the plt entry size is 16
bytes. The plt section is as follows:
Disassembly of section .plt:
0000000000000560 <__cxa_finalize@plt-0x20>:
560: a9bf7bf0 stp x16, x30, [sp,#-16]!
564: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8>
568: f944be11 ldr x17, [x16,#2424]
56c: 9125e210 add x16, x16, #0x978
570: d61f0220 br x17
574: d503201f nop
578: d503201f nop
57c: d503201f nop
0000000000000580 <__cxa_finalize@plt>:
580: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8>
584: f944c211 ldr x17, [x16,#2432]
588: 91260210 add x16, x16, #0x980
58c: d61f0220 br x17
0000000000000590 <__gmon_start__@plt>:
590: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8>
594: f944c611 ldr x17, [x16,#2440]
598: 91262210 add x16, x16, #0x988
59c: d61f0220 br x17
NOTES:
In addition to ARM and AARCH64, other architectures, such as
s390/alpha/mips/parisc/poperpc/sh/sparc/xtensa also need to consider
this issue.
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: David Tolnay <dtolnay@gmail.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: zhangmengting@huawei.com
Link: http://lkml.kernel.org/r/1496622849-21877-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 9aaf5a5f47 ("perf probe: Check kprobes blacklist when
adding new events"), 'perf probe' supports checking the blacklist of the
fuctions which can not be probed. But the checking condition is wrong,
that the end_addr of the symbol which is the start_addr of the next
symbol can't be included.
Committer notes:
IOW make it match its kernel counterpart in kernel/kprobes.c:
bool within_kprobe_blacklist(unsigned long addr)
Each entry have as its end address not its end address, but the first
address _outside_ that symbol, which for related functions, is the first
address of the next symbol, like these from kernel/trace/trace_probe.c:
0xffffffffbd198df0-0xffffffffbd198e40 print_type_u8
0xffffffffbd198e40-0xffffffffbd198e90 print_type_u16
0xffffffffbd198e90-0xffffffffbd198ee0 print_type_u32
0xffffffffbd198ee0-0xffffffffbd198f30 print_type_u64
0xffffffffbd198f30-0xffffffffbd198f80 print_type_s8
0xffffffffbd198f80-0xffffffffbd198fd0 print_type_s16
0xffffffffbd198fd0-0xffffffffbd199020 print_type_s32
0xffffffffbd199020-0xffffffffbd199070 print_type_s64
0xffffffffbd199070-0xffffffffbd1990c0 print_type_x8
0xffffffffbd1990c0-0xffffffffbd199110 print_type_x16
0xffffffffbd199110-0xffffffffbd199160 print_type_x32
0xffffffffbd199160-0xffffffffbd1991b0 print_type_x64
But not always:
0xffffffffbd1997b0-0xffffffffbd1997c0 fetch_kernel_stack_address (kernel/trace/trace_probe.c)
0xffffffffbd1c57f0-0xffffffffbd1c58b0 __context_tracking_enter (kernel/context_tracking.c)
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: zhangmengting@huawei.com
Fixes: 9aaf5a5f47 ("perf probe: Check kprobes blacklist when adding new events")
Link: http://lkml.kernel.org/r/1504011443-7269-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding dump_read function to gather all the dump output of read
function. Adding output of enabled and running times and id if enabled
(3 new lines with '...' prefix below).
$ perf record -s ...
$ perf report -D
958358311769 0x91f8 [0x40]: PERF_RECORD_READ: 3339 3339 cycles:u 0
... time enabled : 958358313731
... time running : 958358313731
... id : 80
Committer note:
Do not use 'read' as a variable name as it breaks the build on older
systems, such as RHEL6:
CC /tmp/build/perf/util/session.o
cc1: warnings being treated as errors
util/session.c: In function 'dump_read':
util/session.c:1132: error: declaration of 'read' shadows a global declaration
/usr/include/bits/unistd.h:35: error: shadowed declaration is here
mv: cannot stat `/tmp/build/perf/util/.session.o.tmp': No such file or directory
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>