Commit Graph

8046 Commits

Author SHA1 Message Date
Thomas Richter
2d8d8d23c4 perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
[ Upstream commit 7a92453620 ]

On Intel test case trace+probe_libc_inet_pton.sh succeeds and the
output is:

[root@f27 perf]# ./perf trace --no-syscalls
                  -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms
     0.000 probe_libc:inet_pton:(7fa40ac618a0))
              __GI___inet_pton (/usr/lib64/libc-2.26.so)
              getaddrinfo (/usr/lib64/libc-2.26.so)
              main (/usr/bin/ping)

The kernel stack unwinder is used, it is specified implicitly
as call-graph=fp (frame pointer).

On s390x only dwarf is available for stack unwinding. It is also
done in user space. This requires different parameter setup
and result checking for s390x and Intel.

This patch adds separate perf trace setup and result checking
for Intel and s390x. On s390x specify this command line to
get a call-graph and handle the different call graph result
checking:

[root@s35lp76 perf]# ./perf trace --no-syscalls
	-e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.041 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.041/0.041/0.041/0.000 ms
     0.000 probe_libc:inet_pton:(3ffb9942060))
            __GI___inet_pton (/usr/lib64/libc-2.26.so)
            gaih_inet (inlined)
            __GI_getaddrinfo (inlined)
            main (/usr/bin/ping)
            __libc_start_main (/usr/lib64/libc-2.26.so)
            _start (/usr/bin/ping)
[root@s35lp76 perf]#

Before:
[root@s8360047 perf]# ./perf test -vv 58
58: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 26349
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
0.000 probe_libc:inet_pton:(3ff925c2060))
test child finished with -1
 ---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
[root@s8360047 perf]#

After:
[root@s35lp76 perf]# ./perf test -vv 57
57: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 38708
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.038 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.038/0.038/0.038/0.000 ms
0.000 probe_libc:inet_pton:(3ff87342060))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
gaih_inet (inlined)
__GI_getaddrinfo (inlined)
main (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/ping)
test child finished with 0
 ---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
[root@s35lp76 perf]#

On Intel the test case runs unchanged and succeeds.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180117083831.101001-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26 11:02:20 +02:00
Jiri Olsa
2f79b5e52d perf evsel: Fix period/freq terms setup
[ Upstream commit 49c0ae80eb ]

Stephane reported that we don't set properly PERIOD sample type for
events with period term defined.

Before:
  $ perf record -e cpu/cpu-cycles,period=1000/u ls
  $ perf evlist -v
  cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME|PERIOD, ...

After:
  $ perf record -e cpu/cpu-cycles,period=1000/u ls
  $ perf evlist -v
  cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME, ...

Setting PERIOD sample type based on period term setup.

Committer note:

When we use -c or a period=N term in the event definition, then we don't
need to ask the kernel, for this event, via perf_event_attr.sample_type
|= PERF_SAMPLE_PERIOD, to put the event period in each sample for this
event, as we know it already, it is in perf_event_attr.sample_period.

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180201083812.11359-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26 11:02:17 +02:00
Jiri Olsa
76e3ea2f95 perf record: Fix period option handling
[ Upstream commit f290aa1ffa ]

Stephan reported we don't unset PERIOD sample type when --no-period is
specified. Adding the unset check and reset PERIOD if --no-period is
specified.

Committer notes:

Check the sample_type, it shouldn't have PERF_SAMPLE_PERIOD there when
--no-period is used.

Before:

  # perf record --no-period sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ]
  # perf evlist -v
  cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  #

After:

[root@jouet ~]# perf record --no-period sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (17 samples) ]
[root@jouet ~]# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
[root@jouet ~]#

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180201083812.11359-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26 11:02:17 +02:00
Thomas Richter
77d17d0e89 perf record: Fix failed memory allocation for get_cpuid_str
[ Upstream commit 81fccd6ca5 ]

In x86 architecture dependend part function get_cpuid_str() mallocs a
128 byte buffer, but does not check if the memory allocation succeeded
or not.

When the memory allocation fails, function __get_cpuid() is called with
first parameter being a NULL pointer.  However this function references
its first parameter and operates on a NULL pointer which might cause
core dumps.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180117131611.34319-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26 11:02:06 +02:00
Arnaldo Carvalho de Melo
4e63115b6b perf callchain: Fix attr.sample_max_stack setting
[ Upstream commit 249d98e567 ]

When setting the "dwarf" unwinder for a specific event and not
specifying the max-stack, the attr.sample_max_stack ended up using an
uninitialized callchain_param.max_stack, fix it by using designated
initializers for that callchain_param variable, zeroing all non
explicitely initialized struct members.

Here is what happened:

  # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  callchain: type DWARF
  callchain: stack dump size 8192
  perf_event_attr:
    type                             2
    size                             112
    config                           0x730
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
    exclude_callchain_user           1
    { wakeup_events, wakeup_watermark } 1
    sample_regs_user                 0xff0fff
    sample_stack_user                8192
    sample_max_stack                 50656
  sys_perf_event_open failed, error -75
  Value too large for defined data type
  # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  callchain: type DWARF
  callchain: stack dump size 8192
  perf_event_attr:
    type                             2
    size                             112
    config                           0x730
    sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
    exclude_callchain_user           1
    sample_regs_user                 0xff0fff
    sample_stack_user                8192
    sample_max_stack                 30448
  sys_perf_event_open failed, error -75
  Value too large for defined data type
  #

Now the attr.sample_max_stack is set to zero and the above works as
expected:

  # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
       0.000 probe_libc:inet_pton:(7feb7a998350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa39b6108f3f] (/usr/bin/ping)
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm9iz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26 11:02:06 +02:00
Arnaldo Carvalho de Melo
0eda4d03ef perf unwind: Do not look just at the global callchain_param.record_mode
[ Upstream commit eabad8c685 ]

When setting up DWARF callchains on specific events, without using
'record' or 'trace' --call-graph, but instead doing it like:

	perf trace -e cycles/call-graph=dwarf/

The unwind__prepare_access() call in thread__insert_map() when we
process PERF_RECORD_MMAP(2) metadata events were not being performed,
precluding us from using per-event DWARF callchains, handling them just
when we asked for all events to be DWARF, using "--call-graph dwarf".

We do it in the PERF_RECORD_MMAP because we have to look at one of the
executable maps to figure out the executable type (64-bit, 32-bit) of
the DSO laid out in that mmap. Also to look at the architecture where
the perf.data file was recorded.

All this probably should be deferred to when we process a sample for
some thread that has callchains, so that we do this processing only for
the threads with samples, not for all of them.

For now, fix using DWARF on specific events.

Before:

  # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms
     0.000 probe_libc:inet_pton:(7fe9597bb350))
  Problem processing probe_libc:inet_pton callchain, skipping...
  #

After:

  # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms
       0.000 probe_libc:inet_pton:(7fd4aa930350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa804e51af3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa804e51b379] (/usr/bin/ping)
  #
  # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms
       0.000 probe_libc:inet_pton:(7f9363b9e350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffa9e8a14e0f3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffa9e8a14e1379] (/usr/bin/ping)
  #
  # perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms
       0.000 probe_libc:inet_pton:(7f4947e1c350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa716d88ef3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa716d88f379] (/usr/bin/ping)
  #
  # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms
       0.000 probe_libc:inet_pton:(7fa157696350))
                                         __GI___inet_pton (/usr/lib64/libc-2.26.so)
                                         getaddrinfo (/usr/lib64/libc-2.26.so)
                                         [0xffffa9ba39c74f40] (/usr/bin/ping)
  #

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26 11:02:06 +02:00
Adrian Hunter
674e18de7b perf intel-pt: Fix timestamp following overflow
commit 91d29b288a upstream.

timestamp_insn_cnt is used to estimate the timestamp based on the number of
instructions since the last known timestamp.

If the estimate is not accurate enough decoding might not be correctly
synchronized with side-band events causing more trace errors.

However there are always timestamps following an overflow, so the
estimate is not needed and can indeed result in more errors.

Suppress the estimate by setting timestamp_insn_cnt to zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-19 08:56:17 +02:00
Adrian Hunter
4039579fca perf intel-pt: Fix error recovery from missing TIP packet
commit 1c196a6c77 upstream.

When a TIP packet is expected but there is a different packet, it is an
error. However the unexpected packet might be something important like a
TSC packet, so after the error, it is necessary to continue from there,
rather than the next packet. That is achieved by setting pkt_step to
zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-19 08:56:17 +02:00
Adrian Hunter
0733facf3b perf intel-pt: Fix sync_switch
commit 63d8e38f6a upstream.

sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.

The flag when sync_switch is enabled was global to the decoding, whereas
it is really specific to the CPU.

The trace data for different CPUs is put on different queues, so add
sync_switch to the intel_pt_queue structure and use that in preference
to the global setting in the intel_pt structure.

That fixes problems decoding one CPU's trace because sync_switch was
disabled on a different CPU's queue.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-19 08:56:17 +02:00
Adrian Hunter
ff295906bd perf intel-pt: Fix overlap detection to identify consecutive buffers correctly
commit 117db4b27b upstream.

Overlap detection was not not updating the buffer's 'consecutive' flag.
Marking buffers consecutive has the advantage that decoding begins from
the start of the buffer instead of the first PSB. Fix overlap detection
to identify consecutive buffers correctly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-19 08:56:17 +02:00
Jiri Olsa
6a88a999c4 perf tools: Fix copyfile_offset update of output offset
[ Upstream commit fa1195ccc0 ]

We need to increase output offset in each iteration, not decrease it as
we currently do.

I guess we were lucky to finish in most cases in first iteration, so the
bug never showed. However it shows a lot when working with big (~4GB)
size data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 9c9f5a2f19 ("perf tools: Introduce copyfile_offset() function")
Link: http://lkml.kernel.org/r/20180109133923.25406-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12 12:32:21 +02:00
Jin Yao
01ff15fcf4 perf report: Fix a no annotate browser displayed issue
[ Upstream commit 40c39e3046 ]

When enabling '-b' option in perf record, for example,

  perf record -b ...
  perf report

and then browsing the annotate browser from perf report (press 'A'), it
would fail (annotate browser can't be displayed).

It's because the '.add_entry_cb' op of struct report is overwritten by
hist_iter__branch_callback() in builtin-report.c. But this function doesn't do
something like mapping symbols and sources. So next, do_annotate() will return
directly.

        notes = symbol__annotation(act->ms.sym);
        if (!notes->src)
                return 0;

This patch adds the lost code to hist_iter__branch_callback (refer to
hist_iter__report_callback).

v2:

Fix a crash bug when perform 'perf report --stdio'.

The reason is that we init the symbol annotation only in browser mode, it
doesn't allocate/init resources for stdio mode.

So now in hist_iter__branch_callback(), it will return directly if it's not in
browser mode.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1514284963-18587-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12 12:32:16 +02:00
Mengting Zhang
93b8f4a230 perf evsel: Enable ignore_missing_thread for pid option
[ Upstream commit ca8000684e ]

While monitoring a multithread process with pid option, perf sometimes
may return sys_perf_event_open failure with 3(No such process) if any of
the process's threads die before we open the event. However, we want
perf continue monitoring the remaining threads and do not exit with
error.

Here, the patch enables perf_evsel::ignore_missing_thread for -p option
to ignore complete failure if any of threads die before we open the event.
But it may still return sys_perf_event_open failure with 22(Invalid) if we
monitors several event groups.

        sys_perf_event_open: pid 28960  cpu 40  group_fd 118202  flags 0x8
        sys_perf_event_open: pid 28961  cpu 40  group_fd 118203  flags 0x8
        WARNING: Ignored open failure for pid 28962
        sys_perf_event_open: pid 28962  cpu 40  group_fd [118203]  flags 0x8
        sys_perf_event_open failed, error -22

That is because when we ignore a missing thread, we change the thread_idx
without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
may return a wrong group_fd for the next thread and sys_perf_event_open()
return with 22.

        sys_perf_event_open(){
           ...
           if (group_fd != -1)
               perf_fget_light()//to get corresponding group_leader by group_fd
           ...
           if (group_leader)
              if (group_leader->ctx->task != ctx->task)//should on the same task
                   goto err_context
           ...
        }

This patch also fixes this bug by introducing perf_evsel__remove_fd() and
update_fds to allow removing fds for the missing thread.

Changes since v1:
- Change group_fd__remove() into a more genetic way without changing code logic
- Remove redundant condition

Changes since v2:
- Use a proper function name and add some comment.
- Multiline comment style fixes.

Committer testing:

Before this patch the recently added 'perf stat --per-thread' for system
wide counting would race while enumerating all threads using /proc:

  [root@jouet ~]# perf stat --per-thread
  failed to parse CPUs map: No such file or directory

   Usage: perf stat [<options>] [<command>]

      -C, --cpu <cpu>       list of cpus to monitor in system-wide
      -a, --all-cpus        system-wide collection from all CPUs
  [root@jouet ~]# perf stat --per-thread
  failed to parse CPUs map: No such file or directory

   Usage: perf stat [<options>] [<command>]

      -C, --cpu <cpu>       list of cpus to monitor in system-wide
      -a, --all-cpus        system-wide collection from all CPUs
  [root@jouet ~]#

When, say, the kernel was being built, so lots of shortlived threads,
after this patch this doesn't happen.

Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Cheng Jian <cj.chengjian@huawei.com>
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
[ Remove one use 'evlist' alias variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12 12:32:12 +02:00
Masami Hiramatsu
d606bac136 perf probe: Add warning message if there is unexpected event name
[ Upstream commit 9f5c6d8777 ]

This improve the error message so that user can know event-name error
before writing new events to kprobe-events interface.

E.g.
   ======
   #./perf probe -x /lib64/libc-2.25.so malloc_get_state*
   Internal error: "malloc_get_state@GLIBC_2" is an invalid event name.
     Error: Failed to add events.
   ======

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275040665.24652.5188568529237584489.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12 12:32:12 +02:00
Masami Hiramatsu
3efc86f667 perf probe: Find versioned symbols from map
[ Upstream commit 4b3a2716dd ]

Commit d80406453a ("perf symbols: Allow user probes on versioned
symbols") allows user to find default versioned symbols (with "@@") in
map. However, it did not enable normal versioned symbol (with "@") for
perf-probe.  E.g.

  =====
  # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
  Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so
    Error: Failed to add events.
  =====

This solves above issue by improving perf-probe symbol search function,
as below.

  =====
  # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
  Added new event:
    probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so)

  You can now use it in all perf tools, such as:

	  perf record -e probe_libc:malloc_get_state -aR sleep 1

  # ./perf probe -l
    probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
  =====

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12 12:32:12 +02:00
Ilya Pronin
b69902a420 perf stat: Fix CVS output format for non-supported counters
commit 40c21898ba upstream.

When printing stats in CSV mode, 'perf stat' appends extra separators
when a counter is not supported:

<not supported>,,L1-dcache-store-misses,mesos/bd442f34-2b4a-47df-b966-9b281f9f56fc,0,100.00,,,,

Which causes a failure when parsing fields. The numbers of separators
should be the same for each line, no matter if the counter is or not
supported.

Signed-off-by: Ilya Pronin <ipronin@twitter.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20180306064353.31930-1-xiyou.wangcong@gmail.com
Fixes: 92a61f6412 ("perf stat: Implement CSV metrics output")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28 18:24:48 +02:00
Thomas Richter
e6fb81cb22 perf annotate: Fix objdump comment parsing for Intel mov dissassembly
[ Upstream commit 35a8a148d8 ]

The command 'perf annotate' parses the output of objdump and also
investigates the comments produced by objdump. For example the
output of objdump produces (on x86):

23eee:  4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15
                                # 234008 <stderr@@GLIBC_2.2.5+0x9a8>

and the function mov__parse() is called to investigate the complete
line. Mov__parse() breaks this line into several parts and finally
calls function comment__symbol() to parse the data after the comment
character '#'. Comment__symbol() expects a hexadecimal address followed
by a symbol in '<' and '>' brackets.

However the 2nd parameter given to function comment__symbol()
always points to the comment character '#'. The address parsing
always returns 0 because the character '#' is not a digit and
strtoull() fails without being noticed.

Fix this by advancing the second parameter to function comment__symbol()
by one byte before invocation and add an error check after strtoull()
has been called.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: 6de783b6f5 ("perf annotate: Resolve symbols using objdump comment")
Link: http://lkml.kernel.org/r/20171128075632.72182-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19 08:42:52 +01:00
Thomas Richter
f9b186caa0 perf annotate: Fix unnecessary memory allocation for s390x
[ Upstream commit 36c263607d ]

This patch fixes a bug introduced with commit d9f8dfa9ba ("perf
annotate s390: Implement jump types for perf annotate").

'perf annotate' displays annotated assembler output by reading output of
command objdump and parsing the disassembled lines. For each shown
mnemonic this function sequence is executed:

  disasm_line__new()
  |
  +--> disasm_line__init_ins()
       |
       +--> ins__find()
            |
            +--> arch->associate_instruction_ops()

The s390x specific function assigned to function pointer
associate_instruction_ops refers to function s390__associate_ins_ops().

This function checks for supported mnemonics and assigns a NULL pointer
to unsupported mnemonics.  However even the NULL pointer is added to the
architecture dependend instruction array.

This leads to an extremely large architecture instruction array
(due to array resize logic in function arch__grow_instructions()).

Depending on the objdump output being parsed the array can end up
with several ten-thousand elements.

This patch checks if a mnemonic is supported and only adds supported
ones into the architecture instruction array. The array does not contain
elements with NULL pointers anymore.

Before the patch (With some debug printf output):

[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxbb

real	8m49.679s
user	7m13.008s
sys	0m1.649s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
			/tmp/xxxbb | tail -1
__ins__find sorted:1 nr_instructions:87433 ins:0x341583c0
[root@s35lp76 perf]#

The number of different s390x branch/jump/call/return instructions
entered into the array is 87433.

After the patch (With some printf debug output:)

[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxaa

real	1m24.553s
user	0m0.587s
sys	0m1.530s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
			/tmp/xxxaa | tail -1
__ins__find sorted:1 nr_instructions:56 ins:0x3f406570
[root@s35lp76 perf]#

The number of different s390x branch/jump/call/return instructions
entered into the array is 56 which is sensible.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20171124094637.55558-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19 08:42:52 +01:00
Adrian Hunter
c192a793f0 perf tools: Fix trigger class trigger_on()
commit de19e5c3c5 upstream.

trigger_on() means that the trigger is available but not ready, however
trigger_on() was making it ready. That can segfault if the signal comes
before trigger_ready(). e.g. (USR2 signal delivery not shown)

  $ perf record -e intel_pt//u -S sleep 1
  perf: Segmentation fault
  Obtained 16 stack frames.
  /home/ahunter/bin/perf(sighandler_dump_stack+0x40) [0x4ec550]
  /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
  /home/ahunter/bin/perf(perf_evsel__disable+0x26) [0x4b9dd6]
  /home/ahunter/bin/perf() [0x43a45b]
  /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
  /lib/x86_64-linux-gnu/libc.so.6(__xstat64+0x15) [0x7fa7641d2cc5]
  /home/ahunter/bin/perf() [0x4ec6c9]
  /home/ahunter/bin/perf() [0x4ec73b]
  /home/ahunter/bin/perf() [0x4ec73b]
  /home/ahunter/bin/perf() [0x4ec73b]
  /home/ahunter/bin/perf() [0x4eca15]
  /home/ahunter/bin/perf(machine__create_kernel_maps+0x257) [0x4f0b77]
  /home/ahunter/bin/perf(perf_session__new+0xc0) [0x4f86f0]
  /home/ahunter/bin/perf(cmd_record+0x722) [0x43c132]
  /home/ahunter/bin/perf() [0x4a11ae]
  /home/ahunter/bin/perf(main+0x5d4) [0x427fb4]

Note, for testing purposes, this is hard to hit unless you add some sleep()
in builtin-record.c before record__open().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Fixes: 3dcc4436fa ("perf tools: Introduce trigger class")
Link: http://lkml.kernel.org/r/1519807144-30694-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15 10:54:34 +01:00
Thomas Richter
6f8a0b0952 perf test: Fix test 21 for s390x
[ Upstream commit 996548499d ]

Test case 21 (Number of exit events of a simple workload) fails on
s390x. The reason is the invalid sample frequency supplied for this
test. On s390x the minimum sample frequency is much higher (see output
of /proc/service_levels).

Supply a save sample frequency value for s390x to fix this.  The value
will be adjusted by the s390x CPUMF frequency convertion function to a
value well below the sysctl kernel.perf_event_max_sample_rate value.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
LPU-Reference: 20171123114611.93397-1-tmricht@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-1ynblyhi1n81idpido59nt1y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:07:55 +01:00
Satheesh Rajendran
8b6c6ab154 perf bench numa: Fixup discontiguous/sparse numa nodes
[ Upstream commit 321a7c35c9 ]

Certain systems are designed to have sparse/discontiguous nodes.  On
such systems, 'perf bench numa' hangs, shows wrong number of nodes and
shows values for non-existent nodes. Handle this by only taking nodes
that are exposed by kernel to userspace.

Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1edbcd353c009e109e93d78f2f46381930c340fe.1511368645.git.sathnaga@linux.vnet.ibm.com
Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:07:55 +01:00
Jiri Olsa
7efaeefce5 perf top: Fix window dimensions change handling
[ Upstream commit 89d0aeab42 ]

The stdio perf top crashes when we change the terminal
window size. The reason is that we assumed we get the
perf_top pointer as a signal handler argument which is
not the case.

Changing the SIGWINCH handler logic to change global
resize variable, which is checked in the main thread
loop.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ysuzwz77oev1ftgvdscn9bpu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:07:55 +01:00
Thomas Richter
475e6b835d perf test shell: Fix check open filename arg using 'perf trace' on s390x
[ Upstream commit ccafc38f1c ]

This 'perf test' case fails on s390x. The 'touch' command on s390x uses
the 'openat' system call to open the file named on the command line:

[root@s35lp76 perf]# perf probe -l
  probe:vfs_getname    (on getname_flags:72@fs/namei.c with pathname)
[root@s35lp76 perf]# perf trace -e open touch /tmp/abc
     0.400 ( 0.015 ms): touch/27542 open(filename:
		/usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
[root@s35lp76 perf]#

There is no 'open' system call for file '/tmp/abc'. Instead the 'openat'
system call is used:

[root@s35lp76 perf]# strace touch /tmp/abc
    execve("/usr/bin/touch", ["touch", "/tmp/abc"], 0x3ffd547ec98
			/* 30 vars */) = 0
    [...]
    openat(AT_FDCWD, "/tmp/abc", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
    [...]

On s390x the 'egrep' command does not find a matching pattern and
returns an error.

Fix this for s390x create a platform dependent command line to enable
the 'perf probe' call to listen to the 'openat' system call and get the
expected output.

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
LPU-Reference: 20171114071847.2381-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-3qf38jk0prz54rhmhyu871my@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:07:54 +01:00
Ravi Bangoria
863b61caae perf annotate: Do not truncate instruction names at 6 chars
[ Upstream commit 05d0e62d9f ]

There are many instructions, esp on PowerPC, whose mnemonics are longer
than 6 characters. Using precision limit causes truncation of such
mnemonics.

Fix this by removing precision limit. Note that, 'width' is still 6, so
alignment won't get affected for length <= 6.

Before:

   li     r11,-1
   xscvdp vs1,vs1
   add.   r10,r10,r11

After:

  li     r11,-1
  xscvdpsxds vs1,vs1
  add.   r10,r10,r11

Reported-by: Donald Stence <dstence@us.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20171114032540.4564-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:07:54 +01:00
Namhyung Kim
182d948c7a perf help: Fix a bug during strstart() conversion
[ Upstream commit af98f2273f ]

The commit 8e99b6d453 changed prefixcmp() to strstart() but missed to
change the return value in some place.  It makes perf help print
annoying output even for sane config items like below:

  $ perf help
  '.root': unsupported man viewer sub key.
  ...

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Sihyeon Jang <uneedsihyeon@gmail.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20171114001542.GA16464@sejong
Fixes: 8e99b6d453 ("tools include: Adopt strstarts() from the kernel")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:07:54 +01:00
Andi Kleen
bfb3906919 perf record: Fix -c/-F options for cpu event aliases
[ Upstream commit 59622fd496 ]

The Intel PMU event aliases have a implicit period= specifier to set the
default period.

Unfortunately this breaks overriding these periods with -c or -F,
because the alias terms look like they are user specified to the
internal parser, and user specified event qualifiers override the
command line options.

Track that they are coming from aliases by adding a "weak" state to the
term. Any weak terms don't override command line options.

I only did it for -c/-F for now, I think that's the only case that's
broken currently.

Before:

$ perf record -c 1000 -vv -e uops_issued.any
...
  { sample_period, sample_freq }   2000003

After:

$ perf record -c 1000 -vv -e uops_issued.any
...
  { sample_period, sample_freq }   1000

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171020202755.21410-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:07:54 +01:00
Levin, Alexander (Sasha Levin)
b9870f8581 kmemcheck: remove whats left of NOTRACK flags
commit d8be75663c upstream.

Now that kmemcheck is gone, we don't need the NOTRACK flags.

Link: http://lkml.kernel.org/r/20171007030159.22241-5-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Hansen <devtimhansen@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22 15:42:23 +01:00
Randy Dunlap
da8eb8ad0e x86/decoder: Fix and update the opcodes map
commit f5b5fab178 upstream

Update x86-opcode-map.txt based on the October 2017 Intel SDM publication.
Fix INVPID to INVVPID.
Add UD0 and UD1 instruction opcodes.

Also sync the objtool and perf tooling copies of this file.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/aac062d7-c0f6-96e3-5c92-ed299e2bd3da@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-29 17:53:42 +01:00
Martin Kepplinger
6d29ae49cd perf tools: Fix leaking rec_argv in error cases
[ Upstream commit c896f85a7c ]

Let's free the allocated rec_argv in case we return early, in order to
avoid leaking memory.

This adds free() at a few very similar places across the tree where it
was missing.

Signed-off-by: Martin Kepplinger <martink@posteo.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Martin kepplinger <martink@posteo.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170913191419.29806-1-martink@posteo.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-10 13:40:43 +01:00
Thomas Richter
c530d86ded perf test attr: Fix python error on empty result
[ Upstream commit 3440fe2790 ]

Commit d78ada4a76 ("perf tests attr: Do not store failed events") does
not create an event file in the /tmp directory when the
perf_open_event() system call failed.

This can lead to a situation where not /tmp/event-xx-yy-zz result file
exists at all (for example on a s390x virtual machine environment) where
no CPUMF hardware is available.

The following command then fails with a python call back chain instead
of printing failure:

  [root@s8360046 perf]# /usr/bin/python2 ./tests/attr.py -d ./tests/attr/ \
      -p ./perf -v -ttest-stat-basic
  running './tests/attr//test-stat-basic'
  Traceback (most recent call last):
    File "./tests/attr.py", line 379, in <module>
      main()
    File "./tests/attr.py", line 370, in main
      run_tests(options)
    File "./tests/attr.py", line 311, in run_tests
      Test(f, options).run()
    File "./tests/attr.py", line 300, in run
      self.compare(self.expect, self.result)
    File "./tests/attr.py", line 248, in compare
      exp_event.diff(res_event)
  UnboundLocalError: local variable 'res_event' referenced before assignment
  [root@s8360046 perf]#

This patch catches this pitfall and prints an error message instead:

  [root@s8360047 perf]# /usr/bin/python2 ./tests/attr.py -d ./tests/attr/ \
       -p ./perf  -vvv -ttest-stat-basic
  running './tests/attr//test-stat-basic'
    loading expected events
      Event event:base-stat
        fd = 1
        group_fd = -1
        flags = 0|8
        [....]
        sample_regs_user = 0
        sample_stack_user = 0
    'PERF_TEST_ATTR=/tmp/tmpJbMQMP ./perf stat -o /tmp/tmpJbMQMP/perf.data -e cycles kill >/dev/null 2>&1' ret '1', expected '1'
    loading result events
    compare
      matching [event:base-stat]
      match: [event:base-stat] matches []
      res_event is empty
  FAILED './tests/attr//test-stat-basic' - match failure
  [root@s8360047 perf]#

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
LPU-Reference: 20170913081209.39570-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-04d63nn7svfgxdhi60gq2mlm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-10 13:40:42 +01:00
Thomas Richter
6bb9d1772d perf test attr: Fix ignored test case result
[ Upstream commit 22905582f6 ]

Command perf test -v 16 (Setup struct perf_event_attr test) always
reports success even if the test case fails.  It works correctly if you
also specify -F (for don't fork).

   root@s35lp76 perf]# ./perf test -v 16
   15: Setup struct perf_event_attr               :
   --- start ---
   running './tests/attr/test-record-no-delay'
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.002 MB /tmp/tmp4E1h7R/perf.data
     (1 samples) ]
   expected task=0, got 1
   expected precise_ip=0, got 3
   expected wakeup_events=1, got 0
   FAILED './tests/attr/test-record-no-delay' - match failure
   test child finished with 0
   ---- end ----
   Setup struct perf_event_attr: Ok

The reason for the wrong error reporting is the return value of the
system() library call. It is called in run_dir() file tests/attr.c and
returns the exit status, in above case 0xff00.

This value is given as parameter to the exit() function which can only
handle values 0-0xff.

The child process terminates with exit value of 0 and the parent does
not detect any error.

This patch corrects the error reporting and prints the correct test
result.

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
LPU-Reference: 20170913081209.39570-2-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-rdube6rfcjsr1nzue72c7lqn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-10 13:40:42 +01:00
Andrei Vagin
33974a414c perf trace: Call machine__exit() at exit
Otherwise 'perf trace' leaves a temporary file /tmp/perf-vdso.so-XXXXXX.

  $ perf trace -o log true
  $ ls -l /tmp/perf-vdso.*
  -rw------- 1 root root 8192 Nov  8 03:08 /tmp/perf-vdso.so-5bCpD0

Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vasily Averin <vvs@virtuozzo.com>
Link: http://lkml.kernel.org/r/20171108002246.8924-1-avagin@openvz.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-09 10:17:32 -03:00
Jiri Olsa
a271bfaf30 perf tools: Fix eBPF event specification parsing
Looks like I've reached the new level of stupidity, adding missing braces.

Committer testing:

Given the following eBPF C filter, that will add a record when it
returns true, i.e. when the tv_nsec variable is > 2000ns, should be
built and installed via sys_bpf(), but fails to do so before this patch:

  # cat filter.c
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))

  SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
  int func(void *ctx, int err, long nsec)
  {
	  return nsec > 1000;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  #

  # perf trace -e nanosleep,filter.c usleep 1
  invalid or unsupported event: 'filter.c'
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
  #

And works again after it is applied, the nothing is inserted when the co

  # perf trace -e *sleep,filter.c usleep 1
     0.000 ( 0.066 ms): usleep/23994 nanosleep(rqtp: 0x7ffead94a0d0) = 0
  # perf trace -e *sleep,filter.c usleep 2
     0.000 ( 0.008 ms): usleep/24378 nanosleep(rqtp: 0x7fffa021ba50) ...
     0.008 (         ): perf_bpf_probe:func:(ffffffffb410cb30) tv_nsec=2000)
     0.000 ( 0.066 ms): usleep/24378  ... [continued]: nanosleep()) = 0
  #

The intent of 9445464bb8 is kept:

  # perf stat -e 'cpu/uops_executed.core,krava/'  true
  event syntax error: '..cuted.core,krava/'
                                    \___ unknown term

  valid terms: cmask,pc,event,edge,in_tx,any,ldlat,inv,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  #
  # perf stat -e 'cpu/uops_executed.core,period=1/'  true

   Performance counter stats for 'true':

           808,332      cpu/uops_executed.core,period=1/

       0.002997237 seconds time elapsed

  #

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb8 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-diea0ihbwpxfw6938huv3whj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-09 10:10:58 -03:00
Jiri Olsa
b6af53b7d6 perf tools: Add "reject" option for parse-events.l
Arnaldo reported broken builds in some distros using a newer flex
release, 2.6.4, found in Alpine Linux 3.6 and Edge, with flex not
spotting the REJECT macro:

  CC       /tmp/build/perf/util/parse-events-flex.o
  util/parse-events.l: In function 'parse_events_lex':
  /tmp/build/perf/util/parse-events-flex.c:4734:16: error: \
  'reject_used_but_not_detected' undeclared (first use in this function)

It's happening because we put the REJECT under another USER_REJECT macro
in following commit:

  9445464bb8 perf tools: Unwind properly location after REJECT

Fortunately flex provides option for force it to use REJECT, adding it
to parse-events.l.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb8 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-7kdont984mw12ijk7rji6b8p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-09 10:09:03 -03:00
Ingo Molnar
294cbd05e3 Merge branch 'linus' into perf/urgent, to pick up dependent commits
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-03 12:30:12 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Jiri Olsa
9445464bb8 perf tools: Unwind properly location after REJECT
We have defined YY_USER_ACTION to keep trace of the column location
during events parsing, but we need to clean it up when we call REJECT.

When REJECT is called, the lexer shrinks the text and re-runs the
matching, so we need to address it in resuming the previous location
value to keep it correct for error display, like:

Before:
  $ perf stat -e 'cpu/uops_executed.core,krava/'  true
  event syntax error: '..38;5;9:mi=01;05;37;41:su=48;5;196;38;5;15:sg=48;5;1\
1;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;\
21;38;50
�'
                                  \___ unknown term

After:
  $ ./perf stat -e 'cpu/uops_executed.core,krava/'  true
  event syntax error: '..cuted.core,krava/'
                                    \___ unknown term

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vug2hchlny30jfsfrumbym26@git.kernel.org
Link: http://lkml.kernel.org/r/20171009140944.GD28623@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-27 11:42:51 -03:00
Ravi Bangoria
331c7cb307 perf symbols: Fix memory corruption because of zero length symbols
Perf top is often crashing at very random locations on powerpc.  After
investigating, I found the crash only happens when sample is of zero
length symbol. Powerpc kernel has many such symbols which does not
contain length details in vmlinux binary and thus start and end
addresses of such symbols are same.

Structure

  struct sym_hist {
        u64                   nr_samples;
        u64                   period;
        struct sym_hist_entry addr[0];
  };

has last member 'addr[]' of size zero. 'addr[]' is an array of addresses
that belongs to one symbol (function). If function consist of 100
instructions, 'addr' points to an array of 100 'struct sym_hist_entry'
elements. For zero length symbol, it points to the *empty* array, i.e.
no members in the array and thus offset 0 is also invalid for such
array.

  static int __symbol__inc_addr_samples(...)
  {
        ...
        offset = addr - sym->start;
        h = annotation__histogram(notes, evidx);
        h->nr_samples++;
        h->addr[offset].nr_samples++;
        h->period += sample->period;
        h->addr[offset].period += sample->period;
        ...
  }

Here, when 'addr' is same as 'sym->start', 'offset' becomes 0, which is
valid for normal symbols but *invalid* for zero length symbols and thus
updating h->addr[offset] causes memory corruption.

Fix this by adding one dummy element for zero length symbols.

Link: https://lkml.org/lkml/2016/10/10/148
Fixes: edee44be59 ("perf annotate: Don't throw error for zero length symbols")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1508854806-10542-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-25 13:01:09 -03:00
Li Zhijian
74f8e22c15 perf test shell trace+probe_libc_inet_pton.sh: Be compatible with Debian/Ubuntu
In debian/ubuntu, libc.so is located at a different place,
/lib/x86_64-linux-gnu/libc-2.23.so, so it outputs like this when testing:

  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.040 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.040/0.040/0.040/0.000 ms
  0.000 probe_libc:inet_pton:(7f0e2db741c0))
  __GI___inet_pton (/lib/x86_64-linux-gnu/libc-2.23.so)
  getaddrinfo (/lib/x86_64-linux-gnu/libc-2.23.so)
  [0xffffa9d40f34ff4d] (/bin/ping)

Fix up the libc path to make sure this test works in more OSes.

Committer testing:

When this test fails one can use 'perf test -v', i.e. in verbose mode, where
it'll show the expected backtrace, so, after applying this test:

On Fedora 26:

  # perf test -v ping
  62: probe libc's inet_pton & backtrace it with ping       :
  --- start ---
  test child forked, pid 23322
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.058 ms
  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms
  0.000 probe_libc:inet_pton:(7fe344310d80))
  __GI___inet_pton (/usr/lib64/libc-2.25.so)
  getaddrinfo (/usr/lib64/libc-2.25.so)
  _init (/usr/bin/ping)
  test child finished with 0
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: Ok
  #

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philip Li <philip.li@intel.com>
Link: http://lkml.kernel.org/r/1508315649-18836-1-git-send-email-lizhijian@cn.fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-18 09:14:18 -03:00
Jin Yao
3d8bba9535 perf xyarray: Fix wrong processing when closing evsel fd
In current xyarray code, xyarray__max_x() returns max_y, and xyarray__max_y()
returns max_x.

It's confusing and for code logic it looks not correct.

Error happens when closing evsel fd. Let's see this scenario:

1. Allocate an fd (pseudo-code)

  perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
  {
	evsel->fd = xyarray__new(ncpus, nthreads, sizeof(int));
  }

  xyarray__new(int xlen, int ylen, size_t entry_size)
  {
	size_t row_size = ylen * entry_size;
	struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);

	xy->entry_size = entry_size;
	xy->row_size   = row_size;
	xy->entries    = xlen * ylen;
	xy->max_x      = xlen;
	xy->max_y      = ylen;
	......
  }

So max_x is ncpus, max_y is nthreads and row_size = nthreads * 4.

2. Use perf syscall and get the fd

  int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
		     struct thread_map *threads)
  {
	for (cpu = 0; cpu < cpus->nr; cpu++) {

		for (thread = 0; thread < nthreads; thread++) {
			int fd, group_fd;

			fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu],
						 group_fd, flags);

			FD(evsel, cpu, thread) = fd;
	}
  }

  static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
  {
	return &xy->contents[x * xy->row_size + y * xy->entry_size];
  }

These codes don't have issues. The issue happens in the closing of fd.

3. Close fd.

  void perf_evsel__close_fd(struct perf_evsel *evsel)
  {
	int cpu, thread;

	for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++)
		for (thread = 0; thread < xyarray__max_y(evsel->fd); ++thread) {
			close(FD(evsel, cpu, thread));
			FD(evsel, cpu, thread) = -1;
		}
  }

  Since xyarray__max_x() returns max_y (nthreads) and xyarry__max_y()
  returns max_x (ncpus), so above code is actually to be:

        for (cpu = 0; cpu < nthreads; cpu++)
                for (thread = 0; thread < ncpus; ++thread) {
                        close(FD(evsel, cpu, thread));
                        FD(evsel, cpu, thread) = -1;
                }

  It's not correct!

This change is introduced by "475fb533fb7d" ("perf evsel: Fix buffer overflow
while freeing events")

This fix is to let xyarray__max_x() return max_x (ncpus) and
let xyarry__max_y() return max_y (nthreads)

Committer note:

This was also fixed by Ravi Bangoria, who provided the same patch,
noticing the problem with 'perf record':

<quote Ravi>
I see 'perf record -p <pid>' crashes with following log:

   *** Error in `./perf': free(): invalid next size (normal): 0x000000000298b340 ***
   ======= Backtrace: =========
   /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7fd85c87e5]
   /lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f7fd85d137a]
   /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f7fd85d553c]
   ./perf(perf_evsel__close+0xb4)[0x4b7614]
   ./perf(perf_evlist__delete+0x100)[0x4ab180]
   ./perf(cmd_record+0x1d9)[0x43a5a9]
   ./perf[0x49aa2f]
   ./perf(main+0x631)[0x427841]
   /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7fd8571830]
   ./perf(_start+0x29)[0x427a59]
</>

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Fixes: d74be47673 ("perf xyarray: Save max_x, max_y")
Link: http://lkml.kernel.org/r/1508339478-26674-1-git-send-email-yao.jin@linux.intel.com
Link: http://lkml.kernel.org/r/1508327446-15302-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-18 09:09:36 -03:00
Namhyung Kim
7f0cd23615 perf buildid-list: Fix crash when processing PERF_RECORD_NAMESPACE
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL
pointer deref when he ran it on a data with namespace events.  It was
because the buildid_id__mark_dso_hit_ops lacks the namespace event
handler and perf_too__fill_default() didn't set it.

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000000000000 in ?? ()
  Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x
  +elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x
  +python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x
  (gdb) where
  #0  0x0000000000000000 in ?? ()
  #1  0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18,
      evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888,
      tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287
  #2  0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>,
      sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340
  #3  perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470,
      file_offset=file_offset@entry=1136) at util/session.c:1522
  #4  0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>,
      data_offset=<optimized out>, session=0x0) at util/session.c:1899
  #5  perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953
  #6  0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>)
      at builtin-buildid-list.c:83
  #7  cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115
  #8  0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0)
      at perf.c:296
  #9  0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348
  #10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392
  #11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536
  (gdb)

Fix it by adding a stub event handler for namespace event.

Committer testing:

Further clarifying, plain using 'perf buildid-list' will not end up in a
SEGFAULT when processing a perf.data file with namespace info:

  # perf record -a --namespaces sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ]
  # perf buildid-list | wc -l
  38
  # perf buildid-list | head -5
  e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux
  874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko
  ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko
  5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko
  f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so
  #

It is only when one asks for checking what of those entries actually had
samples, i.e. when we use either -H or --with-hits, that we will process
all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c
neither explicitely set a perf_tool.namespaces() callback nor the
default stub was set that we end up, when processing a
PERF_RECORD_NAMESPACE record, causing a SEGFAULT:

  # perf buildid-list -H
  Segmentation fault (core dumped)
  ^C
  #

Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Fixes: f3b3614a28 ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: http://lkml.kernel.org/r/20171017132900.11043-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-17 11:09:19 -03:00
Taeung Song
3f50f614d6 perf record: Fix documentation for a inexistent option '-l'
'perf record' had a '-l' option that meant "scale counter values" a very
long time ago, but it currently belongs to 'perf stat' as '-c'.  So
remove it. I found this problem in the below case.

    $ perf record -e cycles -l sleep 3
      Error: unknown switch `l

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1507907412-19813-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-17 09:05:36 -03:00
Jiri Olsa
29479bfe83 perf tools: Check wether the eBPF file exists in event parsing
Adding the check wether the eBPF file exists, to consider it
as eBPF input file. This way we can differentiate eBPF events
from events that end up with same suffix as eBPF file.

Before:

  $ perf stat -e 'cpu/uops_executed.core/'  true
  bpf: builtin compilation failed: -95, try external compiler
  WARNING:        unable to get correct kernel building directory.
  Hint:   Set correct kbuild directory using 'kbuild-dir' option in [llvm]
          section of ~/.perfconfig or set it to "" to suppress kbuild
          detection.

  event syntax error: 'cpu/uops_executed.core/'
                       \___ Failed to load cpu/uops_executed.c from source: 'version' section incorrect or lost

After:

  $ perf stat -e 'cpu/uops_executed.core/'  true

   Performance counter stats for 'true':

             181,533      cpu/uops_executed.core/:u

         0.002795447 seconds time elapsed

If user makes type in the eBPF file, we prioritize the event syntax
and show following warning:

  $ perf stat -e 'krava.c//'  true
  event syntax error: 'krava.c//'
                       \___ Cannot find PMU `krava.c'. Missing kernel support?

Reported-and-Tested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20171013083736.15037-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-13 16:45:04 -03:00
Jiri Olsa
d0e35234f6 perf hists: Add extra integrity checks to fmt_free()
Make sure the struct perf_hpp_fmt is properly unhooked before we free
it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20171013083736.15037-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-13 16:43:42 -03:00
Jiri Olsa
70b01dfd76 perf hists: Fix crash in perf_hpp__reset_output_field()
Du Changbin reported crash [1] when calling perf_hpp__reset_output_field()
after unregistering field via perf_hpp__column_unregister().

This ends up in calling following list_del* sequence on
the same format:

  perf_hpp__column_unregister:
    list_del(&format->list);
  perf_hpp__reset_output_field:
    list_del_init(&fmt->list);

where the later list_del_init might touch already freed formats.

Fixing this by replacing list_del() with list_del_init() in
perf_hpp__column_unregister().

[1] http://marc.info/?l=linux-kernel&m=149059595826019&w=2

Reported-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20171013083736.15037-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-13 16:43:33 -03:00
Mark Rutland
66ec11919a perf pmu: Unbreak perf record for arm/arm64 with events with explicit PMU
Currently, perf record is broken on arm/arm64 systems when the PMU is
specified explicitly as part of the event, e.g.

$ ./perf record -e armv8_cortex_a53/cpu_cycles/u true

In such cases, perf record fails to open events unless
perf_event_paranoid is set to -1, even if the PMU in question supports
mode exclusion. Further, even when perf_event_paranoid is toggled, no
samples are recorded.

This is an unintended side effect of commit:

  e3ba76deef ("perf tools: Force uncore events to system wide monitoring)

... which assumes that if a PMU has an associated cpu_map, it is an
uncore PMU, and forces events for such PMUs to be system-wide.

This is not true for arm/arm64 systems, which can have heterogeneous
CPUs. To account for this, multiple CPU PMUs are exposed, each with a
"cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM
PMUs do not have a "cpumask" file, and only have a "cpus" file. For the
gory details as to why, see commit:

 7e3fcffe95 ("perf pmu: Support alternative sysfs cpumask")

Given all of this, we can instead identify uncore PMUs by explicitly
checking for a "cpumask" file, and restore arm/arm64 PMU support back to
a working state. This patch does so, adding a new perf_pmu::is_uncore
field, and splitting the existing cpumask parsing so that it can be
reused.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by Will Deacon <will.deacon@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: 4.12+ <stable@vger.kernel.org>
Fixes: e3ba76deef ("perf tools: Force uncore events to system wide monitoring)
Link: http://lkml.kernel.org/r/1507315102-5942-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-09 15:48:46 -03:00
Mark Santaniello
e9516c0813 perf script: Add missing separator for "-F ip,brstack" (and brstackoff)
Prior to commit 55b9b50811 ("perf script: Support -F brstack,dso and
brstacksym,dso"), we were printing a space before the brstack data. It
seems that this space was important.  Without it, parsing is difficult.

Very sorry for the mistake.

Notice here how the "ip" and "brstack" run together:

$ perf script -F ip,brstack | head -n 1
          22e18c40x22e19e2/0x22e190b/P/-/-/0 0x22e19a1/0x22e19d0/P/-/-/0 0x22e195d/0x22e1990/P/-/-/0 0x22e18e9/0x22e1943/P/-/-/0 0x22e1a69/0x22e18c0/P/-/-/0 0x22e19f7/0x22e1a20/P/-/-/0 0x22e1910/0x22e19ee/P/-/-/0 0x22e19e2/0x22e190b/P/-/-/0 0x22e19a1/0x22e19d0/P/-/-/0 0x22e195d/0x22e1990/P/-/-/0 0x22e18e9/0x22e1943/P/-/-/0 0x22e1a69/0x22e18c0/P/-/-/0 0x22e19f7/0x22e1a20/P/-/-/0 0x22e1910/0x22e19ee/P/-/-/0 0x22e19e2/0x22e190b/P/-/-/0 0x22e19a1/0x22e19d0/P/-/-/0

After this diff, sanity is restored:

$ perf script -F ip,brstack | head -n 1
          22e18c4 0x22e19e2/0x22e190b/P/-/-/0  0x22e19a1/0x22e19d0/P/-/-/0  0x22e195d/0x22e1990/P/-/-/0  0x22e18e9/0x22e1943/P/-/-/0  0x22e1a69/0x22e18c0/P/-/-/0  0x22e19f7/0x22e1a20/P/-/-/0  0x22e1910/0x22e19ee/P/-/-/0  0x22e19e2/0x22e190b/P/-/-/0  0x22e19a1/0x22e19d0/P/-/-/0  0x22e195d/0x22e1990/P/-/-/0  0x22e18e9/0x22e1943/P/-/-/0  0x22e1a69/0x22e18c0/P/-/-/0  0x22e19f7/0x22e1a20/P/-/-/0  0x22e1910/0x22e19ee/P/-/-/0  0x22e19e2/0x22e190b/P/-/-/0  0x22e19a1/0x22e19d0/P/-/-/0

Signed-off-by: Mark Santaniello <marksan@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: 4.13+ <stable@vger.kernel.org>
Fixes: 55b9b50811 ("perf script: Support -F brstack,dso and brstacksym,dso")
Link: http://lkml.kernel.org/r/20171006080722.3442046-1-marksan@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-06 09:48:32 -03:00
Ravi Bangoria
c1fbc0cf81 perf callchain: Compare dsos (as well) for CCKEY_FUNCTION
Two functions from different binaries can have same start address. Thus,
comparing only start address in match_chain() leads to inconsistent
callchains. Fix this by adding a check for dsos as well.

Ex, https://www.spinics.net/lists/linux-perf-users/msg04067.html

Reported-by: Alexander Pozdneev <pozdneyev@gmail.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: zhangmengting@huawei.com
Link: http://lkml.kernel.org/r/20171005091234.5874-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-05 10:52:54 -03:00
Ingo Molnar
1addcd55bc Merge tag 'perf-urgent-for-mingo-4.14-20170928' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix syscalltbl build failure (Akemi Yagi)

- Fix attr.exclude_kernel setting for default cycles:p, this time for
  !root with kernel.perf_event_paranoid = -1 (Arnaldo Carvalho de Melo)

- Sync kernel ABI headers with tooling headers (Ingo Molnar)

- Remove misleading debug messages with --call-graph option (Mengting Zhang)

- Revert vmlinux symbol resolution patches for s390x (Thomas Richter)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29 19:31:46 +02:00
Thomas Richter
5357413f5c perf test: Fix vmlinux failure on s390x part 2
On s390x perf test 1 failed. It turned out that commit cf6383f73c
("perf report: Fix kernel symbol adjustment for s390x") was incorrect.

The previous implementation in dso__load_sym() is also suitable for
s390x.

Therefore this patch undoes commit cf6383f73c

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Fixes: cf6383f73c ("perf report: Fix kernel symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-2-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-v101o8k25vuja2ogosgf15yy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-28 13:01:42 -03:00