Commit Graph

18572 Commits

Author SHA1 Message Date
Sangwon Hong
577980a000 perf kmem: Document a missing option & an argument
First, 'perf kmem' has a '--force' option, but didn't document it on the
man page. So add it.

Second, the '--time' option has to get a value, but isn't documented on
the man page. Describe it.

Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1518381517-30766-1-git-send-email-qpakzk@gmail.com
[ Add blank like after --force block, as requested by Namhyung ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:42 -03:00
Jaecheol Shin
ac2c306838 perf annotate: Add missing arguments in Man page
Some options must require an argument. But input, stdio-color, cpu have
no them.  So I added it.

Signed-off-by: Jaecheol Shin <jcgod413@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20180207095205.62715-1-jcgod413@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:41 -03:00
Mathieu Poirier
796bfadd83 perf cs-etm: Properly deal with cpu maps
This patch allows the CoreSight AUX info section to fit topologies where
only a subset of all available CPUs are present, avoiding at the same
time accessing the ETM configuration areas of CPUs that have been
offlined.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1518478737-24649-1-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:41 -03:00
Mathieu Poirier
d2785de15f perf auxtrace arm: Fixing uninitialised variable
When working natively on arm64 the compiler gets pesky and complains
that variable 'i' is uninitialised, something that breaks the
compilation.  Here no further checks are needed since variable
'found_spe' can only be true if variable 'i' has been initialised as
part of the for loop.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1518467557-18505-4-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:40 -03:00
Jin Yao
147c508f30 perf tools: Use target->per_thread and target->system_wide flags
Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on 'perf
record --per-thread'. It has the effect of creating a kernel event for
each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 73c0ca1eee ("perf thread_map: Enumerate all threads from /proc")
Link: http://lkml.kernel.org/r/1518467557-18505-3-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
[Fixed checkpatch warning about line over 80 characters]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:40 -03:00
Mathieu Poirier
099c113099 perf cs-etm: Freeing allocated memory
This patch frees all the memory allocated in function
cs_etm__alloc_queue().

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1518467557-18505-2-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:39 -03:00
Jiri Olsa
ab6e9a9934 perf tests: Use arch__compare_symbol_names to compare symbols
The symbol search called by machine__find_kernel_symbol_by_name is using
internally arch__compare_symbol_names function to compare 2 symbol
names, because different archs have different ways of comparing symbols.
Mostly for skipping '.' prefixes and similar.

In test 1 when we try to find matching symbols in kallsyms and vmlinux,
by address and by symbol name. When either is found we compare the pair
symbol names  by simple strcmp, which is not good enough for reasons
explained in previous paragraph.

On powerpc this can cause lockup, because even thought we found the
pair, the compared names are different and don't match simple strcmp.
Following code path is executed, that leads to lockup:

   - we find the pair in kallsyms by sym->start
next_pair:
   - we compare the names and it fails
   - we find the pair by sym->name
   - the pair addresses match so we call goto next_pair
     because we assume the names match in this case

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 031b84c407 ("perf probe ppc: Enable matching against dot symbols automatically")
Link: http://lkml.kernel.org/r/20180215122635.24029-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:26:01 -03:00
Jiri Olsa
a73e24d240 perf tools: Do not create kernel maps in sample__resolve()
There's no need for kernel maps to be allocated at this point - sample
processing.

We search for kernel maps using the kernel map_groups in machine::kmaps
which is static. If vmlinux maps for any reason still don't exist, the
search correctly fails because they are not in the map group.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:59 -03:00
Jiri Olsa
e8f3879f76 perf machine: Remove machine__load_kallsyms()
The current machine__load_kallsyms() function has no caller, so replace
it directly with __machine__load_kallsyms().  Also remove the no_kcore
argument as it was always called with a 'true' value.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:58 -03:00
Jiri Olsa
1fb87b8e95 perf machine: Don't search for active kernel start in __machine__create_kernel_maps
We should not search for the kernel start address in
__machine__create_kernel_maps(), because it's being used in the 'report'
code path, where we are interested in kernel MMAP data address (the one
recorded via 'perf record', possibly on another machine, or an older or
newer kernel on the same machine where analysis is being performed)
instead of in current kernel address.

The __machine__create_kernel_maps() function serves purely for creating
the machines kernel maps and setting up the kmap group. The report code
path then sets the address based on the data from kernel MMAP event in
the machine__set_kernel_mmap() function.

The kallsyms search address logic is used for test code, that calls
machine__create_kernel_maps() to get current maps and calls
machine__get_running_kernel_start() to get kernel starting address.

Use machine__set_kernel_mmap() to set the kernel maps start address and
moving map_groups__fixup_end to be call when all maps are in place.

Also make __machine__create_kernel_maps static, because there's no
external user.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:57 -03:00
Jiri Olsa
05db6ff73d perf machine: Generalize machine__set_kernel_mmap()
So it could be called without event object, just with start and end
values. It will be used in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:57 -03:00
Jiri Olsa
8c7f1bb37b perf machine: Move kernel mmap name into struct machine
It simplifies and centralizes the code. The kernel mmap name is set for
machine type, which we know from the beginning, so there's no reason to
generate it every time we need it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:57 -03:00
Jiri Olsa
81f981d7ec perf machine: Free root_dir in machine__init() error path
Free root_dir in machine__init() error path.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:56 -03:00
Jiri Olsa
c396296146 perf symbols: Check if we read regular file in dso__load()
The current code in dso__load() calls is_regular_file(), but it checks
its return value only after calling symsrc__init().

That can make symsrc__init() block in elf_* functions on reading
the file if the file happens to be device and not regular one.

Call symsrc__init() only for regular files. Also remove the
symsrc__destroy() cleanup, which is not needed now, because we call
symsrc__init() only for regular files.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:56 -03:00
Jiri Olsa
c53b4bb02b tools lib symbol: Skip non-address kallsyms line
Adding check on failed attempt to parse the address and skip the line
parsing early in that case.

The address can be replaced with '(null)' string in case user don't have
enough permissions, like:

  $ cat /proc/kallsyms
      (null) A irq_stack_union
      (null) A __per_cpu_start
      ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180215122635.24029-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:25:56 -03:00
yuzhoujian
f1f8ad52f8 perf stat: Add support to print counts after a period of time
Introduce a new option to print counts after N milliseconds and update
'perf stat' documentation accordingly.

Show below is the output of the new option for perf stat.

  $ perf stat --time 2000 -e cycles -a
  Performance counter stats for 'system wide':

        157,260,423      cycles

        2.003060766 seconds time elapsed

We can print the count deltas after N milliseconds with this new
introduced option. This option is not supported with "-I" option.

In addition, according to Kangliang's patch(19afd10410), the
monitoring overhead for system-wide core event could be very high if the
interval-print parameter was below 100ms, and the limitation value is
10ms.

So the same warning will be displayed when the time is set between 10ms
to 100ms, and the minimal time is limited to 10ms. Users can make a
decision according to their spcific cases.

Committer notes:

This actually stops the workload after the specified time, then prints
the counts.

So I renamed the option to --timeout and updated the documentation to
state that it will not just print the counts after the specified time,
but will really stop the 'perf stat' session and print the counts.

The rename from 'time' to 'timeout' also fixes the build in systems
where 'time' is used by glibc and can't be used as a name of a variable,
such as centos:5 and centos:6.

Changes since v3:
- none.

Changes since v2:
- modify the time check in __run_perf_stat func to keep some consistency
  with the workload case.
- add the warning when the time is set between 10ms to 100ms.
- add the pr_err when the time is set below 10ms.

Changes since v1:
- none.

Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1517217923-8302-3-git-send-email-ufo19890607@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:18:06 -03:00
yuzhoujian
db06a269ec perf stat: Add support to print counts for fixed times
Introduce a new option to print counts for fixed number of times and
update 'perf stat' documentation accordingly.

Show below is the output of the new option for perf stat.

  $ perf stat -I 1000 --interval-count 2 -e cycles -a
  #           time             counts unit events
           1.002827089         93,884,870      cycles
           2.004231506         56,573,446      cycles

We can just print the counts for several times with this newly
introduced option. The usage of it is a little like 'vmstat', and it
should be used together with "-I" option.

  $ vmstat -n 1 2
  procs ---------memory-------------- --swap- ----io-- -system-- ------cpu---
   r  b swpd   free   buff   cache    si   so  bi   bo  in   cs us sy id wa st
   0  0    0 78270544 547484 51732076  0   0   0   20    1    1  1  0 99  0 0
   0  0    0 78270512 547484 51732080  0   0   0   16  477 1555  0  0 100 0 0

Changes since v3:
- merge interval_count check and times check to one line.
- fix the wrong indent in stat.h
- use stat_config.times instead of 'times' in cmd_stat function.

Changes since v2:
- none.

Changes since v1:
- change the name of the new option "times-print" to "interval-count".
- keep the new option interval specifically.

Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1517217923-8302-2-git-send-email-ufo19890607@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:24 -03:00
Jiri Olsa
ad52b8cb48 perf report: Add support to display group output for non group events
Add support to display group output for if non grouped events are
detected and user forces --group option. Now for non-group events
recorded like:

  $ perf record -e 'cycles,instructions' ls

you can still get group output by using --group option
in report:

  $ perf report --group --stdio
  ...
  #         Overhead  Command  Shared Object     Symbol
  # ................  .......  ................  ......................
  #
      17.67%   0.00%  ls       libc-2.25.so      [.] _IO_do_write@@GLIB
      15.59%  25.94%  ls       ls                [.] calculate_columns
      15.41%  31.35%  ls       libc-2.25.so      [.] __strcoll_l
  ...

Committer note:

We should improve on this by making sure that the first line states that
this is not a group, but since the user doesn't have to force group view
when really using grouped events (e.g. '{cycles,instructions}'), the
user better know what is being done...

Requested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180209092734.GB20449@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:24 -03:00
Jiri Olsa
8614ada0be perf report: Ask for ordered events for --tasks option
If we have the time in, keep the events in time order.

Committer notes:

Trying to be more verbose, what actual effect this will have in this particular
case?

Before and after this patch shows the artifacts:

  --- /tmp/before 2018-02-06 15:40:29.536411625 -0300
  +++ /tmp/after  2018-02-06 15:40:51.963403599 -0300
  @@ -5,34 +5,34 @@
         2540     2540     1818 |   gnome-terminal-
         3489     3489     2540 |    bash
        32433    32433     3489 |     perf
  -     32434    32434    32433 |      perf
  +     32434    32434    32433 |      make
        32441    32441    32434 |       make
        32514    32514    32441 |        make
          511      511    32514 |         sh
  -       512      512      511 |          sh
  +       512      512      511 |          install
<SNIP>

We don't have 'perf' calling 'perf' calling 'make', etc, the second
'perf' actually is 'make', i.e.  there was reordering of the relevant
PERF_RECORD_COMM and PERF_RECORD_FORK records.

Ditto for sh/install later on.

Look for FORK and COMM meta events, for those tids:

  # perf report -D | egrep 'PERF_RECORD_(FORK|COMM)' | egrep '3243[34]'
  0 14774650990679 0x1a3cd8 [0x38]: PERF_RECORD_FORK(32433:32433):(3489:3489)
  1 14774652080381 0x1d6568 [0x30]: PERF_RECORD_COMM exec: perf:32433/32433
  1 14774742473340 0x1dbb48 [0x38]: PERF_RECORD_FORK(32434:32434):(32433:32433)
  0 14774752005779 0x1a4af8 [0x30]: PERF_RECORD_COMM exec: make:32434/32434
  0 14774753997960 0x1a5578 [0x38]: PERF_RECORD_FORK(32435:32435):(32434:32434)
  0 14774756070782 0x1a5618 [0x38]: PERF_RECORD_FORK(32438:32438):(32434:32434)
  0 14774757772939 0x1a5680 [0x38]: PERF_RECORD_FORK(32440:32440):(32434:32434)
  0 14774758230600 0x1a56e8 [0x38]: PERF_RECORD_FORK(32441:32441):(32434:32434)
  #

First column is the cpu, second is the timestamp.

So they are on different CPUs, thus ring buffers, and when we don't use
the ordered_events class, we end up mixing that up, use it to take
advantage of the PERF_RECORD_FINISHED_ROUND meta events to go on
ordering the events using the PERF_SAMPLE_TIME present in the
PERF_RECORD_{FORK,COMM,EXIT,SAMPLE,etc} records in the ring buffer.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180206181813.10943-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:23 -03:00
Jiri Olsa
a7402c943b perf tools: Fix comment for sort__* compare functions
In commit 2f15bd8c6c ("perf tools: Fix "Command" sort_entry's cmp and
collapse function") we switched from pointer to string comparison.

But failed to remove related comments. Removing them and adding another
one to warn before pointer comparison in here.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180206181813.10943-18-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:23 -03:00
Jiri Olsa
fdf7c49c20 perf tests: Fix dwarf unwind for stripped binaries
When we strip the perf binary, dwarf unwind test stop
to work. The reason is that strip will remove static
function symbols, which we need to check for unwind.

This change will keep this test working in cases where
the global symbols are put into dynamic symbol table,
which is the case on x86. It still won't work on powerpc.

Making those 5 local functions global, and adding
'test_dwarf_unwind__' to their names.

Committer testing:

Before:

  # perf test dwarf
  58: DWARF unwind                               : Ok
  # strip ~/bin/perf
  # perf test dwarf
  58: DWARF unwind                               : FAILED!
  # perf test -v dwarf
  58: DWARF unwind                               :
  --- start ---
  test child forked, pid 6590
  unwind: thread map already set, dso=/home/acme/bin/perf
  <SNIP>
  unwind: access_mem addr 0x7ffce6c48098 val 48563f, offset 1144
  unwind: test__dwarf_unwind:ip = 0x4a54e5 (0xa54e5)
  got: test__dwarf_unwind 0xa54e5, expecting test__dwarf_unwind
  unwind: '':ip = 0x4a50bb (0xa50bb)
  failed: got unresolved address 0xa50bb
  unwind failed
  test child finished with -1
  ---- end ----
  DWARF unwind: FAILED!
  #

After:

  # perf test dwarf
  58: DWARF unwind                               : Ok
  # strip ~/bin/perf
  # perf test dwarf
  58: DWARF unwind                               : Ok
  #
  # perf test -v dwarf
  58: DWARF unwind                               :
  --- start ---
  test child forked, pid 7219
  unwind: thread map already set, dso=/home/acme/bin/perf
  <SNIP>
  unwind: access_mem addr 0x7fff007da2c8 val 48575f, offset 1144
  unwind: test__arch_unwind_sample:ip = 0x589044 (0x189044)
  got: test__arch_unwind_sample 0x189044, expecting test__arch_unwind_sample
  unwind: test_dwarf_unwind__thread:ip = 0x4a52f7 (0xa52f7)
  got: test_dwarf_unwind__thread 0xa52f7, expecting test_dwarf_unwind__thread
  unwind: test_dwarf_unwind__compare:ip = 0x4a5468 (0xa5468)
  got: test_dwarf_unwind__compare 0xa5468, expecting test_dwarf_unwind__compare
  unwind: bsearch:ip = 0x7f6608ae94d8 (0x394d8)
  got: bsearch 0x394d8, expecting bsearch
  unwind: test_dwarf_unwind__krava_3:ip = 0x4a54d1 (0xa54d1)
  got: test_dwarf_unwind__krava_3 0xa54d1, expecting test_dwarf_unwind__krava_3
  unwind: test_dwarf_unwind__krava_2:ip = 0x4a550b (0xa550b)
  got: test_dwarf_unwind__krava_2 0xa550b, expecting test_dwarf_unwind__krava_2
  unwind: test_dwarf_unwind__krava_1:ip = 0x4a554b (0xa554b)
  got: test_dwarf_unwind__krava_1 0xa554b, expecting test_dwarf_unwind__krava_1
  unwind: test__dwarf_unwind:ip = 0x4a5605 (0xa5605)
  got: test__dwarf_unwind 0xa5605, expecting test__dwarf_unwind
  test child finished with 0
  ---- end ----
  DWARF unwind: Ok
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180206181813.10943-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:23 -03:00
Jiri Olsa
d9c5f32240 tools lib api fs: Add sysfs__read_xll function
Adding sysfs__read_xll function to be able to read sysfs files with hex
numbers in, which do not have 0x prefix.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180206181813.10943-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:23 -03:00
Jiri Olsa
6baddfc690 tools lib api fs: Add filename__read_xll function
Adding filename__read_xll function to be able to read files with hex
numbers in, which do not have 0x prefix.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180206181813.10943-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:23 -03:00
Jiri Olsa
3233b37a71 perf script: Add --show-round-event to display PERF_RECORD_FINISHED_ROUND
Adding --show-round-event to display PERF_RECORD_FINISHED_ROUND events
like:

  # perf script --show-round-events 2>/dev/null
               yes  8591 [002] 124177.397597:         18         cpu/mem-stores/P: ff...
               yes  8591 [002] 124177.397615:          1 cpu/mem-loads,ldlat=30/P: ff...
  PERF_RECORD_FINISHED_ROUND
              perf 10380 [001] 124177.397622:          6 cpu/mem-loads,ldlat=30/P: ff...
  PERF_RECORD_FINISHED_ROUND
           swapper     0 [000] 124177.400518:         88         cpu/mem-stores/P: ff...
           swapper     0 [000] 124177.400521:         88         cpu/mem-stores/P: ff...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180206181813.10943-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:23 -03:00
Jiri Olsa
c3dec27b7f perf record: Put new line after target override warning
There's no new-line after target-override warning, now:

  $ perf record -a --per-thread
  Warning:
  SYSTEM/CPU switch overriding PER-THREAD^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ]

with patch:

  $ perf record -a --per-thread
  Warning:
  SYSTEM/CPU switch overriding PER-THREAD
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ]

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 16ad2ffb82 ("perf tools: Introduce perf_target__strerror()")
Link: http://lkml.kernel.org/r/20180206181813.10943-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 10:09:23 -03:00
Ingo Molnar
3f9e646313 Merge tag 'perf-core-for-mingo-4.17-20180215' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core fixes from Arnaldo Carvalho de Melo:

- perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf top'
  using it, making it bearable to use it in large core count systems
  such as Knights Landing/Mill Intel systems (Kan Liang)

- s/390 now uses syscall.tbl, just like x86-64 to generate the syscall
  table id -> string tables used by 'perf trace' (Hendrik Brueckner)

- Use strtoull() instead of home grown function (Andy Shevchenko)

- Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)

- Document missing 'perf data --force' option (Sangwon Hong)

- Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William Cohen)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16 09:10:09 +01:00
Paul E. McKenney
a7c8655b07 sched/isolation: Eliminate NO_HZ_FULL_ALL
Commit 6f1982fedd ("sched/isolation: Handle the nohz_full= parameter")
broke CONFIG_NO_HZ_FULL_ALL=y kernels.  This breakage is due to the code
under CONFIG_NO_HZ_FULL_ALL failing to invoke the shiny new housekeeping
functions.  This means that rcutorture scenario TREE04 now emits RCU CPU
stall warnings due to the RCU grace-period kthreads not being awakened
at a time of their choosing, or perhaps even not at all:

[   27.731422] rcu_bh kthread starved for 21001 jiffies! g18446744073709551369 c18446744073709551368 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=3
[   27.731423] rcu_bh          I14936     9      2 0x80080000
[   27.731435] Call Trace:
[   27.731440]  __schedule+0x31a/0x6d0
[   27.731442]  schedule+0x31/0x80
[   27.731446]  schedule_timeout+0x15a/0x320
[   27.731453]  ? call_timer_fn+0x130/0x130
[   27.731457]  rcu_gp_kthread+0x66c/0xea0
[   27.731458]  ? rcu_gp_kthread+0x66c/0xea0

Because no one has complained about CONFIG_NO_HZ_FULL_ALL=y being broken,
I hypothesize that no one is in fact using it, other than rcutorture.
This commit therefore eliminates CONFIG_NO_HZ_FULL_ALL and updates
rcutorture's config files to instead use the nohz_full= kernel parameter
to put the desired CPUs into nohz_full mode.

Fixes: 6f1982fedd ("sched/isolation: Handle the nohz_full= parameter")

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
2018-02-15 15:40:37 -08:00
Prashant Bhole
ddd0010392 selftests/net: fixes psock_fanout eBPF test case
eBPF test fails due to verifier failure because log_buf is too small.
Fixed by increasing log_buf size

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:43:04 -05:00
Brenda J. Butler
95ce14c3ff tools: tc-testing: Update README and TODO
Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:38:33 -05:00
Brenda J. Butler
c25e473686 tools: tc-testing: valgrindPlugin
Run the command under test under valgrind.  Produce an extra set of
tap output for the memory check on each test.

Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:38:33 -05:00
Brenda J. Butler
a13fedbe56 tools: tc-testing: nsPlugin
Move the functionality of creating a namespace before the test suite
and destroying it afterwards to a plugin.

Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:38:33 -05:00
Brenda J. Butler
f6926e85ee tools: tc-testing: rootPlugin
Move the functionality that checks for root permissions into a plugin.

Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:38:33 -05:00
Brenda J. Butler
93707cbabc tools: tc-testing: Introduce plugin architecture
This should be a general test architecture, and yet allow specific
tests to be done.  Introduce a plugin architecture.

An individual test has 4 stages, setup/execute/verify/teardown.  Each
plugin gets a chance to run a function at each stage, plus one call
before all the tests are called ("pre" suite) and one after all the
tests are called ("post" suite).  In addition, just before each
command is executed, the plugin gets a chance to modify the command
using the "adjust_command" hook.  This makes the test suite quite
flexible.

Future patches will take some functionality out of the tdc.py script and
place it in plugins.

To use the plugins, place the implementation in the plugins directory
and run tdc.py.  It will notice the plugins and use them.

Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:38:33 -05:00
Brenda J. Butler
6fac733d9d tools: tc-testing: Refactor test-runner
Split the test_runner function into the loop part (test_runner)
and the contents (run_one_test) for maintainability.
It makes it a little easier to catch exceptions
in an individual test, and keep going (and flush a bunch
of tap results for the skipped tests).

Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:38:33 -05:00
Brenda J. Butler
f87c7f646c tools: tc-testing: Command line parms
Separate the functionality of the command line parameters into "selection"
parameters, "action" parameters and other parameters.

"Selection" parameters are for choosing which tests on which to act.
"Action" parameters are for choosing what to do with the selected tests.
"Other" parameters are for global effect (like "help" or "verbose").

With this commit, we add the ability to name a directory as another
selection mechanism.  We can accumulate a number of tests by directory,
file, category, or even by test id, instead of being constrained to
run all tests in one collection or just one test.

Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:38:33 -05:00
Hendrik Brueckner
f1d0b4cde9 Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
This reverts commit f120c7b187e6c418238710b48723ce141f467543 which is no
longer required with the introduction of a syscall.tbl on s390.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-q1lg0nvhha1tk39ri9aqalcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 10:06:15 -03:00
Hendrik Brueckner
690d22d9d4 perf s390: Rework system call table creation by using syscall.tbl
Recently, s390 uses a syscall.tbl input file to generate its system call
table and unistd uapi header files.  Hence, update mksyscalltbl to use
it as input to create the system table for perf.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-4-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-bdyhllhsq1zgxv2qx4m377y6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 10:06:08 -03:00
Hendrik Brueckner
baa6761030 perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
Grab a copy of the s390 system call table file introduced with commit
857f46bfb0 "s390/syscalls: add system call
table".

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-3-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-hpw7vdjp7g92ivgpddrp5ydq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 10:06:00 -03:00
Ingo Molnar
f091f1d6a2 tools/headers: Synchronize kernel ABI headers, v4.16-rc1
Sync the following tooling headers with the latest kernel version:

  tools/arch/powerpc/include/uapi/asm/kvm.h
  tools/arch/x86/include/asm/cpufeatures.h
  tools/include/uapi/drm/i915_drm.h
  tools/include/uapi/linux/if_link.h
  tools/include/uapi/linux/kvm.h

All the changes are new ABI additions which don't impact their use
in existing tooling.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 10:01:46 -03:00
Thomas Richter
7a92453620 perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
On Intel test case trace+probe_libc_inet_pton.sh succeeds and the
output is:

[root@f27 perf]# ./perf trace --no-syscalls
                  -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms
     0.000 probe_libc:inet_pton:(7fa40ac618a0))
              __GI___inet_pton (/usr/lib64/libc-2.26.so)
              getaddrinfo (/usr/lib64/libc-2.26.so)
              main (/usr/bin/ping)

The kernel stack unwinder is used, it is specified implicitly
as call-graph=fp (frame pointer).

On s390x only dwarf is available for stack unwinding. It is also
done in user space. This requires different parameter setup
and result checking for s390x and Intel.

This patch adds separate perf trace setup and result checking
for Intel and s390x. On s390x specify this command line to
get a call-graph and handle the different call graph result
checking:

[root@s35lp76 perf]# ./perf trace --no-syscalls
	-e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.041 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.041/0.041/0.041/0.000 ms
     0.000 probe_libc:inet_pton:(3ffb9942060))
            __GI___inet_pton (/usr/lib64/libc-2.26.so)
            gaih_inet (inlined)
            __GI_getaddrinfo (inlined)
            main (/usr/bin/ping)
            __libc_start_main (/usr/lib64/libc-2.26.so)
            _start (/usr/bin/ping)
[root@s35lp76 perf]#

Before:
[root@s8360047 perf]# ./perf test -vv 58
58: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 26349
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
0.000 probe_libc:inet_pton:(3ff925c2060))
test child finished with -1
 ---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
[root@s8360047 perf]#

After:
[root@s35lp76 perf]# ./perf test -vv 57
57: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 38708
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.038 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.038/0.038/0.038/0.000 ms
0.000 probe_libc:inet_pton:(3ff87342060))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
gaih_inet (inlined)
__GI_getaddrinfo (inlined)
main (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/ping)
test child finished with 0
 ---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
[root@s35lp76 perf]#

On Intel the test case runs unchanged and succeeds.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180117083831.101001-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:47 -03:00
Sangwon Hong
ba7e851642 perf data: Document missing --force option
Add the --force option to the man page.

Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1517831315-31490-1-git-send-email-qpakzk@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:33 -03:00
Andy Shevchenko
6677d26c8b perf tools: Substitute yet another strtoull()
Instead of home grown function let's use what library provides us.

Signed-off-by: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20180129130359.1490-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:19 -03:00
Kan Liang
8cc42de736 perf top: Check the latency of perf_top__mmap_read()
The latency of perf_top__mmap_read() should be lower than refresh time.
If not, give some hints to reduce the latency.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-18-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:06 -03:00
Kan Liang
ebebbf0823 perf top: Switch default mode to overwrite mode
perf_top__mmap_read() has a severe performance issue in the Knights
Landing/Mill platform, when monitoring heavy load systems. It costs
several minutes to finish, which is unacceptable.

Currently, 'perf top' uses the non overwrite mode. For non overwrite
mode, it tries to read everything in the ringbuffer and doesn't pause
it. Once there are lots of samples delivered persistently, the
processing time could be very long. Also, the latest samples could be
lost when the ringbuffer is full.

For overwrite mode, it takes a snapshot for the system by pausing the
ringbuffer, which could significantly reduce the processing time.  Also,
the overwrite mode always keep the latest samples.  Considering the real
time requirement for 'perf top', the overwrite mode is more suitable for
it.

Actually, 'perf top' was overwrite mode. It is changed to non overwrite
mode since commit 93fc64f144 ("perf top: Switch to non overwrite
mode"). It's better to change it back to overwrite mode by default.

For the kernel which doesn't support overwrite mode, it will fall back
to non overwrite mode.

There would be some records lost in overwrite mode because of pausing
the ringbuffer. It has little impact for the accuracy of the snapshot
and can be tolerated.

For overwrite mode, unconditionally wait 100 ms before each snapshot. It
also reduces the overhead caused by pausing ringbuffer, especially on
light load system.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-17-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:54 -03:00
Kan Liang
a1ff5b05e9 perf top: Remove lost events checking
There would be some records lost in overwrite mode because of pausing
the ringbuffer. It has little impact for the accuracy of the snapshot
and could be tolerated by 'perf top'.

Remove the lost events checking.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-16-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:43 -03:00
Kan Liang
06cc1a470a perf hists browser: Add parameter to disable lost event warning
For overwrite mode, the ringbuffer will be paused. The event lost is
expected. It needs a way to notify the browser not print the warning.

It will be used later for perf top to disable lost event warning in
overwrite mode. There is no behavior change for now.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-15-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:26 -03:00
Kan Liang
204721d7ea perf top: Add overwrite fall back
Switch to non-overwrite mode if kernel doesnot support overwrite
ringbuffer.

It's only effect when overwrite mode is supported.  No change to current
behavior.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-14-git-send-email-kan.liang@intel.com
[ Use perf_missing_features.write_backward instead of the non merged is_write_backward_fail() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:14 -03:00
Arnaldo Carvalho de Melo
9a831b3a32 perf evsel: Expose the perf_missing_features struct
As tools may need to adjust to missing features, as 'perf top' will, in
the next csets, to cope with a missing 'write_backward' feature.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-jelngl9q1ooaizvkcput9tic@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:54:53 -03:00
Kan Liang
63878a53ce perf top: Check per-event overwrite term
Per-event overwrite term is not forbidden in 'perf top', which can bring
problems. Because 'perf top' only support non-overwrite mode now.

Add new rules and check regarding to overwrite term for 'perf top'.
- All events either have same per-event term or don't have per-event
  mode setting. Otherwise, it will error out.
- Per-event overwrite term should be consistent as opts->overwrite.
  If not, updating the opts->overwrite according to per-event term.

Make it possible to support either non-overwrite or overwrite mode.
The overwrite mode is forbidden now, which will be removed when the
overwrite mode is supported later.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-12-git-send-email-kan.liang@intel.com
[ Renamed perf_top_overwrite_check to perf_top__overwrite_check, to follow existing convention ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:54:42 -03:00
Kan Liang
3effc2f165 perf mmap: Discard legacy interface for mmap read
Discards perf_mmap__read_backward() and perf_mmap__read_catchup(). No
tools use them.

There are tools still use perf_mmap__read_forward(). Keep it, but add
comments to point to the new interface for future use.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-11-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:54:17 -03:00