Commit Graph

1215123 Commits

Author SHA1 Message Date
Alexei Starovoitov
9e3b47abeb Merge branch 'add-support-cpu-v4-insns-for-rv64'
Pu Lehui says:

====================
Add support cpu v4 insns for RV64

Add support cpu v4 instructions for RV64. The relevant tests have passed as show bellow:

Summary: 6/166 PASSED, 0 SKIPPED, 0 FAILED

NOTE: ldsx_insn testcase uses fentry and needs to rely on ftrace direct call [0].
[0] https://lore.kernel.org/all/20230627111612.761164-1-suagrfillet@gmail.com/

v2:
- Use temporary reg to avoid clobbering the source reg in movs_8/16 insns. (Björn)
- Add Acked-by

v1:
https://lore.kernel.org/bpf/20230823231059.3363698-1-pulehui@huaweicloud.com
====================

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230824095001.3408573-1-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:28 -07:00
Pu Lehui
0209fd511f selftests/bpf: Enable cpu v4 tests for RV64
Enable cpu v4 tests for RV64, and the relevant tests have passed.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-8-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Pu Lehui
83cc63afab riscv, bpf: Support unconditional bswap insn
Add support unconditional bswap instruction. Since riscv is always
little-endian, just treat the unconditional scenario the same as
big-endian conversion.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-7-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Pu Lehui
3e18ff4bce riscv, bpf: Support signed div/mod insns
Add support signed div/mod instructions for RV64.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-6-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Pu Lehui
d9839f16c1 riscv, bpf: Support 32-bit offset jmp insn
Add support 32-bit offset jmp instruction for RV64.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-5-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Pu Lehui
694896ad3c riscv, bpf: Support sign-extension mov insns
Add support sign-extension mov instructions for RV64.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-4-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Pu Lehui
3d06d8163f riscv, bpf: Support sign-extension load insns
Add Support sign-extension load instructions for RV64.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-3-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Pu Lehui
469fb2c3c1 riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
For LDX_B/H/W, when zext has been inserted by verifier, it'll return 1,
and no exception handling will continue. Also, when the offset is 12-bit
value, the redundant zext inserted by the verifier is not removed. Fix
both scenarios by moving down the removal of redundant zext.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20230824095001.3408573-2-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Alexei Starovoitov
1b580c9bb6 Merge branch 'samples-bpf-remove-unmaintained-xdp-sample-utilities'
Toke Høiland-Jørgensen says:

====================
samples/bpf: Remove unmaintained XDP sample utilities

The samples/bpf directory in the kernel tree started out as a way of showcasing
different aspects of BPF functionality by writing small utility programs for
each feature. However, as the BPF subsystem has matured, the preferred way of
including userspace code with a feature has become the BPF selftests, which also
have the benefit of being consistently run as part of the BPF CI system.

As a result of this shift, the utilities in samples/bpf have seen little love,
and have slowly bitrotted. There have been sporadic cleanup patches over the
years, but it's clear that the utilities are far from maintained.

For XDP in particular, some of the utilities have been used as benchmarking aids
when implementing new kernel features, which seems to be the main reason they
have stuck around; any updates the utilities have seen have been targeted at
this use case. However, as the BPF subsystem as a whole has moved on, it has
become increasingly difficult to incorporate new features into these utilities
because they predate most of the modern BPF features (such as kfuncs and BTF).

Rather than try to update these utilities and keep maintaining them in the
kernel tree, we have ported the useful features of the utilities to the
xdp-tools package. In the porting process we also updated the utilities to take
advantage of modern BPF features, integrated them with libxdp, and polished the
user interface.

As these utilities are standalone tools, maintaining them out of tree is
simpler, and we plan to keep maintaining them in the xdp-tools repo. To direct
users of these utilities to the right place, this series removes the utilities
from samples/bpf, leaving in place only a couple of utilities whose
functionality have not yet been ported to xdp-tools.

The xdp-tools repository is located on Github at the following URL:

https://github.com/xdp-project/xdp-tools

The commits in the series removes one utility each, explaining how the
equivalent functionality can be obtained with xdp-tools.

v2:
- Add equivalent xdp-tools commands for each removed utility
v3:
- Add link to xdp-tools in the README

Toke Høiland-Jørgensen (7):
  samples/bpf: Remove the xdp_monitor utility
  samples/bpf: Remove the xdp_redirect* utilities
  samples/bpf: Remove the xdp_rxq_info utility
  samples/bpf: Remove the xdp1 and xdp2 utilities
  samples/bpf: Remove the xdp_sample_pkts utility
  samples/bpf: Cleanup .gitignore
  samples/bpf: Add note to README about the XDP utilities moved to
    xdp-tools

 samples/bpf/.gitignore                    |  12 -
 samples/bpf/Makefile                      |  48 +-
 samples/bpf/README.rst                    |   6 +
 samples/bpf/xdp1_kern.c                   | 100 ----
 samples/bpf/xdp1_user.c                   | 166 ------
 samples/bpf/xdp2_kern.c                   | 125 -----
 samples/bpf/xdp_monitor.bpf.c             |   8 -
 samples/bpf/xdp_monitor_user.c            | 118 -----
 samples/bpf/xdp_redirect.bpf.c            |  49 --
 samples/bpf/xdp_redirect_cpu.bpf.c        | 539 -------------------
 samples/bpf/xdp_redirect_cpu_user.c       | 559 --------------------
 samples/bpf/xdp_redirect_map.bpf.c        |  97 ----
 samples/bpf/xdp_redirect_map_multi.bpf.c  |  77 ---
 samples/bpf/xdp_redirect_map_multi_user.c | 232 --------
 samples/bpf/xdp_redirect_map_user.c       | 228 --------
 samples/bpf/xdp_redirect_user.c           | 172 ------
 samples/bpf/xdp_rxq_info_kern.c           | 140 -----
 samples/bpf/xdp_rxq_info_user.c           | 614 ----------------------
 samples/bpf/xdp_sample_pkts_kern.c        |  57 --
 samples/bpf/xdp_sample_pkts_user.c        | 196 -------
 20 files changed, 7 insertions(+), 3536 deletions(-)
 delete mode 100644 samples/bpf/xdp1_kern.c
 delete mode 100644 samples/bpf/xdp1_user.c
 delete mode 100644 samples/bpf/xdp2_kern.c
 delete mode 100644 samples/bpf/xdp_monitor.bpf.c
 delete mode 100644 samples/bpf/xdp_monitor_user.c
 delete mode 100644 samples/bpf/xdp_redirect.bpf.c
 delete mode 100644 samples/bpf/xdp_redirect_cpu.bpf.c
 delete mode 100644 samples/bpf/xdp_redirect_cpu_user.c
 delete mode 100644 samples/bpf/xdp_redirect_map.bpf.c
 delete mode 100644 samples/bpf/xdp_redirect_map_multi.bpf.c
 delete mode 100644 samples/bpf/xdp_redirect_map_multi_user.c
 delete mode 100644 samples/bpf/xdp_redirect_map_user.c
 delete mode 100644 samples/bpf/xdp_redirect_user.c
 delete mode 100644 samples/bpf/xdp_rxq_info_kern.c
 delete mode 100644 samples/bpf/xdp_rxq_info_user.c
 delete mode 100644 samples/bpf/xdp_sample_pkts_kern.c
 delete mode 100644 samples/bpf/xdp_sample_pkts_user.c
====================

Link: https://lore.kernel.org/r/20230824102255.1561885-1-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:59 -07:00
Toke Høiland-Jørgensen
5a9fd0f778 samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
To help users find the XDP utilities, add a note to the README about the
new location and the conversion documentation in the commit messages.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-8-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:50 -07:00
Toke Høiland-Jørgensen
91b965136d samples/bpf: Cleanup .gitignore
Remove no longer present XDP utilities from .gitignore. Apart from the
recently removed XDP utilities this also includes the previously removed
xdpsock and xsk utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-7-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:50 -07:00
Toke Høiland-Jørgensen
cced0699cb samples/bpf: Remove the xdp_sample_pkts utility
The functionality of this utility is covered by the xdpdump utility in
xdp-tools.

There's a slight difference in usage as the xdpdump utility's main focus is
to dump packets before or after they are processed by an existing XDP
program. However, xdpdump also has the --load-xdp-program switch, which
will make it attach its own program if no existing program is loaded. With
this, xdp_sample_pkts usage can be converted as:

xdp_sample_pkts eth0
  --> xdpdump --load-xdp-program eth0

To get roughly equivalent behaviour.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-6-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:50 -07:00
Toke Høiland-Jørgensen
eaca21d6ee samples/bpf: Remove the xdp1 and xdp2 utilities
The functionality of these utilities have been incorporated into the
xdp-bench utility in xdp-tools.

Equivalent functionality is:

xdp1 eth0
  --> xdp-bench drop -p parse-ip -l load-bytes eth0

xdp2 eth0
  --> xdp-bench drop -p swap-macs eth0

Note that there's a slight difference in behaviour of those examples: the
swap-macs operation of xdp-bench doesn't use the bpf_xdp_load_bytes()
helper to load the packet data, whereas the xdp2 utility did so
unconditionally. For the parse-ip action the use of bpf_xdp_load_bytes()
can be selected by the '-l load-bytes' switch, with the difference that the
xdp-bench utility will perform two separate calls to the helper, one to
load the ethernet header and another to load the IP header; where the xdp1
utility only performed one call always loading 60 bytes of data.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-5-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:50 -07:00
Toke Høiland-Jørgensen
0e445e115f samples/bpf: Remove the xdp_rxq_info utility
The functionality of this utility has been incorporated into the xdp-bench
utility in xdp-tools, by way of the --rxq-stats argument to the 'drop',
'pass' and 'tx' commands of xdp-bench.

Some examples of how to convert xdp_rxq_info invocations into equivalent
xdp-bench commands:

xdp_rxq_info -d eth0
  --> xdp-bench pass --rxq-stats eth0

xdp_rxq_info -d eth0 -a XDP_DROP -m
  --> xdp-bench drop --rxq-stats -p swap-macs eth0

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-4-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:50 -07:00
Toke Høiland-Jørgensen
91dda69b08 samples/bpf: Remove the xdp_redirect* utilities
These utilities have all been ported to xdp-tools as functions of the
xdp-bench utility. The four different utilities in samples are incorporated
as separate subcommands to xdp-bench, with most of the command line
parameters left intact, except that mandatory arguments are always
positional in xdp-bench. For full usage details see the --help output of
each command, or the xdp-bench man page.

Some examples of how to convert usage to xdp-bench are:

xdp_redirect eth0 eth1
  --> xdp-bench redirect eth0 eth1

xdp_redirect_map eth0 eth1
  --> xdp-bench redirect-map eth0 eth1

xdp_redirect_map_multi eth0 eth1 eth2 eth3
  --> xdp-bench redirect-multi eth0 eth1 eth2 eth3

xdp_redirect_cpu -d eth0 -c 0 -c 1
  --> xdp-bench redirect-cpu -c 0 -c 1 eth0

xdp_redirect_cpu -d eth0 -c 0 -c 1 -r eth1
  --> xdp-bench redirect-cpu -c 0 -c 1 eth0 -r redirect -D eth1

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-3-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:50 -07:00
Toke Høiland-Jørgensen
e7c9e73d08 samples/bpf: Remove the xdp_monitor utility
This utility has been ported as-is to xdp-tools as 'xdp-monitor'. The only
difference in usage between the samples and xdp-tools versions is that the
'-v' command line parameter has been changed to '-e' in the xdp-tools
version for consistency with the other utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-2-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:43:50 -07:00
Jim Quinlan
6dac1507a6 PCI: brcmstb: Remove stale comment
A comment says that Multi-MSI is not supported by the driver.
A past commit [1] added this feature, so the comment is
incorrect and is removed.

[1] commit 198acab177 ("PCI: brcmstb: Enable Multi-MSI")

Link: https://lore.kernel.org/r/20230623144100.34196-6-james.quinlan@broadcom.com
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
2023-08-24 17:33:59 +02:00
Jim Quinlan
8eb8c27353 PCI: brcmstb: Assert PERST# on BCM2711
The current PCIe driver assumes PERST# is asserted when probe() is invoked.
Some older versions of the 2711/RPi bootloader left PERST# unasserted, as
the Raspian OS does assert PERST# on probe().  For this reason, we assert
PERST# for BCM2711 SOCs (i.e. RPi).

Link: https://lore.kernel.org/r/20230623144100.34196-5-james.quinlan@broadcom.com
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-08-24 17:33:58 +02:00
Linus Torvalds
b5cc3833f1 Merge tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from wifi, can and netfilter.

  Fixes to fixes:

   - nf_tables:
       - GC transaction race with abort path
       - defer gc run if previous batch is still pending

  Previous releases - regressions:

   - ipv4: fix data-races around inet->inet_id

   - phy: fix deadlocking in phy_error() invocation

   - mdio: fix C45 read/write protocol

   - ipvlan: fix a reference count leak warning in ipvlan_ns_exit()

   - ice: fix NULL pointer deref during VF reset

   - i40e: fix potential NULL pointer dereferencing of pf->vf in
     i40e_sync_vsi_filters()

   - tg3: use slab_build_skb() when needed

   - mtk_eth_soc: fix NULL pointer on hw reset

  Previous releases - always broken:

   - core: validate veth and vxcan peer ifindexes

   - sched: fix a qdisc modification with ambiguous command request

   - devlink: add missing unregister linecard notification

   - wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning

   - batman:
      - do not get eth header before batadv_check_management_packet
      - fix batadv_v_ogm_aggr_send memory leak

   - bonding: fix macvlan over alb bond support

   - mlxsw: set time stamp fields also when its type is MIRROR_UTC"

* tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
  selftests: bonding: add macvlan over bond testing
  selftest: bond: add new topo bond_topo_2d1c.sh
  bonding: fix macvlan over alb bond support
  rtnetlink: Reject negative ifindexes in RTM_NEWLINK
  netfilter: nf_tables: defer gc run if previous batch is still pending
  netfilter: nf_tables: fix out of memory error handling
  netfilter: nf_tables: use correct lock to protect gc_list
  netfilter: nf_tables: GC transaction race with abort path
  netfilter: nf_tables: flush pending destroy work before netlink notifier
  netfilter: nf_tables: validate all pending tables
  ibmveth: Use dcbf rather than dcbfl
  i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
  net/sched: fix a qdisc modification with ambiguous command request
  igc: Fix the typo in the PTM Control macro
  batman-adv: Hold rtnl lock during MTU update via netlink
  igb: Avoid starting unnecessary workqueues
  can: raw: add missing refcount for memory leak fix
  can: isotp: fix support for transmission of SF without flow control
  bnx2x: new flag for track HW resource allocation
  sfc: allocate a big enough SKB for loopback selftest packet
  ...
2023-08-24 08:23:13 -07:00
Yonghong Song
001fedacc9 selftests/bpf: Add a local kptr test with no special fields
Add a local kptr test with no special fields in the struct. Without the
previous patch, the following warning will hit:

  [   44.683877] WARNING: CPU: 3 PID: 485 at kernel/bpf/syscall.c:660 bpf_obj_free_fields+0x220/0x240
  [   44.684640] Modules linked in: bpf_testmod(OE)
  [   44.685044] CPU: 3 PID: 485 Comm: kworker/u8:5 Tainted: G           OE      6.5.0-rc5-01703-g260d855e9b90 #248
  [   44.685827] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  [   44.686693] Workqueue: events_unbound bpf_map_free_deferred
  [   44.687297] RIP: 0010:bpf_obj_free_fields+0x220/0x240
  [   44.687775] Code: e8 55 17 1f 00 49 8b 74 24 08 4c 89 ef e8 e8 14 05 00 e8 a3 da e2 ff e9 55 fe ff ff 0f 0b e9 4e fe ff
                       ff 0f 0b e9 47 fe ff ff <0f> 0b e8 d9 d9 e2 ff 31 f6 eb d5 48 83 c4 10 5b 41 5c e
  [   44.689353] RSP: 0018:ffff888106467cb8 EFLAGS: 00010246
  [   44.689806] RAX: 0000000000000000 RBX: ffff888112b3a200 RCX: 0000000000000001
  [   44.690433] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8881128ad988
  [   44.691094] RBP: 0000000000000002 R08: ffffffff81370bd0 R09: 1ffff110216231a5
  [   44.691643] R10: dffffc0000000000 R11: ffffed10216231a6 R12: ffff88810d68a488
  [   44.692245] R13: ffff88810767c288 R14: ffff88810d68a400 R15: ffff88810d68a418
  [   44.692829] FS:  0000000000000000(0000) GS:ffff8881f7580000(0000) knlGS:0000000000000000
  [   44.693484] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   44.693964] CR2: 000055c7f2afce28 CR3: 000000010fee4002 CR4: 0000000000370ee0
  [   44.694513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   44.695102] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   44.695747] Call Trace:
  [   44.696001]  <TASK>
  [   44.696183]  ? __warn+0xfe/0x270
  [   44.696447]  ? bpf_obj_free_fields+0x220/0x240
  [   44.696817]  ? report_bug+0x220/0x2d0
  [   44.697180]  ? handle_bug+0x3d/0x70
  [   44.697507]  ? exc_invalid_op+0x1a/0x50
  [   44.697887]  ? asm_exc_invalid_op+0x1a/0x20
  [   44.698282]  ? btf_find_struct_meta+0xd0/0xd0
  [   44.698634]  ? bpf_obj_free_fields+0x220/0x240
  [   44.699027]  ? bpf_obj_free_fields+0x1e2/0x240
  [   44.699414]  array_map_free+0x1a3/0x260
  [   44.699763]  bpf_map_free_deferred+0x7b/0xe0
  [   44.700154]  process_one_work+0x46d/0x750
  [   44.700523]  worker_thread+0x49e/0x900
  [   44.700892]  ? pr_cont_work+0x270/0x270
  [   44.701224]  kthread+0x1ae/0x1d0
  [   44.701516]  ? kthread_blkcg+0x50/0x50
  [   44.701860]  ret_from_fork+0x34/0x50
  [   44.702178]  ? kthread_blkcg+0x50/0x50
  [   44.702508]  ret_from_fork_asm+0x11/0x20
  [   44.702880]  </TASK>

With the previous patch, there is no warnings.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824063422.203097-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:15:16 -07:00
Yonghong Song
393dc4bd92 bpf: Remove a WARN_ON_ONCE warning related to local kptr
Currently, in function bpf_obj_free_fields(), for local kptr,
a warning will be issued if the struct does not contain any
special fields. But actually the kernel seems totally okay
with a local kptr without any special fields. Permitting
no special fields also aligns with future percpu kptr which
also allows no special fields.

Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824063417.201925-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:15:16 -07:00
Chuck Lever
8073a98e95 NFSD: Fix a thinko introduced by recent trace point changes
The fixed commit erroneously removed a call to nfsd_end_grace(),
which makes calls to write_v4_end_grace() a no-op.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202308241229.68396422-oliver.sang@intel.com
Fixes: 39d432fc76 ("NFSD: trace nfsctl operations")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-24 10:56:28 -04:00
Will Shiu
74f6f59126 locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock
As following backtrace, the struct file_lock request , in posix_lock_inode
is free before ftrace function using.
Replace the ftrace function ahead free flow could fix the use-after-free
issue.

[name:report&]===============================================
BUG:KASAN: use-after-free in trace_event_raw_event_filelock_lock+0x80/0x12c
[name:report&]Read at addr f6ffff8025622620 by task NativeThread/16753
[name:report_hw_tags&]Pointer tag: [f6], memory tag: [fe]
[name:report&]
BT:
Hardware name: MT6897 (DT)
Call trace:
 dump_backtrace+0xf8/0x148
 show_stack+0x18/0x24
 dump_stack_lvl+0x60/0x7c
 print_report+0x2c8/0xa08
 kasan_report+0xb0/0x120
 __do_kernel_fault+0xc8/0x248
 do_bad_area+0x30/0xdc
 do_tag_check_fault+0x1c/0x30
 do_mem_abort+0x58/0xbc
 el1_abort+0x3c/0x5c
 el1h_64_sync_handler+0x54/0x90
 el1h_64_sync+0x68/0x6c
 trace_event_raw_event_filelock_lock+0x80/0x12c
 posix_lock_inode+0xd0c/0xd60
 do_lock_file_wait+0xb8/0x190
 fcntl_setlk+0x2d8/0x440
...
[name:report&]
[name:report&]Allocated by task 16752:
...
 slab_post_alloc_hook+0x74/0x340
 kmem_cache_alloc+0x1b0/0x2f0
 posix_lock_inode+0xb0/0xd60
...
 [name:report&]
 [name:report&]Freed by task 16752:
...
  kmem_cache_free+0x274/0x5b0
  locks_dispose_list+0x3c/0x148
  posix_lock_inode+0xc40/0xd60
  do_lock_file_wait+0xb8/0x190
  fcntl_setlk+0x2d8/0x440
  do_fcntl+0x150/0xc18
...

Signed-off-by: Will Shiu <Will.Shiu@mediatek.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-08-24 10:42:19 -04:00
Jakub Wilk
bd4c4680c0 fs/locks: Fix typo
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-08-24 10:42:19 -04:00
Stas Sergeev
bfe2e8f569 selftests: add OFD lock tests
Test the basic locking stuff on 2 fds: multiple read locks,
conflicts between read and write locks, use of len==0 for queries.
Also tests for F_UNLCK F_OFD_GETLK extension.

[ jlayton: fix unlink() pathname in selftest ]

Cc: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-api@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-08-24 10:41:47 -04:00
Ian Rogers
f85d120c46 perf jevents: Sort strings in the big C string to reduce faults
Sort the strings within the big C string based on whether they were
for a metric and then by when they were added. This helps group
related strings and reduce minor faults by approximately 10 in 1740,
about 0.57%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:11:09 -03:00
Ian Rogers
8d4b6d37ea perf pmu: Lazily load sysfs aliases
Don't load sysfs aliases for a PMU when the PMU is first created, defer
until an alias needs to be found. For the pmu-scan benchmark, average
core PMU scanning is reduced by 30.8%, and average PMU scanning by
12.6%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:10:27 -03:00
Ian Rogers
7b723dbb96 perf pmu: Be lazy about loading event info files from sysfs
Event info is only needed when an event is parsed or when merging data
from an JSON and sysfs event. Be lazy in its loading to reduce file
accesses.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:10:01 -03:00
Ian Rogers
88ed91848d perf pmu: Scan type early to fail an invalid PMU quickly
Scan sysfs PMU's type early so that format and aliases aren't
attempted to be loaded if the PMU name is invalid.

This is the case for event_pmu tokens in parse-events.y where a wildcard
name is first assumed to be a PMU name.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:09:15 -03:00
Ian Rogers
e6ff1eed35 perf pmu: Lazily add JSON events
Rather than scanning all JSON events and adding them when a PMU is
created, add the alias when the JSON event is needed.

Average core PMU scanning run time reduced by 60.2%. Average PMU
scanning run time reduced by 15%. Page faults with no events reduced by
74 page faults, 4% of total.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:08:47 -03:00
Ian Rogers
7c52f10c0d perf pmu: Cache JSON events table
Cache the JSON events table so that finding it isn't done per
event/alias.

Change the events table find so that when the PMU is given, if the PMU
has no JSON events return null.

Update usage to always use the PMU variable.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:07:39 -03:00
Ian Rogers
f63a536f03 perf pmu: Merge JSON events with sysfs at load time
Rather than load all sysfs events then parsing all JSON events and
merging with ones that already exist. When a sysfs event is loaded, look
for a corresponding JSON event and merge immediately.

To simplify the logic, early exit the perf_pmu__new_alias function if an
alias is attempted to be added twice - as merging has already been
explicitly handled.

Fix the copying of terms to a merged alias and some ENOMEM paths.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:05:09 -03:00
Ian Rogers
f26d22f1ba perf pmu: Prefer passing pmu to aliases list
The aliases list is part of the PMU. Rather than pass the aliases
list, pass the full PMU simplifying some callbacks.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:04:27 -03:00
Ian Rogers
edb217ff14 perf pmu: Parse sysfs events directly from a file
Rather than read a sysfs events file into a 256 byte char buffer, pass
the FILE* directly to the lex/yacc parser.

This avoids there being a maximum events file size.

While changing the API, constify some arguments to remove unnecessary
casts.

Allocating the read buffer decreases the performance of pmu-scan by
around 3%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:02:59 -03:00
Ian Rogers
3d5045492a perf pmu-events: Add pmu_events_table__find_event()
jevents stores events sorted by name. Add a find function that will
binary search event names avoiding the need to linearly search through
events.

Add a test in tests/pmu-events.c. If the PMU or event aren't found -1000
is returned. If the event is found but no callback function given, 0 is
returned.

This allows the find function also act as a test for existence.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:02:22 -03:00
Ian Rogers
e3edd6cf63 perf pmu-events: Reduce processed events by passing PMU
Pass the PMU to pmu_events_table__for_each_event so that entries that
don't match don't need to be processed by callback.

If a NULL PMU is passed then all PMUs are processed.

'perf bench internals pmu-scan's "Average PMU scanning" performance is
reduced by about 5% on an Intel tigerlake.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:00:09 -03:00
Rahul Rameshbabu
197d314352 HID: nvidia-shield: Reference hid_device devm allocation of input_dev name
Use hid_device for devm allocation of the input_dev name to avoid a
use-after-free. input_unregister_device would trigger devres cleanup of all
resources associated with the input_dev, free-ing the name. The name would
subsequently be used in a uevent fired at the end of unregistering the
input_dev.

Reported-by: Maxime Ripard <mripard@kernel.org>
Closes: https://lore.kernel.org/linux-input/ZOZIZCND+L0P1wJc@penguin/T/#m443f3dce92520f74b6cf6ffa8653f9c92643d4ae
Fixes: 09308562d4 ("HID: nvidia-shield: Initial driver implementation with Thunderstrike support")
Suggested-by: Maxime Ripard <mripard@kernel.org>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230824061308.222021-4-sergeantsagara@protonmail.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-08-24 15:57:58 +02:00
Rahul Rameshbabu
4794394635 HID: multitouch: Correct devm device reference for hidinput input_dev name
Reference the HID device rather than the input device for the devm
allocation of the input_dev name. Referencing the input_dev would lead to a
use-after-free when the input_dev was unregistered and subsequently fires a
uevent that depends on the name. At the point of firing the uevent, the
name would be freed by devres management.

Use devm_kasprintf to simplify the logic for allocating memory and
formatting the input_dev name string.

Reported-by: Maxime Ripard <mripard@kernel.org>
Closes: https://lore.kernel.org/linux-input/ZOZIZCND+L0P1wJc@penguin/T/#m443f3dce92520f74b6cf6ffa8653f9c92643d4ae
Fixes: c08d46aa80 ("HID: multitouch: devm conversion")
Suggested-by: Maxime Ripard <mripard@kernel.org>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rahul Rameshbabu <sergeantsagara@protonmail.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230824061308.222021-3-sergeantsagara@protonmail.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-08-24 15:57:57 +02:00
Rahul Rameshbabu
dd613a4e45 HID: uclogic: Correct devm device reference for hidinput input_dev name
Reference the HID device rather than the input device for the devm
allocation of the input_dev name. Referencing the input_dev would lead to a
use-after-free when the input_dev was unregistered and subsequently fires a
uevent that depends on the name. At the point of firing the uevent, the
name would be freed by devres management.

Use devm_kasprintf to simplify the logic for allocating memory and
formatting the input_dev name string.

Reported-by: syzbot+3a0ebe8a52b89c63739d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-input/ZOZIZCND+L0P1wJc@penguin/T/
Reported-by: Maxime Ripard <mripard@kernel.org>
Closes: https://lore.kernel.org/linux-input/ZOZIZCND+L0P1wJc@penguin/T/#m443f3dce92520f74b6cf6ffa8653f9c92643d4ae
Fixes: cce2dbdf25 ("HID: uclogic: name the input nodes based on their tool")
Suggested-by: Maxime Ripard <mripard@kernel.org>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rahul Rameshbabu <sergeantsagara@protonmail.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230824061308.222021-2-sergeantsagara@protonmail.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-08-24 15:57:57 +02:00
Ian Rogers
c4ac7f7542 perf s390 s390_cpumcfdg_dump: Don't scan all PMUs
Rather than scanning all PMUs for a counter name, scan the PMU
associated with the evsel of the sample. This is done to remove a
dependence on pmu-events.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:57:05 -03:00
Ian Rogers
9d31cb9395 perf parse-events: Improve error message for double setting
Double setting information for an event would produce an error message
associated with the PMU rather than the term that was double setting.
Improve the error message to be on the term.

Before:

  $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
  event syntax error: 'cpu/inst_retired.any,inst_retired.any/'
                       \___ Bad event or PMU

  Unabled to find PMU or event on a PMU of 'cpu'
  Run 'perf list' for a list of valid events
  $

After:

  $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
  event syntax error: '..etired.any,inst_retired.any/'
                                    \___ Bad event or PMU

  Unabled to find PMU or event on a PMU of 'cpu'

  Initial error:

  event syntax error: '..etired.any,inst_retired.any/'
                                    \___ Attempt to set event's scale twice
  Run 'perf list' for a list of valid events

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:52:35 -03:00
Ian Rogers
2e255b4f9f perf jevents: Group events by PMU
Prior to this change a cpuid would map to a list of events where the PMU
would be encoded alongside the event information. This change breaks
apart each group of events so that there is a group per PMU. A new table
is added with the PMU's name and the list of events, the original table
now holding an array of these per PMU tables.

These changes are to make it easier to get per PMU information about
events, rather than the current approach of scanning all events. The
perf binary size with BPF skeletons on x86 is reduced by about 1%. The
unidentified PMU is now always expanded to "cpu".

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:51:03 -03:00
Ian Rogers
4000519eb0 perf pmu-events: Add extra underscore to function names
Add extra underscore before "for" of pmu_events_table_for_each_event
and pmu_metrics_table_for_each_metric.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:43:19 -03:00
Ian Rogers
c3245d2093 perf pmu: Abstract alias/event struct
In order to be able to lazily compute aliases/events for a PMU, move
the struct perf_pmu_alias into pmu.c.

Add perf_pmu__find_event and perf_pmu__for_each_event that take a
callback that is called for the found event or for each event.

The layout of struct pmu and the event/alias list is unchanged but the
API is altered so that aliases are no longer directly accessed, allowing
for later changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:42:46 -03:00
Ian Rogers
5040264121 perf pmu: Make the loading of formats lazy
The sysfs format files are loaded eagerly in a PMU. Add a flag so that
we create the format but only load the contents when necessary.

Reduce the size of the value in struct perf_pmu_format and avoid holes
so there is no additional space requirement.

For "perf stat -e cycles true" this reduces the number of openat calls
from 648 to 573 (about 12%). The benchmark pmu scan speed is improved
by roughly 5%.

Before:

  $ perf bench internals pmu-scan
  Computing performance of sysfs PMU event scan for 100 times
    Average core PMU scanning took: 1061.100 usec (+- 9.965 usec)
    Average PMU scanning took: 4725.300 usec (+- 260.599 usec)

After:

  $ perf bench internals pmu-scan
  Computing performance of sysfs PMU event scan for 100 times
    Average core PMU scanning took: 989.170 usec (+- 6.873 usec)
    Average PMU scanning took: 4520.960 usec (+- 251.272 usec)

Committer testing:

On a AMD Ryzen 5950x:

Before:

  $ perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 563.466 usec (+- 1.008 usec)
    Average PMU scanning took: 1619.174 usec (+- 23.627 usec)
  $ perf stat -r5 perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 583.401 usec (+- 2.098 usec)
    Average PMU scanning took: 1677.352 usec (+- 24.636 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 553.254 usec (+- 0.825 usec)
    Average PMU scanning took: 1635.655 usec (+- 24.312 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 557.733 usec (+- 0.980 usec)
    Average PMU scanning took: 1600.659 usec (+- 23.344 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 554.906 usec (+- 0.774 usec)
    Average PMU scanning took: 1595.338 usec (+- 23.288 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 551.798 usec (+- 0.967 usec)
    Average PMU scanning took: 1623.213 usec (+- 23.998 usec)

   Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

             3276.82 msec task-clock:u                     #    0.990 CPUs utilized               ( +-  0.82% )
                   0      context-switches:u               #    0.000 /sec
                   0      cpu-migrations:u                 #    0.000 /sec
                1008      page-faults:u                    #  307.615 /sec                        ( +-  0.04% )
         12049614778      cycles:u                         #    3.677 GHz                         ( +-  0.07% )  (83.34%)
           117507478      stalled-cycles-frontend:u        #    0.98% frontend cycles idle        ( +-  0.33% )  (83.32%)
            27106761      stalled-cycles-backend:u         #    0.22% backend cycles idle         ( +-  9.55% )  (83.36%)
         33294953848      instructions:u                   #    2.76  insn per cycle
                                                           #    0.00  stalled cycles per insn     ( +-  0.03% )  (83.31%)
          6849825049      branches:u                       #    2.090 G/sec                       ( +-  0.03% )  (83.37%)
            71533903      branch-misses:u                  #    1.04% of all branches             ( +-  0.20% )  (83.30%)

              3.3088 +- 0.0302 seconds time elapsed  ( +-  0.91% )

  $

After:

  $ perf stat -r5 perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 550.702 usec (+- 0.958 usec)
    Average PMU scanning took: 1566.577 usec (+- 22.747 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 548.315 usec (+- 0.555 usec)
    Average PMU scanning took: 1565.499 usec (+- 22.760 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 548.073 usec (+- 0.555 usec)
    Average PMU scanning took: 1586.097 usec (+- 23.299 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 561.184 usec (+- 2.709 usec)
    Average PMU scanning took: 1567.153 usec (+- 22.548 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 546.987 usec (+- 0.553 usec)
    Average PMU scanning took: 1562.814 usec (+- 22.729 usec)

   Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

             3170.86 msec task-clock:u                     #    0.992 CPUs utilized               ( +-  0.22% )
                   0      context-switches:u               #    0.000 /sec
                   0      cpu-migrations:u                 #    0.000 /sec
                1010      page-faults:u                    #  318.526 /sec                        ( +-  0.04% )
         11890047674      cycles:u                         #    3.750 GHz                         ( +-  0.14% )  (83.27%)
           119090499      stalled-cycles-frontend:u        #    1.00% frontend cycles idle        ( +-  0.46% )  (83.40%)
            32502449      stalled-cycles-backend:u         #    0.27% backend cycles idle         ( +-  8.32% )  (83.30%)
         33119141261      instructions:u                   #    2.79  insn per cycle
                                                    #    0.00  stalled cycles per insn     ( +-  0.01% )  (83.37%)
          6812816561      branches:u                       #    2.149 G/sec                       ( +-  0.01% )  (83.29%)
            70157855      branch-misses:u                  #    1.03% of all branches             ( +-  0.28% )  (83.38%)

             3.19710 +- 0.00826 seconds time elapsed  ( +-  0.26% )

  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:38:04 -03:00
Paolo Abeni
9f6708a668 Merge tag 'mlx5-updates-2023-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-updates-2023-08-22

1) Patches #1..#13 From Jiri:

The goal of this patchset is to make the SF code cleaner.

Benefit from previously introduced devlink_port struct containerization
to avoid unnecessary lookups in devlink port ops.

Also, benefit from the devlink locking changes and avoid unnecessary
reference counting.

2) Patches #14,#15:

Add ability to configure proto both UDP and TCP selectors in RX and TX
directions.

* tag 'mlx5-updates-2023-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Support IPsec upper TCP protocol selector
  net/mlx5e: Support IPsec upper protocol selector field offload for RX
  net/mlx5: Store vport in struct mlx5_devlink_port and use it in port ops
  net/mlx5: Check vhca_resource_manager capability in each op and add extack msg
  net/mlx5: Relax mlx5_devlink_eswitch_get() return value checking
  net/mlx5: Return -EOPNOTSUPP in mlx5_devlink_port_fn_migratable_set() directly
  net/mlx5: Reduce number of vport lookups passing vport pointer instead of index
  net/mlx5: Embed struct devlink_port into driver structure
  net/mlx5: Don't register ops for non-PF/VF/SF port and avoid checks in ops
  net/mlx5: Remove no longer used mlx5_esw_offloads_sf_vport_enable/disable()
  net/mlx5: Introduce mlx5_eswitch_load/unload_sf_vport() and use it from SF code
  net/mlx5: Allow mlx5_esw_offloads_devlink_port_register() to register SFs
  net/mlx5: Push devlink port PF/VF init/cleanup calls out of devlink_port_register/unregister()
  net/mlx5: Push out SF devlink port init and cleanup code to separate helpers
  net/mlx5: Rework devlink port alloc/free into init/cleanup
====================

Link: https://lore.kernel.org/all/20230823051012.162483-1-saeed@kernel.org/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-24 15:23:49 +02:00
Russell Currey
eac030b22e powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
lppaca_shared_proc() takes a pointer to the lppaca which is typically
accessed through get_lppaca().  With DEBUG_PREEMPT enabled, this leads
to checking if preemption is enabled, for example:

  BUG: using smp_processor_id() in preemptible [00000000] code: grep/10693
  caller is lparcfg_data+0x408/0x19a0
  CPU: 4 PID: 10693 Comm: grep Not tainted 6.5.0-rc3 #2
  Call Trace:
    dump_stack_lvl+0x154/0x200 (unreliable)
    check_preemption_disabled+0x214/0x220
    lparcfg_data+0x408/0x19a0
    ...

This isn't actually a problem however, as it does not matter which
lppaca is accessed, the shared proc state will be the same.
vcpudispatch_stats_procfs_init() already works around this by disabling
preemption, but the lparcfg code does not, erroring any time
/proc/powerpc/lparcfg is accessed with DEBUG_PREEMPT enabled.

Instead of disabling preemption on the caller side, rework
lppaca_shared_proc() to not take a pointer and instead directly access
the lppaca, bypassing any potential preemption checks.

Fixes: f13c13a005 ("powerpc: Stop using non-architected shared_proc field in lppaca")
Signed-off-by: Russell Currey <ruscur@russell.cc>
[mpe: Rework to avoid needing a definition in paca.h and lppaca.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055317.751786-4-mpe@ellerman.id.au
2023-08-24 22:33:17 +10:00
Michael Ellerman
1aa0006676 powerpc: Don't include lppaca.h in paca.h
By adding a forward declaration for struct lppaca we can untangle paca.h
and lppaca.h. Also move get_lppaca() into lppaca.h for consistency.

Add includes of lppaca.h to some files that need it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055317.751786-3-mpe@ellerman.id.au
2023-08-24 22:33:16 +10:00
Michael Ellerman
9a6c05fe9a powerpc/pseries: Move hcall_vphn() prototype into vphn.h
Consolidate the two prototypes for hcall_vphn() into vphn.h.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055317.751786-2-mpe@ellerman.id.au
2023-08-24 22:33:16 +10:00
Michael Ellerman
c040c7488b powerpc/pseries: Move VPHN constants into vphn.h
These don't have any particularly good reason to belong in lppaca.h,
move them into their own header.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055317.751786-1-mpe@ellerman.id.au
2023-08-24 22:33:16 +10:00