Commit Graph

1172621 Commits

Author SHA1 Message Date
Stefano Garzarella
b465518dc2 vsock/loopback: use only sk_buff_head.lock to protect the packet queue
pkt_list_lock was used before commit 71dc9ec9ac ("virtio/vsock:
replace virtio_vsock_pkt with sk_buff") to protect the packet queue.
After that commit we switched to sk_buff and we are using
sk_buff_head.lock in almost every place to protect the packet queue
except in vsock_loopback_work() when we call skb_queue_splice_init().

As reported by syzbot, this caused unlocked concurrent access to the
packet queue between vsock_loopback_work() and
vsock_loopback_cancel_pkt() since it is not holding pkt_list_lock.

With the introduction of sk_buff_head, pkt_list_lock is redundant and
can cause confusion, so let's remove it and use sk_buff_head.lock
everywhere to protect the packet queue access.

Fixes: 71dc9ec9ac ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
Cc: bobby.eshleman@bytedance.com
Reported-and-tested-by: syzbot+befff0a9536049e7902e@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Reviewed-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27 08:14:39 +01:00
David S. Miller
622035847f Merge branch 'constify-sfp-phy-nodes'
Russell King says:

====================
Constify a few sfp/phy fwnodes

This series constifies a bunch of fwnode_handle pointers that are only
used to refer to but not modify the contents of the fwnode structures.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27 08:12:01 +01:00
Russell King (Oracle)
4a0faa02d4 net: phy: constify fwnode_get_phy_node() fwnode argument
fwnode_get_phy_node() does not motify the fwnode structure, so make
the argument const,

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27 08:12:01 +01:00
Russell King (Oracle)
850a8d2dc7 net: sfp: constify sfp-bus internal fwnode uses
Constify sfp-bus internal fwnode uses, since we do not modify the
fwnode structures.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27 08:12:01 +01:00
Russell King (Oracle)
a90ac762d3 net: sfp: make sfp_bus_find_fwnode() take a const fwnode
sfp_bus_find_fwnode() does not write to the fwnode, so let's make it
const.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27 08:12:01 +01:00
Faicker Mo
e3cbdcb0fb net/net_failover: fix txq exceeding warning
The failover txq is inited as 16 queues.
when a packet is transmitted from the failover device firstly,
the failover device will select the queue which is returned from
the primary device if the primary device is UP and running.
If the primary device txq is bigger than the default 16,
it can lead to the following warning:
eth0 selects TX queue 18, but real number of TX queues is 16

The warning backtrace is:
[   32.146376] CPU: 18 PID: 9134 Comm: chronyd Tainted: G            E      6.2.8-1.el7.centos.x86_64 #1
[   32.147175] Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
[   32.147730] Call Trace:
[   32.147971]  <TASK>
[   32.148183]  dump_stack_lvl+0x48/0x70
[   32.148514]  dump_stack+0x10/0x20
[   32.148820]  netdev_core_pick_tx+0xb1/0xe0
[   32.149180]  __dev_queue_xmit+0x529/0xcf0
[   32.149533]  ? __check_object_size.part.0+0x21c/0x2c0
[   32.149967]  ip_finish_output2+0x278/0x560
[   32.150327]  __ip_finish_output+0x1fe/0x2f0
[   32.150690]  ip_finish_output+0x2a/0xd0
[   32.151032]  ip_output+0x7a/0x110
[   32.151337]  ? __pfx_ip_finish_output+0x10/0x10
[   32.151733]  ip_local_out+0x5e/0x70
[   32.152054]  ip_send_skb+0x19/0x50
[   32.152366]  udp_send_skb.isra.0+0x163/0x3a0
[   32.152736]  udp_sendmsg+0xba8/0xec0
[   32.153060]  ? __folio_memcg_unlock+0x25/0x60
[   32.153445]  ? __pfx_ip_generic_getfrag+0x10/0x10
[   32.153854]  ? sock_has_perm+0x85/0xa0
[   32.154190]  inet_sendmsg+0x6d/0x80
[   32.154508]  ? inet_sendmsg+0x6d/0x80
[   32.154838]  sock_sendmsg+0x62/0x70
[   32.155152]  ____sys_sendmsg+0x134/0x290
[   32.155499]  ___sys_sendmsg+0x81/0xc0
[   32.155828]  ? _get_random_bytes.part.0+0x79/0x1a0
[   32.156240]  ? ip4_datagram_release_cb+0x5f/0x1e0
[   32.156649]  ? get_random_u16+0x69/0xf0
[   32.156989]  ? __fget_light+0xcf/0x110
[   32.157326]  __sys_sendmmsg+0xc4/0x210
[   32.157657]  ? __sys_connect+0xb7/0xe0
[   32.157995]  ? __audit_syscall_entry+0xce/0x140
[   32.158388]  ? syscall_trace_enter.isra.0+0x12c/0x1a0
[   32.158820]  __x64_sys_sendmmsg+0x24/0x30
[   32.159171]  do_syscall_64+0x38/0x90
[   32.159493]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fix that by reducing txq number as the non-existent primary-dev does.

Fixes: cfc80d9a11 ("net: Introduce net_failover driver")
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27 08:08:57 +01:00
Christophe JAILLET
02bcba0b9f regulator: Handle deferred clk
devm_clk_get() can return -EPROBE_DEFER. So it is better to return the
error code from devm_clk_get(), instead of a hard coded -ENOENT.

This gives more opportunities to successfully probe the driver.

Fixes: 8959e53244 ("regulator: fixed: add possibility to enable by clock")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/18459fae3d017a66313699c7c8456b28158b2dd0.1679819354.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-03-27 01:42:01 +01:00
Ivan Orlov
707823e7f2 9P FS: Fix wild-memory-access write in v9fs_get_acl
KASAN reported the following issue:
[   36.825817][ T5923] BUG: KASAN: wild-memory-access in v9fs_get_acl+0x1a4/0x390
[   36.827479][ T5923] Write of size 4 at addr 9fffeb37f97f1c00 by task syz-executor798/5923
[   36.829303][ T5923]
[   36.829846][ T5923] CPU: 0 PID: 5923 Comm: syz-executor798 Not tainted 6.2.0-syzkaller-18302-g596b6b709632 #0
[   36.832110][ T5923] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
[   36.834464][ T5923] Call trace:
[   36.835196][ T5923]  dump_backtrace+0x1c8/0x1f4
[   36.836229][ T5923]  show_stack+0x2c/0x3c
[   36.837100][ T5923]  dump_stack_lvl+0xd0/0x124
[   36.838103][ T5923]  print_report+0xe4/0x4c0
[   36.839068][ T5923]  kasan_report+0xd4/0x130
[   36.840052][ T5923]  kasan_check_range+0x264/0x2a4
[   36.841199][ T5923]  __kasan_check_write+0x2c/0x3c
[   36.842216][ T5923]  v9fs_get_acl+0x1a4/0x390
[   36.843232][ T5923]  v9fs_mount+0x77c/0xa5c
[   36.844163][ T5923]  legacy_get_tree+0xd4/0x16c
[   36.845173][ T5923]  vfs_get_tree+0x90/0x274
[   36.846137][ T5923]  do_new_mount+0x25c/0x8c8
[   36.847066][ T5923]  path_mount+0x590/0xe58
[   36.848147][ T5923]  __arm64_sys_mount+0x45c/0x594
[   36.849273][ T5923]  invoke_syscall+0x98/0x2c0
[   36.850421][ T5923]  el0_svc_common+0x138/0x258
[   36.851397][ T5923]  do_el0_svc+0x64/0x198
[   36.852398][ T5923]  el0_svc+0x58/0x168
[   36.853224][ T5923]  el0t_64_sync_handler+0x84/0xf0
[   36.854293][ T5923]  el0t_64_sync+0x190/0x194

Calling '__v9fs_get_acl' method in 'v9fs_get_acl' creates the
following chain of function calls:

__v9fs_get_acl
	v9fs_fid_get_acl
		v9fs_fid_xattr_get
			p9_client_xattrwalk

Function p9_client_xattrwalk accepts a pointer to u64-typed
variable attr_size and puts some u64 value into it. However,
after the executing the p9_client_xattrwalk, in some circumstances
we assign the value of u64-typed variable 'attr_size' to the
variable 'retval', which we will return. However, the type of
'retval' is ssize_t, and if the value of attr_size is larger
than SSIZE_MAX, we will face the signed type overflow. If the
overflow occurs, the result of v9fs_fid_xattr_get may be
negative, but not classified as an error. When we try to allocate
an acl with 'broken' size we receive an error, but don't process
it. When we try to free this acl, we face the 'wild-memory-access'
error (because it wasn't allocated).

This patch will add new condition to the 'v9fs_fid_xattr_get'
function, so it will return an EOVERFLOW error if the 'attr_size'
is larger than SSIZE_MAX.

In this version of the patch I simplified the condition.

In previous (v2) version of the patch I removed explicit type conversion
and added separate condition to check the possible overflow and return
an error (in v1 version I've just modified the existing condition).

Tested via syzkaller.

Suggested-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Reported-by: syzbot+cb1d16facb3cc90de5fb@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=fbbef66d9e4d096242f3617de5d14d12705b4659
Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2023-03-27 00:34:16 +00:00
Linus Torvalds
197b6b60ae Linux 6.3-rc4 2023-03-26 14:40:20 -07:00
Linus Torvalds
0ec57cfa72 Merge tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt driver fixes from Greg KH:
 "Here are a small set of USB and Thunderbolt driver fixes for reported
  problems and a documentation update, for 6.3-rc4.

  Included in here are:

   - documentation update for uvc gadget driver

   - small thunderbolt driver fixes

   - cdns3 driver fixes

   - dwc3 driver fixes

   - dwc2 driver fixes

   - chipidea driver fixes

   - typec driver fixes

   - onboard_usb_hub device id updates

   - quirk updates

  All of these have been in linux-next with no reported problems"

* tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
  usb: dwc2: fix a race, don't power off/on phy for dual-role mode
  usb: dwc2: fix a devres leak in hw_enable upon suspend resume
  usb: chipidea: core: fix possible concurrent when switch role
  usb: chipdea: core: fix return -EINVAL if request role is the same with current role
  thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
  thunderbolt: Disable interrupt auto clear for rings
  thunderbolt: Use const qualifier for `ring_interrupt_index`
  usb: gadget: Use correct endianness of the wLength field for WebUSB
  uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
  usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
  usb: cdns3: Fix issue with using incorrect PCI device function
  usb: cdnsp: Fixes issue with redundant Status Stage
  MAINTAINERS: make me a reviewer of USB/IP
  thunderbolt: Use scale field when allocating USB3 bandwidth
  thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
  thunderbolt: Call tb_check_quirks() after initializing adapters
  thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
  thunderbolt: Fix memory leak in margining
  usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
  docs: usb: Add documentation for the UVC Gadget
  ...
2023-03-26 10:22:44 -07:00
Linus Torvalds
18940c888c Merge tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:

 - Fix a corner case where vruntime of a task is not being sanitized

* tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Sanitize vruntime of entity being migrated
2023-03-26 09:18:30 -07:00
Linus Torvalds
974fc94336 Merge tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:

 - Properly clear perf event status tracking in the AMD perf event
   overflow handler

* tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/core: Always clear status for idx
2023-03-26 09:13:35 -07:00
Linus Torvalds
f6cdaeb08b Merge tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Borislav Petkov:

 - Do the delayed RCU wakeup for kthreads in the proper order so that
   former doesn't get ignored

 - A noinstr warning fix

* tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
  entry: Fix noinstr warning in __enter_from_user_mode()
2023-03-26 09:06:20 -07:00
Linus Torvalds
986c63741d Merge tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Add a AMX ptrace self test

 - Prevent a false-positive warning when retrieving the (invalid)
   address of dynamic FPU features in their init state which are not
   saved in init_fpstate at all

 - Randomize per-CPU entry areas only when KASLR is enabled

* tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/x86/amx: Add a ptrace test
  x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
  x86/mm: Do not shuffle CPU entry areas without KASLR
2023-03-26 09:01:24 -07:00
Linus Torvalds
6485ac65af Merge tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
 "Twelve cifs/smb3 client fixes (most also for stable)

   - forced umount fix

   - fix for two perf regressions

   - reconnect fixes

   - small debugging improvements

   - multichannel fixes"

* tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: fix unusable share after force unmount failure
  cifs: fix dentry lookups in directory handle cache
  smb3: lower default deferred close timeout to address perf regression
  cifs: fix missing unload_nls() in smb2_reconnect()
  cifs: avoid race conditions with parallel reconnects
  cifs: append path to open_enter trace event
  cifs: print session id while listing open files
  cifs: dump pending mids for all channels in DebugData
  cifs: empty interface list when server doesn't support query interfaces
  cifs: do not poll server interfaces too regularly
  cifs: lock chan_lock outside match_session
  cifs: check only tcon status on tcon related functions
2023-03-26 08:56:09 -07:00
Nuno Gonçalves
5f5a7d8d8b xsk: allow remap of fill and/or completion rings
The remap of fill and completion rings was frowned upon as they
control the usage of UMEM which does not support concurrent use.
At the same time this would disallow the remap of these rings
into another process.

A possible use case is that the user wants to transfer the socket/
UMEM ownership to another process (via SYS_pidfd_getfd) and so
would need to also remap these rings.

This will have no impact on current usages and just relaxes the
remap limitation.

Signed-off-by: Nuno Gonçalves <nunog@fr24.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20230324100222.13434-1-nunog@fr24.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 21:07:35 -07:00
Dave Thaler
8cfee11071 bpf, docs: Add extended call instructions
Add extended call instructions.  Uses the term "program-local" for
call by offset.  And there are instructions for calling helper functions
by "address" (the old way of using integer values), and for calling
helper functions by BTF ID (for kfuncs).

V1 -> V2: addressed comments from David Vernet

V2 -> V3: make descriptions in table consistent with updated names

V3 -> V4: addressed comments from Alexei

V4 -> V5: fixed alignment

Signed-off-by: Dave Thaler <dthaler@microsoft.com>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230326033117.1075-1-dthaler1968@googlemail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 20:58:36 -07:00
Alexei Starovoitov
8d2759608c Merge branch 'bpf: Use bpf_mem_cache_alloc/free in bpf_local_storage'
Martin KaFai Lau says:

====================

From: Martin KaFai Lau <martin.lau@kernel.org>

This set is a continuation of the effort in using
bpf_mem_cache_alloc/free in bpf_local_storage [1]

Major change is only using bpf_mem_alloc for task and cgrp storage
while sk and inode stay with kzalloc/kfree. The details is
in patch 2.

[1]: https://lore.kernel.org/bpf/20230308065936.1550103-1-martin.lau@linux.dev/

v3:
- Only use bpf_mem_alloc for task and cgrp storage.
- sk and inode storage stay with kzalloc/kfree.
- Check NULL and add comments in bpf_mem_cache_raw_free() in patch 1.
- Added test and benchmark for task storage.

v2:
- Added bpf_mem_cache_alloc_flags() and bpf_mem_cache_raw_free()
  to hide the internal data structure of the bpf allocator.
- Fixed a typo bug in bpf_selem_free()
- Simplified the test_local_storage test by directly using
  err returned from libbpf
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 19:52:52 -07:00
Martin KaFai Lau
cbe9d93d58 selftests/bpf: Add bench for task storage creation
This patch adds a task storage benchmark to the existing
local-storage-create benchmark.

For task storage,
./bench --storage-type task --batch-size 32:
   bpf_ma: Summary: creates   30.456 ± 0.507k/s ( 30.456k/prod), 6.08 kmallocs/create
no bpf_ma: Summary: creates   31.962 ± 0.486k/s ( 31.962k/prod), 6.13 kmallocs/create

./bench --storage-type task --batch-size 64:
   bpf_ma: Summary: creates   30.197 ± 1.476k/s ( 30.197k/prod), 6.08 kmallocs/create
no bpf_ma: Summary: creates   31.103 ± 0.297k/s ( 31.103k/prod), 6.13 kmallocs/create

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-6-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 19:52:52 -07:00
Martin KaFai Lau
d8db84d71c selftests/bpf: Test task storage when local_storage->smap is NULL
The current sk storage test ensures the memory free works when
the local_storage->smap is NULL.

This patch adds a task storage test to ensure the memory free
code path works when local_storage->smap is NULL.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-5-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 19:52:52 -07:00
Martin KaFai Lau
6ae9d5e99e bpf: Use bpf_mem_cache_alloc/free for bpf_local_storage
This patch uses bpf_mem_cache_alloc/free for allocating and freeing
bpf_local_storage for task and cgroup storage.

The changes are similar to the previous patch. A few things that
worth to mention for bpf_local_storage:

The local_storage is freed when the last selem is deleted.
Before deleting a selem from local_storage, it needs to retrieve the
local_storage->smap because the bpf_selem_unlink_storage_nolock()
may have set it to NULL. Note that local_storage->smap may have
already been NULL when the selem created this local_storage has
been removed. In this case, call_rcu will be used to free the
local_storage.
Also, the bpf_ma (true or false) value is needed before calling
bpf_local_storage_free(). The bpf_ma can either be obtained from
the local_storage->smap (if available) or any of its selem's smap.
A new helper check_storage_bpf_ma() is added to obtain
bpf_ma for a deleting bpf_local_storage.

When bpf_local_storage_alloc getting a reused memory, all
fields are either in the correct values or will be initialized.
'cache[]' must already be all NULLs. 'list' must be empty.
Others will be initialized.

Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-4-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 19:52:52 -07:00
Martin KaFai Lau
08a7ce384e bpf: Use bpf_mem_cache_alloc/free in bpf_local_storage_elem
This patch uses bpf_mem_alloc for the task and cgroup local storage that
the bpf prog can easily get a hold of the storage owner's PTR_TO_BTF_ID.
eg. bpf_get_current_task_btf() can be used in some of the kmalloc code
path which will cause deadlock/recursion. bpf_mem_cache_alloc is
deadlock free and will solve a legit use case in [1].

For sk storage, its batch creation benchmark shows a few percent
regression when the sk create/destroy batch size is larger than 32.
The sk creation/destruction happens much more often and
depends on external traffic. Considering it is hypothetical
to be able to cause deadlock with sk storage, it can cross
the bridge to use bpf_mem_alloc till a legit (ie. useful)
use case comes up.

For inode storage, bpf_local_storage_destroy() is called before
waiting for a rcu gp and its memory cannot be reused immediately.
inode stays with kmalloc/kfree after the rcu [or tasks_trace] gp.

A 'bool bpf_ma' argument is added to bpf_local_storage_map_alloc().
Only task and cgroup storage have 'bpf_ma == true' which
means to use bpf_mem_cache_alloc/free(). This patch only changes
selem to use bpf_mem_alloc for task and cgroup. The next patch
will change the local_storage to use bpf_mem_alloc also for
task and cgroup.

Here is some more details on the changes:

* memory allocation:
After bpf_mem_cache_alloc(), the SDATA(selem)->data is zero-ed because
bpf_mem_cache_alloc() could return a reused selem. It is to keep
the existing bpf_map_kzalloc() behavior. Only SDATA(selem)->data
is zero-ed. SDATA(selem)->data is the visible part to the bpf prog.
No need to use zero_map_value() to do the zeroing because
bpf_selem_free(..., reuse_now = true) ensures no bpf prog is using
the selem before returning the selem through bpf_mem_cache_free().
For the internal fields of selem, they will be initialized when
linking to the new smap and the new local_storage.

When 'bpf_ma == false', nothing changes in this patch. It will
stay with the bpf_map_kzalloc().

* memory free:
The bpf_selem_free() and bpf_selem_free_rcu() are modified to handle
the bpf_ma == true case.

For the common selem free path where its owner is also being destroyed,
the mem is freed in bpf_local_storage_destroy(), the owner (task
and cgroup) has gone through a rcu gp. The memory can be reused
immediately, so bpf_local_storage_destroy() will call
bpf_selem_free(..., reuse_now = true) which will do
bpf_mem_cache_free() for immediate reuse consideration.

An exception is the delete elem code path. The delete elem code path
is called from the helper bpf_*_storage_delete() and the syscall
bpf_map_delete_elem(). This path is an unusual case for local
storage because the common use case is to have the local storage
staying with its owner life time so that the bpf prog and the user
space does not have to monitor the owner's destruction. For the delete
elem path, the selem cannot be reused immediately because there could
be bpf prog using it. It will call bpf_selem_free(..., reuse_now = false)
and it will wait for a rcu tasks trace gp before freeing the elem. The
rcu callback is changed to do bpf_mem_cache_raw_free() instead of kfree().

When 'bpf_ma == false', it should be the same as before.
__bpf_selem_free() is added to do the kfree_rcu and call_tasks_trace_rcu().
A few words on the 'reuse_now == true'. When 'reuse_now == true',
it is still racing with bpf_local_storage_map_free which is under rcu
protection, so it still needs to wait for a rcu gp instead of kfree().
Otherwise, the selem may be reused by slab for a totally different struct
while the bpf_local_storage_map_free() is still using it (as a
rcu reader). For the inode case, there may be other rcu readers also.
In short, when bpf_ma == false and reuse_now == true => vanilla rcu.

[1]: https://lore.kernel.org/bpf/20221118190109.1512674-1-namhyung@kernel.org/

Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-3-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 19:52:52 -07:00
Martin KaFai Lau
e65a5c6edb bpf: Add a few bpf mem allocator functions
This patch adds a few bpf mem allocator functions which will
be used in the bpf_local_storage in a later patch.

bpf_mem_cache_alloc_flags(..., gfp_t flags) is added. When the
flags == GFP_KERNEL, it will fallback to __alloc(..., GFP_KERNEL).
bpf_local_storage knows its running context is sleepable (GFP_KERNEL)
and provides a better guarantee on memory allocation.

bpf_local_storage has some uncommon cases that its selem
cannot be reused immediately. It handles its own
rcu_head and goes through a rcu_trace gp and then free it.
bpf_mem_cache_raw_free() is added for direct free purpose
without leaking the LLIST_NODE_SZ internal knowledge.
During free time, the 'struct bpf_mem_alloc *ma' is no longer
available. However, the caller should know if it is
percpu memory or not and it can call different raw_free functions.
bpf_local_storage does not support percpu value, so only
the non-percpu 'bpf_mem_cache_raw_free()' is added in
this patch.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-2-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 19:52:51 -07:00
Alexei Starovoitov
e99360762a Merge branch 'First set of verifier/*.c migrated to inline assembly'
Eduard Zingerman says:

====================

This is a follow up for RFC [1]. It migrates a first batch of 38
verifier/*.c tests to inline assembly and use of ./test_progs for
actual execution. The migration is done by a python script (see [2]).

Each migrated verifier/xxx.c file is mapped to progs/verifier_xxx.c
plus an entry in the prog_tests/verifier.c. One patch per each file.

A few patches at the beginning of the patch-set extend test_loader
with necessary functionality, mainly:
- support for tests execution in unprivileged mode;
- support for test runs for test programs.

Migrated tests could be selected for execution using the following filter:

  ./test_progs -a verifier_*

An example of the migrated test:

  SEC("xdp")
  __description("XDP pkt read, pkt_data' > pkt_end, corner case, good access")
  __success __retval(0) __flag(BPF_F_ANY_ALIGNMENT)
  __naked void end_corner_case_good_access_1(void)
  {
          asm volatile ("                                 \
          r2 = *(u32*)(r1 + %[xdp_md_data]);              \
          r3 = *(u32*)(r1 + %[xdp_md_data_end]);          \
          r1 = r2;                                        \
          r1 += 8;                                        \
          if r1 > r3 goto l0_%=;                          \
          r0 = *(u64*)(r1 - 8);                           \
  l0_%=:  r0 = 0;                                         \
          exit;                                           \
  "       :
          : __imm_const(xdp_md_data, offsetof(struct xdp_md, data)),
            __imm_const(xdp_md_data_end, offsetof(struct xdp_md, data_end))
          : __clobber_all);
  }

Changes compared to RFC:
- test_loader.c is extended to support test program runs;
- capabilities handling now matches behavior of test_verifier;
- BPF_ST_MEM instructions are automatically replaced by BPF_STX_MEM
  instructions to overcome current clang limitations;
- tests styling updates according to RFC feedback;
- 38 migrated files are included instead of 1.

I used the following means for testing:
- migration tool itself has a set of self-tests;
- migrated tests are passing;
- manually compared each old/new file side-by-side.

While doing side-by-side comparison I've noted a few defects in the
original tests:
- and.c:
  - One of the jump targets is off by one;
  - BPF_ST_MEM wrong OFF/IMM ordering;
- array_access.c:
  - BPF_ST_MEM wrong OFF/IMM ordering;
- value_or_null.c:
  - BPF_ST_MEM wrong OFF/IMM ordering.

These defects would be addressed separately.

[1] RFC
    https://lore.kernel.org/bpf/20230123145148.2791939-1-eddyz87@gmail.com/
[2] Migration tool
    https://github.com/eddyz87/verifier-tests-migrator
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:06 -07:00
Eduard Zingerman
ffb515c933 selftests/bpf: verifier/xdp.c converted to inline assembly
Test verifier/xdp.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-43-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:06 -07:00
Eduard Zingerman
a8036aea2d selftests/bpf: verifier/xadd.c converted to inline assembly
Test verifier/xadd.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-42-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:06 -07:00
Eduard Zingerman
d15f5b68b6 selftests/bpf: verifier/var_off.c converted to inline assembly
Test verifier/var_off.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-41-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:06 -07:00
Eduard Zingerman
d330528617 selftests/bpf: verifier/value_or_null.c converted to inline assembly
Test verifier/value_or_null.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-40-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:06 -07:00
Eduard Zingerman
8f59e87a3b selftests/bpf: verifier/value.c converted to inline assembly
Test verifier/value.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-39-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:06 -07:00
Eduard Zingerman
033914942d selftests/bpf: verifier/value_adj_spill.c converted to inline assembly
Test verifier/value_adj_spill.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-38-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:06 -07:00
Eduard Zingerman
ab839a5819 selftests/bpf: verifier/uninit.c converted to inline assembly
Test verifier/uninit.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-37-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
edff37b2f2 selftests/bpf: verifier/stack_ptr.c converted to inline assembly
Test verifier/stack_ptr.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-36-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
f4fe3cfe6c selftests/bpf: verifier/spill_fill.c converted to inline assembly
Test verifier/spill_fill.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-35-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
b7e4203086 selftests/bpf: verifier/ringbuf.c converted to inline assembly
Test verifier/ringbuf.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-34-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
18cdc2b531 selftests/bpf: verifier/raw_tp_writable.c converted to inline assembly
Test verifier/raw_tp_writable.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-33-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
5a77a01f33 selftests/bpf: verifier/raw_stack.c converted to inline assembly
Test verifier/raw_stack.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-32-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
65428312e3 selftests/bpf: verifier/meta_access.c converted to inline assembly
Test verifier/meta_access.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-31-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
ade3f08fc2 selftests/bpf: verifier/masking.c converted to inline assembly
Test verifier/masking.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-30-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
05e474ecbb selftests/bpf: verifier/map_ret_val.c converted to inline assembly
Test verifier/map_ret_val.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-29-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:05 -07:00
Eduard Zingerman
caf345cf12 selftests/bpf: verifier/map_ptr.c converted to inline assembly
Test verifier/map_ptr.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-28-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
583c7ce5be selftests/bpf: verifier/leak_ptr.c converted to inline assembly
Test verifier/leak_ptr.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-27-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
e297875580 selftests/bpf: verifier/ld_ind.c converted to inline assembly
Test verifier/ld_ind.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-26-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
01481e67dd selftests/bpf: verifier/int_ptr.c converted to inline assembly
Test verifier/int_ptr.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-25-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
ecc424827b selftests/bpf: verifier/helper_value_access.c converted to inline assembly
Test verifier/helper_value_access.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-24-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
77aa2563cb selftests/bpf: verifier/helper_restricted.c converted to inline assembly
Test verifier/helper_restricted.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-23-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
fb179fe69e selftests/bpf: verifier/helper_packet_access.c converted to inline assembly
Test verifier/helper_packet_access.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-22-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
b37d776b43 selftests/bpf: verifier/helper_access_var_len.c converted to inline assembly
Test verifier/helper_access_var_len.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-21-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
9553de70a8 selftests/bpf: verifier/div_overflow.c converted to inline assembly
Test verifier/div_overflow.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-20-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
01a0925531 selftests/bpf: verifier/div0.c converted to inline assembly
Test verifier/div0.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-19-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:04 -07:00
Eduard Zingerman
84988478fb selftests/bpf: verifier/direct_stack_access_wraparound.c converted to inline assembly
Test verifier/direct_stack_access_wraparound.c automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-18-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25 17:02:03 -07:00