PoP stands for Packet on Packet that can improve performance in noisy
environment, but it could get RX stuck suddenly. In normal mode, firmware
can help to resolve the stuck, but firmware doesn't work in monitor mode.
Therefore, turn off PoP to avoid RX stuck.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221125072416.94752-4-pkshih@realtek.com
This patch fixes slab-out-of-bounds reads in brcmfmac that occur in
brcmf_construct_chaninfo() and brcmf_enable_bw40_2g() when the count
value of channel specifications provided by the device is greater than
the length of 'list->element[]', decided by the size of the 'list'
allocated with kzalloc(). The patch adds checks that make the functions
free the buffer and return -EINVAL if that is the case. Note that the
negative return is handled by the caller, brcmf_setup_wiphybands() or
brcmf_cfg80211_attach().
Found by a modified version of syzkaller.
Crash Report from brcmf_construct_chaninfo():
==================================================================
BUG: KASAN: slab-out-of-bounds in brcmf_setup_wiphybands+0x1238/0x1430
Read of size 4 at addr ffff888115f24600 by task kworker/0:2/1896
CPU: 0 PID: 1896 Comm: kworker/0:2 Tainted: G W O 5.14.0+ #132
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Workqueue: usb_hub_wq hub_event
Call Trace:
dump_stack_lvl+0x57/0x7d
print_address_description.constprop.0.cold+0x93/0x334
kasan_report.cold+0x83/0xdf
brcmf_setup_wiphybands+0x1238/0x1430
brcmf_cfg80211_attach+0x2118/0x3fd0
brcmf_attach+0x389/0xd40
brcmf_usb_probe+0x12de/0x1690
usb_probe_interface+0x25f/0x710
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_set_configuration+0x984/0x1770
usb_generic_driver_probe+0x69/0x90
usb_probe_device+0x9c/0x220
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_new_device.cold+0x463/0xf66
hub_event+0x10d5/0x3330
process_one_work+0x873/0x13e0
worker_thread+0x8b/0xd10
kthread+0x379/0x450
ret_from_fork+0x1f/0x30
Allocated by task 1896:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7c/0x90
kmem_cache_alloc_trace+0x19e/0x330
brcmf_setup_wiphybands+0x290/0x1430
brcmf_cfg80211_attach+0x2118/0x3fd0
brcmf_attach+0x389/0xd40
brcmf_usb_probe+0x12de/0x1690
usb_probe_interface+0x25f/0x710
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_set_configuration+0x984/0x1770
usb_generic_driver_probe+0x69/0x90
usb_probe_device+0x9c/0x220
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_new_device.cold+0x463/0xf66
hub_event+0x10d5/0x3330
process_one_work+0x873/0x13e0
worker_thread+0x8b/0xd10
kthread+0x379/0x450
ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff888115f24000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1536 bytes inside of
2048-byte region [ffff888115f24000, ffff888115f24800)
Memory state around the buggy address:
ffff888115f24500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff888115f24580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff888115f24600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff888115f24680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888115f24700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Crash Report from brcmf_enable_bw40_2g():
==================================================================
BUG: KASAN: slab-out-of-bounds in brcmf_cfg80211_attach+0x3d11/0x3fd0
Read of size 4 at addr ffff888103787600 by task kworker/0:2/1896
CPU: 0 PID: 1896 Comm: kworker/0:2 Tainted: G W O 5.14.0+ #132
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Workqueue: usb_hub_wq hub_event
Call Trace:
dump_stack_lvl+0x57/0x7d
print_address_description.constprop.0.cold+0x93/0x334
kasan_report.cold+0x83/0xdf
brcmf_cfg80211_attach+0x3d11/0x3fd0
brcmf_attach+0x389/0xd40
brcmf_usb_probe+0x12de/0x1690
usb_probe_interface+0x25f/0x710
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_set_configuration+0x984/0x1770
usb_generic_driver_probe+0x69/0x90
usb_probe_device+0x9c/0x220
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_new_device.cold+0x463/0xf66
hub_event+0x10d5/0x3330
process_one_work+0x873/0x13e0
worker_thread+0x8b/0xd10
kthread+0x379/0x450
ret_from_fork+0x1f/0x30
Allocated by task 1896:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7c/0x90
kmem_cache_alloc_trace+0x19e/0x330
brcmf_cfg80211_attach+0x3302/0x3fd0
brcmf_attach+0x389/0xd40
brcmf_usb_probe+0x12de/0x1690
usb_probe_interface+0x25f/0x710
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_set_configuration+0x984/0x1770
usb_generic_driver_probe+0x69/0x90
usb_probe_device+0x9c/0x220
really_probe+0x1be/0xa90
__driver_probe_device+0x2ab/0x460
driver_probe_device+0x49/0x120
__device_attach_driver+0x18a/0x250
bus_for_each_drv+0x123/0x1a0
__device_attach+0x207/0x330
bus_probe_device+0x1a2/0x260
device_add+0xa61/0x1ce0
usb_new_device.cold+0x463/0xf66
hub_event+0x10d5/0x3330
process_one_work+0x873/0x13e0
worker_thread+0x8b/0xd10
kthread+0x379/0x450
ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff888103787000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1536 bytes inside of
2048-byte region [ffff888103787000, ffff888103787800)
Memory state around the buggy address:
ffff888103787500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff888103787580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff888103787600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff888103787680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888103787700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Reported-by: Dokyung Song <dokyungs@yonsei.ac.kr>
Reported-by: Jisoo Jang <jisoo.jang@yonsei.ac.kr>
Reported-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr>
Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221116142952.518241-1-linuxlovemin@yonsei.ac.kr
Kuniyuki Iwashima says:
====================
af_unix: Fix a NULL deref in sk_diag_dump_uid().
The first patch fixes a NULL deref when we dump a AF_UNIX socket's UID,
and the second patch adds a repro/test for such a case.
====================
Link: https://lore.kernel.org/r/20221127012412.37969-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The test prog dumps a single AF_UNIX socket's UID with and without
unshare(CLONE_NEWUSER) and checks if it matches the result of getuid().
Without the preceding patch, the test prog is killed by a NULL deref
in sk_diag_dump_uid().
# ./diag_uid
TAP version 13
1..2
# Starting 2 tests from 3 test cases.
# RUN diag_uid.uid.1 ...
BUG: kernel NULL pointer dereference, address: 0000000000000270
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 105212067 P4D 105212067 PUD 1051fe067 PMD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.amzn2022.0.1 04/01/2014
RIP: 0010:sk_diag_fill (./include/net/sock.h:920 net/unix/diag.c:119 net/unix/diag.c:170)
...
# 1: Test terminated unexpectedly by signal 9
# FAIL diag_uid.uid.1
not ok 1 diag_uid.uid.1
# RUN diag_uid.uid_unshare.1 ...
# 1: Test terminated by timeout
# FAIL diag_uid.uid_unshare.1
not ok 2 diag_uid.uid_unshare.1
# FAILED: 0 / 2 tests passed.
# Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
With the patch, the test succeeds.
# ./diag_uid
TAP version 13
1..2
# Starting 2 tests from 3 test cases.
# RUN diag_uid.uid.1 ...
# OK diag_uid.uid.1
ok 1 diag_uid.uid.1
# RUN diag_uid.uid_unshare.1 ...
# OK diag_uid.uid_unshare.1
ok 2 diag_uid.uid_unshare.1
# PASSED: 2 / 2 tests passed.
# Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Saeed Mahameed says:
====================
mlx5-updates-2022-11-29
Misc update for mlx5 driver
1) Various trivial cleanups
2) Maor Dickman, Adds support for trap offload with additional actions
3) From Tariq, UMR (device memory registrations) cleanups,
UMR WQE must be aligned to 64B per device spec, (not a bug fix).
* tag 'mlx5-updates-2022-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: Support devlink reload of IPsec core
net/mlx5e: TC, Add offload support for trap with additional actions
net/mlx5e: Do early return when setup vports dests for slow path flow
net/mlx5: Remove redundant check
net/mlx5e: Delete always true DMA check
net/mlx5e: Don't access directly DMA device pointer
net/mlx5e: Don't use termination table when redundant
net/mlx5: Fix orthography errors in documentation
net/mlx5: Use generic definition for UMR KLM alignment
net/mlx5: Generalize name of UMR alignment definition
net/mlx5: Remove unused UMR MTT definitions
net/mlx5e: Add padding when needed in UMR WQEs
net/mlx5: Remove unused ctx variables
net/mlx5e: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper
net/mlx5e: Remove unneeded io-mapping.h #include
====================
Link: https://lore.kernel.org/r/20221130051152.479480-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the external phy used by current mac interface is
managed by another mac interface, it means that this
network port cannot work independently, especially
when the system suspends and resumes, the following
trace may appear, so we should create a device link
between phy dev and mac dev.
WARNING: CPU: 0 PID: 24 at drivers/net/phy/phy.c:983 phy_error+0x20/0x68
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:2 Not tainted 6.1.0-rc3-00011-g5aaef24b5c6d-dirty #34
Hardware name: Freescale i.MX6 SoloX (Device Tree)
Workqueue: events_power_efficient phy_state_machine
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x68/0x90
dump_stack_lvl from __warn+0xb4/0x24c
__warn from warn_slowpath_fmt+0x5c/0xd8
warn_slowpath_fmt from phy_error+0x20/0x68
phy_error from phy_state_machine+0x22c/0x23c
phy_state_machine from process_one_work+0x288/0x744
process_one_work from worker_thread+0x3c/0x500
worker_thread from kthread+0xf0/0x114
kthread from ret_from_fork+0x14/0x28
Exception stack(0xf0951fb0 to 0xf0951ff8)
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221130021216.1052230-1-xiaolei.wang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Check for interval validity in all concatenation fields in
nft_set_pipapo, from Stefano Brivio.
2) Missing preemption disabled in conntrack and flowtable stat
updates, from Xin Long.
3) Fix compilation warning when CONFIG_NF_CONNTRACK_MARK=n.
Except for 3) which was a bug introduced in a recent fix in 6.1-rc
- anything else, broken for several releases.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark
netfilter: conntrack: fix using __this_cpu_add in preemptible
netfilter: flowtable_offload: fix using __this_cpu_add in preemptible
netfilter: nft_set_pipapo: Actually validate intervals in fields after the first one
====================
Link: https://lore.kernel.org/r/20221130121934.1125-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vincent Mailhol says:
====================
net: devlink: return the driver name in devlink_nl_info_fill
The driver name is available in device_driver::name. Right now,
drivers still have to report this piece of information themselves in
their devlink_ops::info_get callback function.
The goal of this series is to have the devlink core to report this
information instead of the drivers.
The first patch fulfills the actual goal of this series: modify
devlink core to report the driver name and clean-up all drivers. Both
have to be done in an atomic change to avoid attribute
duplication. This same patch also removes the
devlink_info_driver_name_put() function to prevent future drivers from
reporting the driver name themselves.
The second patch allows the core to call devlink_nl_info_fill() even
if the devlink_ops::info_get() callback is NULL. This leads to the
third and final patch which cleans up the drivers which have an empty
info_get().
====================
Link: https://lore.kernel.org/r/20221129095140.3913303-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
devlink_ops::info_get() is now optional and devlink will continue to
report information even if that callback gets removed.
Remove all the empty devlink_ops::info_get() callbacks from the
drivers.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some drivers only reported the driver name in their
devlink_ops::info_get() callback. Now that the core provides this
information, the callback became empty. For such drivers, just
removing the callback would prevent the core from executing
devlink_nl_info_fill() meaning that "devlink dev info" would not
return anything.
Make the callback function optional by executing
devlink_nl_info_fill() even if devlink_ops::info_get() is NULL.
N.B.: the drivers with devlink support which previously did not
implement devlink_ops::info_get() will now also be able to report
the driver name.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver name is available in device_driver::name. Right now,
drivers still have to report this piece of information themselves in
their devlink_ops::info_get callback function.
In order to factorize code, make devlink_nl_info_fill() add the driver
name attribute.
Now that the core sets the driver name attribute, drivers are not
supposed to call devlink_info_driver_name_put() anymore. Remove
devlink_info_driver_name_put() and clean-up all the drivers using this
function in their callback.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Tested-by: Ido Schimmel <idosch@nvidia.com> # mlxsw
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The am65-cpsw driver supports configuring all RGMII variants at interface
speed of 10 Mbps. However, in the process of shifting to the PHYLINK
framework, the support for all variants of RGMII except the
PHY_INTERFACE_MODE_RGMII variant was accidentally removed.
Fix this by using phy_interface_mode_is_rgmii() to check for all variants
of RGMII mode.
Fixes: e8609e6947 ("net: ethernet: ti: am65-cpsw: Convert to PHYLINK")
Reported-by: Schuyler Patton <spatton@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://lore.kernel.org/r/20221129050639.111142-1-s-vadapalli@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jacob Keller says:
====================
support direct read from region
A long time ago when initially implementing devlink regions in ice I
proposed the ability to allow reading from a region without taking a
snapshot [1]. I eventually dropped this work from the original series due to
size. Then I eventually lost track of submitting this follow up.
This can be useful when interacting with some region that has some
definitive "contents" from which snapshots are made. For example the ice
driver has regions representing the contents of the device flash.
If userspace wants to read the contents today, it must first take a snapshot
and then read from that snapshot. This makes sense if you want to read a
large portion of data or you want to be sure reads are consistently from the
same recording of the flash.
However if user space only wants to read a small chunk, it must first
generate a snapshot of the entire contents, perform a read from the
snapshot, and then delete the snapshot after reading.
For such a use case, a direct read from the region makes more sense. This
can be achieved by allowing the devlink region read command to work without
a snapshot. Instead the portion to be read can be forwarded directly to the
driver via a new .read callback.
This avoids the need to read the entire region contents into memory first
and avoids the software overhead of creating a snapshot and then deleting
it.
This series implements such behavior and hooks up the ice NVM and shadow RAM
regions to allow it.
[1] https://lore.kernel.org/netdev/20200130225913.1671982-1-jacob.e.keller@intel.com/
====================
Link: https://lore.kernel.org/r/20221128203647.1198669-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Implement the .read handler for the NVM and Shadow RAM regions. This
enables user space to read a small chunk of the flash without needing the
overhead of creating a full snapshot.
Update the documentation for ice to detail which regions have direct read
support.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
78ad87da99 ("ice: devlink: add shadow-ram region to snapshot Shadow RAM")
added support for the 'shadow-ram' devlink region, but did not document it
in the ice devlink documentation. Fix this.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The ice driver supports a region for both the flat NVM contents as well as
the Shadow RAM which is a layer built on top of the flash during device
initialization.
These regions use an almost identical read function, except that the NVM
needs to set the direct flag when reading, while Shadow RAM needs to read
without the direct flag set. They each call ice_read_flat_nvm with the only
difference being whether to set the direct flash flag.
The NVM region read function also was fixed to read the NVM in blocks to
avoid a situation where the firmware reclaims the lock due to taking too
long.
Note that the region snapshot function takes the ops pointer so the
function can easily determine which region to read. Make use of this and
re-use the NVM snapshot function for both the NVM and Shadow RAM regions.
This makes Shadow RAM benefit from the same block approach as the NVM
region. It also reduces code in the ice driver.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To read from a region, user space must currently request a new snapshot of
the region and then read from that snapshot. This can sometimes be overkill
if user space only reads a tiny portion. They first create the snapshot,
then request a read, then destroy the snapshot.
For regions which have a single underlying "contents", it makes sense to
allow supporting direct reading of the region data.
Extend the DEVLINK_CMD_REGION_READ to allow direct reading from a region if
requested via the new DEVLINK_ATTR_REGION_DIRECT. If this attribute is set,
then perform a direct read instead of using a snapshot. Direct read is
mutually exclusive with DEVLINK_ATTR_REGION_SNAPSHOT_ID, and care is taken
to ensure that we reject commands which provide incorrect attributes.
Regions must enable support for direct read by implementing the .read()
callback function. If a region does not support such direct reads, a
suitable extended error message is reported.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The devlink_nl_region_read_snapshot_fill is used to copy the contents of
a snapshot into a message for reporting to userspace via the
DEVLINK_CMG_REGION_READ netlink message.
A future change is going to add support for directly reading from
a region. Almost all of the logic for this new capability is identical.
To help reduce code duplication and make this logic more generic,
refactor the function to take a cb and cb_priv pointer for doing the
actual copy.
Add a devlink_region_snapshot_fill implementation that will simply copy
the relevant chunk of the region. This does require allocating some
storage for the chunk as opposed to simply passing the correct address
forward to the devlink_nl_cmg_region_read_chunk_fill function.
A future change to implement support for directly reading from a region
without a snapshot will provide a separate implementation that calls the
newly added devlink region operation.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The devlink parameter of the devlink_nl_cmd_region_read_chunk_fill
function is not used. Remove it, to simplify the function signature.
Once removed, it is also obvious that the devlink parameter is not
necessary for the devlink_nl_region_read_snapshot_fill either.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The snapshot pointer is obtained inside of the function
devlink_nl_region_read_snapshot_fill. Simplify this function by locating
the snapshot upfront in devlink_nl_cmd_region_read_dumpit instead. This
aligns with how other netlink attributes are handled, and allows us to
exit slightly earlier if an invalid snapshot ID is provided.
It also allows us to pass the snapshot pointer directly to the
devlink_nl_region_read_snapshot_fill, and remove the now unused attrs
parameter.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Report extended error details in the devlink_nl_cmd_region_read_dumpit()
function, by using the extack structure from the netlink_callback.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The calculation for the data_size in the devlink_nl_read_snapshot_fill
function uses an if statement that is better expressed using the min_t
macro.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently whenever a new rule id is generated, it picks up the next
number bigger than previous id. So it would always be 1, 2, 3, etc.
When the rule with id 1 will be deleted and a new rule will be added,
it will have the id 4 and not id 1.
In theory this can be a problem if at some point a rule will be added
and removed ~0 times. Then no more rules can be added because there
are no more ids.
Change this such that when a new rule is added, search for an empty
rule id starting with value of 1 as value 0 is reserved.
Fixes: c9da1ac1c2 ("net: microchip: sparx5: Adding initial tc flower support for VCAP API")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20221128142959.8325-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This allocation is really spurious.
The size of the bitmap is 'tot_ids' and it is used as such in the driver.
So we could expect something like:
table->id_bmap = devm_kcalloc(rvu->dev, BITS_TO_LONGS(table->tot_ids),
sizeof(long), GFP_KERNEL);
However, when the bitmap is allocated, we allocate:
BITS_TO_LONGS(table->tot_ids) * table->tot_ids ~=
table->tot_ids / 32 * table->tot_ids ~=
table->tot_ids^2 / 32
It is proportional to the square of 'table->tot_ids' which seems to
potentially be big.
Allocate the expected amount of memory, and switch to the bitmap API to
have it more straightforward.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/ce2710771939065d68f95d86a27cf7cea7966365.1669378798.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
commit 8d820bc9d1 ("net: broadcom: Fix BCMGENET Kconfig") fixes the build
that contain 99addbe31f ("net: broadcom: Select BROADCOM_PHY for BCMGENET")
and enable BCMGENET=y but PTP_1588_CLOCK_OPTIONAL=m, which otherwise
leads to a link failure. However this may trigger a runtime failure.
Fix the original issue by propagating the PTP_1588_CLOCK_OPTIONAL dependency
of BROADCOM_PHY down to BCMGENET.
Fixes: 8d820bc9d1 ("net: broadcom: Fix BCMGENET Kconfig")
Fixes: 99addbe31f ("net: broadcom: Select BROADCOM_PHY for BCMGENET")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20221125115003.30308-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building the kernel with clang lto (CONFIG_LTO_CLANG_FULL=y), the
following compilation error will appear:
$ make LLVM=1 LLVM_IAS=1 -j
...
ld.lld: error: ld-temp.o <inline asm>:26889:1: symbol 'cgroup_storage_map_btf_ids' is already defined
cgroup_storage_map_btf_ids:;
^
make[1]: *** [/.../bpf-next/scripts/Makefile.vmlinux_o:61: vmlinux.o] Error 1
In local_storage.c, we have
BTF_ID_LIST_SINGLE(cgroup_storage_map_btf_ids, struct, bpf_local_storage_map)
Commit c4bcfb38a9 ("bpf: Implement cgroup storage available to
non-cgroup-attached bpf progs") added the above identical BTF_ID_LIST_SINGLE
definition in bpf_cgrp_storage.c. With duplicated definitions, llvm linker
complains with lto build.
Also, extracting btf_id of 'struct bpf_local_storage_map' is defined four times
for sk, inode, task and cgrp local storages. Let us define a single global one
with a different name than cgroup_storage_map_btf_ids, which also fixed
the lto compilation error.
Fixes: c4bcfb38a9 ("bpf: Implement cgroup storage available to non-cgroup-attached bpf progs")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221130052147.1591625-1-yhs@fb.com
When redirecting, we use sk_msg_to_ingress() to get the BPF_F_INGRESS
flag from the msg->flags. If apply_bytes is used and it is larger than
the current data being processed, sk_psock_msg_verdict() will not be
called when sendmsg() is called again. At this time, the msg->flags is 0,
and we lost the BPF_F_INGRESS flag.
So we need to save the BPF_F_INGRESS flag in sk_psock and use it when
redirection.
Fixes: 8934ce2fd0 ("bpf: sockmap redirect ingress support")
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/1669718441-2654-3-git-send-email-yangpc@wangsu.com
In tcp_bpf_send_verdict() redirection, the eval variable is assigned to
__SK_REDIRECT after the apply_bytes data is sent, if msg has more_data,
sock_put() will be called multiple times.
We should reset the eval variable to __SK_NONE every time more_data
starts.
This causes:
IPv4: Attempt to release TCP socket in state 1 00000000b4c925d7
------------[ cut here ]------------
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 5 PID: 4482 at lib/refcount.c:25 refcount_warn_saturate+0x7d/0x110
Modules linked in:
CPU: 5 PID: 4482 Comm: sockhash_bypass Kdump: loaded Not tainted 6.0.0 #1
Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
Call Trace:
<TASK>
__tcp_transmit_skb+0xa1b/0xb90
? __alloc_skb+0x8c/0x1a0
? __kmalloc_node_track_caller+0x184/0x320
tcp_write_xmit+0x22a/0x1110
__tcp_push_pending_frames+0x32/0xf0
do_tcp_sendpages+0x62d/0x640
tcp_bpf_push+0xae/0x2c0
tcp_bpf_sendmsg_redir+0x260/0x410
? preempt_count_add+0x70/0xa0
tcp_bpf_send_verdict+0x386/0x4b0
tcp_bpf_sendmsg+0x21b/0x3b0
sock_sendmsg+0x58/0x70
__sys_sendto+0xfa/0x170
? xfd_validate_state+0x1d/0x80
? switch_fpu_return+0x59/0xe0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: cd9733f5d7 ("tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function")
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/1669718441-2654-2-git-send-email-yangpc@wangsu.com
Pull clk fixes from Stephen Boyd:
"A set of clk driver fixes that resolve issues for various SoCs.
Most of these are incorrect clk data, like bad parent descriptions.
When the clk tree is improperly described things don't work, like USB
and UFS controllers, because clk frequencies are wonky. Here are the
extra details:
- Fix the parent of UFS reference clks on Qualcomm SC8280XP so that
UFS works properly
- Fix the clk ID for USB on AT91 RM9200 so the USB driver continues
to probe
- Stop using of_device_get_match_data() on the wrong device for a
Samsung Exynos driver so it gets the proper clk data
- Fix ExynosAutov9 binding
- Fix the parent of the div4 clk on Exynos7885
- Stop calling runtime PM APIs from the Qualcomm GDSC driver directly
as it leads to a lockdep splat and is just plain wrong because it
violates runtime PM semantics by calling runtime PM APIs when the
device has been runtime PM disabled"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: gcc-sc8280xp: add cxo as parent for three ufs ref clks
ARM: at91: rm9200: fix usb device clock id
clk: samsung: Revert "clk: samsung: exynos-clkout: Use of_device_get_match_data()"
dt-bindings: clock: exynosautov9: fix reference to CMU_FSYS1
clk: qcom: gdsc: Remove direct runtime PM calls
clk: samsung: exynos7885: Correct "div4" clock parents
The networking programs typically don't require CAP_PERFMON, but through kfuncs
like bpf_cast_to_kern_ctx() they can access memory through PTR_TO_BTF_ID. In
such case enforce CAP_PERFMON.
Also make sure that only GPL programs can access kernel data structures.
All kfuncs require GPL already.
Also remove allow_ptr_to_map_access. It's the same as allow_ptr_leaks and
different name for the same check only causes confusion.
Fixes: fd264ca020 ("bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx")
Fixes: 50c6b8a9ae ("selftests/bpf: Add a test for btf_type_tag "percpu"")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221125220617.26846-1-alexei.starovoitov@gmail.com
BPF CI fails for arm64 and s390x each with the following result:
[...]
All error logs:
serial_test_kprobe_multi_bench_attach:PASS:get_syms 0 nsec
serial_test_kprobe_multi_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec
libbpf: prog 'test_kprobe_empty': failed to attach: Operation not supported
serial_test_kprobe_multi_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts unexpected error: -95
#92 kprobe_multi_bench_attach:FAIL
[...]
Add the test to the deny list.
Fixes: 5b6c7e5c44 ("selftests/bpf: Add attach bench test")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
It causes build failures with unusual CC/HOSTCC combinations.
Quoting
https://lkml.kernel.org/r/A222B1E6-69B8-4085-AD1B-27BDB72CA971@goldelico.com:
HOSTCC scripts/mod/modpost.o - due to target missing
In file included from include/linux/string.h:5,
from scripts/mod/../../include/linux/license.h:5,
from scripts/mod/modpost.c:24:
include/linux/compiler.h:246:10: fatal error: asm/rwonce.h: No such file or directory
246 | #include <asm/rwonce.h>
| ^~~~~~~~~~~~~~
compilation terminated.
...
The problem is that HOSTCC is not necessarily the same compiler or even
architecture as CC and pulling in <linux/compiler.h> or <asm/rwonce.h>
files indirectly isn't a good idea then.
My toolchain is providing HOSTCC = gcc (MacPorts) and CC = arm-linux-gnueabihf
(built from gcc source) and all running on Darwin.
If I change the include to <string.h> I can then "HOSTCC scripts/mod/modpost.c"
but then it fails for "CC kernel/module/main.c" not finding <string.h>:
CC kernel/module/main.o - due to target missing
In file included from kernel/module/main.c:43:0:
./include/linux/license.h:5:20: fatal error: string.h: No such file or directory
#include <string.h>
^
compilation terminated.
Reported-by: "H. Nikolaus Schaller" <hns@goldelico.com>
Cc: Sam James <sam@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>