[ Upstream commit 9403001ad65ae4f4c5de368bdda3a0636b51d51a ]
Patch series "nilfs2: fix potential issues with empty b-tree nodes".
This series addresses three potential issues with empty b-tree nodes that
can occur with corrupted filesystem images, including one recently
discovered by syzbot.
This patch (of 3):
If a b-tree is broken on the device, and the b-tree height is greater than
2 (the level of the root node is greater than 1) even if the number of
child nodes of the b-tree root is 0, a NULL pointer dereference occurs in
nilfs_btree_prepare_insert(), which is called from nilfs_btree_insert().
This is because, when the number of child nodes of the b-tree root is 0,
nilfs_btree_do_lookup() does not set the block buffer head in any of
path[x].bp_bh, leaving it as the initial value of NULL, but if the level
of the b-tree root node is greater than 1, nilfs_btree_get_nonroot_node(),
which accesses the buffer memory of path[x].bp_bh, is called.
Fix this issue by adding a check to nilfs_btree_root_broken(), which
performs sanity checks when reading the root node from the device, to
detect this inconsistency.
Thanks to Lizhi Xu for trying to solve the bug and clarifying the cause
early on.
Link: https://lkml.kernel.org/r/20240904081401.16682-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240902084101.138971-1-lizhi.xu@windriver.com
Link: https://lkml.kernel.org/r/20240904081401.16682-2-konishi.ryusuke@gmail.com
Fixes: 17c76b0104 ("nilfs2: B-tree based block mapping")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+9bff4c7b992038a7409f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9bff4c7b992038a7409f
Cc: Lizhi Xu <lizhi.xu@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f22cde4371f3c624e947a35b075c06c771442a43 ]
Problem statement:
Since commit fc137c0dda ("sched/numa: enhance vma scanning logic"), the
Numa vma scan overhead has been reduced a lot. Meanwhile, the reducing of
the vma scan might create less Numa page fault information. The
insufficient information makes it harder for the Numa balancer to make
decision. Later, commit b7a5b537c55c08 ("sched/numa: Complete scanning of
partial VMAs regardless of PID activity") and commit 84db47ca7146d7
("sched/numa: Fix mm numa_scan_seq based unconditional scan") are found to
bring back part of the performance.
Recently when running SPECcpu omnetpp_r on a 320 CPUs/2 Sockets system, a
long duration of remote Numa node read was observed by PMU events: A few
cores having ~500MB/s remote memory access for ~20 seconds. It causes
high core-to-core variance and performance penalty. After the
investigation, it is found that many vmas are skipped due to the active
PID check. According to the trace events, in most cases,
vma_is_accessed() returns false because the history access info stored in
pids_active array has been cleared.
Proposal:
The main idea is to adjust vma_is_accessed() to let it return true easier.
Thus compare the diff between mm->numa_scan_seq and
vma->numab_state->prev_scan_seq. If the diff has exceeded the threshold,
scan the vma.
This patch especially helps the cases where there are small number of
threads, like the process-based SPECcpu. Without this patch, if the
SPECcpu process access the vma at the beginning, then sleeps for a long
time, the pid_active array will be cleared. A a result, if this process
is woken up again, it never has a chance to set prot_none anymore.
Because only the first 2 times of access is granted for vma scan:
(current->mm->numa_scan_seq) - vma->numab_state->start_scan_seq) < 2 to be
worse, no other threads within the task can help set the prot_none. This
causes information lost.
Raghavendra helped test current patch and got the positive result
on the AMD platform:
autonumabench NUMA01
base patched
Amean syst-NUMA01 194.05 ( 0.00%) 165.11 * 14.92%*
Amean elsp-NUMA01 324.86 ( 0.00%) 315.58 * 2.86%*
Duration User 380345.36 368252.04
Duration System 1358.89 1156.23
Duration Elapsed 2277.45 2213.25
autonumabench NUMA02
Amean syst-NUMA02 1.12 ( 0.00%) 1.09 * 2.93%*
Amean elsp-NUMA02 3.50 ( 0.00%) 3.56 * -1.84%*
Duration User 1513.23 1575.48
Duration System 8.33 8.13
Duration Elapsed 28.59 29.71
kernbench
Amean user-256 22935.42 ( 0.00%) 22535.19 * 1.75%*
Amean syst-256 7284.16 ( 0.00%) 7608.72 * -4.46%*
Amean elsp-256 159.01 ( 0.00%) 158.17 * 0.53%*
Duration User 68816.41 67615.74
Duration System 21873.94 22848.08
Duration Elapsed 506.66 504.55
Intel 256 CPUs/2 Sockets:
autonuma benchmark also shows improvements:
v6.10-rc5 v6.10-rc5
+patch
Amean syst-NUMA01 245.85 ( 0.00%) 230.84 * 6.11%*
Amean syst-NUMA01_THREADLOCAL 205.27 ( 0.00%) 191.86 * 6.53%*
Amean syst-NUMA02 18.57 ( 0.00%) 18.09 * 2.58%*
Amean syst-NUMA02_SMT 2.63 ( 0.00%) 2.54 * 3.47%*
Amean elsp-NUMA01 517.17 ( 0.00%) 526.34 * -1.77%*
Amean elsp-NUMA01_THREADLOCAL 99.92 ( 0.00%) 100.59 * -0.67%*
Amean elsp-NUMA02 15.81 ( 0.00%) 15.72 * 0.59%*
Amean elsp-NUMA02_SMT 13.23 ( 0.00%) 12.89 * 2.53%*
v6.10-rc5 v6.10-rc5
+patch
Duration User 1064010.16 1075416.23
Duration System 3307.64 3104.66
Duration Elapsed 4537.54 4604.73
The SPECcpu remote node access issue disappears with the patch applied.
Link: https://lkml.kernel.org/r/20240827112958.181388-1-yu.c.chen@intel.com
Fixes: fc137c0dda ("sched/numa: enhance vma scanning logic")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reported-by: Xiaoping Zhou <xiaoping.zhou@intel.com>
Reviewed-and-tested-by: Raghavendra K T <raghavendra.kt@amd.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: "Chen, Tim C" <tim.c.chen@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f169c62ff7cd1acf8bac8ae17bfeafa307d9e6fa ]
VMAs are skipped if there is no recent fault activity but this represents
a chicken-and-egg problem as there may be no fault activity if the PTEs
are never updated to trap NUMA hints. There is an indirect reliance on
scanning to be forced early in the lifetime of a task but this may fail
to detect changes in phase behaviour. Force inactive VMAs to be scanned
when all other eligible VMAs have been updated within the same scan
sequence.
Test results in general look good with some changes in performance, both
negative and positive, depending on whether the additional scanning and
faulting was beneficial or not to the workload. The autonuma benchmark
workload NUMA01_THREADLOCAL was picked for closer examination. The workload
creates two processes with numerous threads and thread-local storage that
is zero-filled in a loop. It exercises the corner case where unrelated
threads may skip VMAs that are thread-local to another thread and still
has some VMAs that inactive while the workload executes.
The VMA skipping activity frequency with and without the patch:
6.6.0-rc2-sched-numabtrace-v1
=============================
649 reason=scan_delay
9,094 reason=unsuitable
48,915 reason=shared_ro
143,919 reason=inaccessible
193,050 reason=pid_inactive
6.6.0-rc2-sched-numabselective-v1
=============================
146 reason=seq_completed
622 reason=ignore_pid_inactive
624 reason=scan_delay
6,570 reason=unsuitable
16,101 reason=shared_ro
27,608 reason=inaccessible
41,939 reason=pid_inactive
Note that with the patch applied, the PID activity is ignored
(ignore_pid_inactive) to ensure a VMA with some activity is completely
scanned. In addition, a small number of VMAs are scanned when no other
eligible VMA is available during a single scan window (seq_completed).
The number of times a VMA is skipped due to no PID activity from the
scanning task (pid_inactive) drops dramatically. It is expected that
this will increase the number of PTEs updated for NUMA hinting faults
as well as hinting faults but these represent PTEs that would otherwise
have been missed. The tradeoff is scan+fault overhead versus improving
locality due to migration.
On a 2-socket Cascade Lake test machine, the time to complete the
workload is as follows;
6.6.0-rc2 6.6.0-rc2
sched-numabtrace-v1 sched-numabselective-v1
Min elsp-NUMA01_THREADLOCAL 174.22 ( 0.00%) 117.64 ( 32.48%)
Amean elsp-NUMA01_THREADLOCAL 175.68 ( 0.00%) 123.34 * 29.79%*
Stddev elsp-NUMA01_THREADLOCAL 1.20 ( 0.00%) 4.06 (-238.20%)
CoeffVar elsp-NUMA01_THREADLOCAL 0.68 ( 0.00%) 3.29 (-381.70%)
Max elsp-NUMA01_THREADLOCAL 177.18 ( 0.00%) 128.03 ( 27.74%)
The time to complete the workload is reduced by almost 30%:
6.6.0-rc2 6.6.0-rc2
sched-numabtrace-v1 sched-numabselective-v1 /
Duration User 91201.80 63506.64
Duration System 2015.53 1819.78
Duration Elapsed 1234.77 868.37
In this specific case, system CPU time was not increased but it's not
universally true.
From vmstat, the NUMA scanning and fault activity is as follows;
6.6.0-rc2 6.6.0-rc2
sched-numabtrace-v1 sched-numabselective-v1
Ops NUMA base-page range updates 64272.00 26374386.00
Ops NUMA PTE updates 36624.00 55538.00
Ops NUMA PMD updates 54.00 51404.00
Ops NUMA hint faults 15504.00 75786.00
Ops NUMA hint local faults % 14860.00 56763.00
Ops NUMA hint local percent 95.85 74.90
Ops NUMA pages migrated 1629.00 6469222.00
Both the number of PTE updates and hint faults is dramatically
increased. While this is superficially unfortunate, it represents
ranges that were simply skipped without the patch. As a result
of the scanning and hinting faults, many more pages were also
migrated but as the time to completion is reduced, the overhead
is offset by the gain.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Raghavendra K T <raghavendra.kt@amd.com>
Link: https://lore.kernel.org/r/20231010083143.19593-7-mgorman@techsingularity.net
Stable-dep-of: f22cde4371f3 ("sched/numa: Fix the vma scan starving issue")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7a5b537c55c088d891ae554103d1b281abef781 ]
NUMA Balancing skips VMAs when the current task has not trapped a NUMA
fault within the VMA. If the VMA is skipped then mm->numa_scan_offset
advances and a task that is trapping faults within the VMA may never
fully update PTEs within the VMA.
Force tasks to update PTEs for partially scanned PTEs. The VMA will
be tagged for NUMA hints by some task but this removes some of the
benefit of tracking PID activity within a VMA. A follow-on patch
will mitigate this problem.
The test cases and machines evaluated did not trigger the corner case so
the performance results are neutral with only small changes within the
noise from normal test-to-test variance. However, the next patch makes
the corner case easier to trigger.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Raghavendra K T <raghavendra.kt@amd.com>
Link: https://lore.kernel.org/r/20231010083143.19593-6-mgorman@techsingularity.net
Stable-dep-of: f22cde4371f3 ("sched/numa: Fix the vma scan starving issue")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e2675db1906ac04809f5399bf1f5e30d56a6f3e ]
Recent NUMA hinting faulting activity is reset approximately every
VMA_PID_RESET_PERIOD milliseconds. However, if the current task has not
accessed a VMA then the reset check is missed and the reset is potentially
deferred forever. Check if the PID activity information should be reset
before checking if the current task recently trapped a NUMA hinting fault.
[ mgorman@techsingularity.net: Rewrite changelog ]
Suggested-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Raghavendra K T <raghavendra.kt@amd.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231010083143.19593-5-mgorman@techsingularity.net
Stable-dep-of: f22cde4371f3 ("sched/numa: Fix the vma scan starving issue")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed2da8b725b932b1e2b2f4835bb664d47ed03031 ]
NUMA balancing skips or scans VMAs for a variety of reasons. In preparation
for completing scans of VMAs regardless of PID access, trace the reasons
why a VMA was skipped. In a later patch, the tracing will be used to track
if a VMA was forcibly scanned.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231010083143.19593-4-mgorman@techsingularity.net
Stable-dep-of: f22cde4371f3 ("sched/numa: Fix the vma scan starving issue")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f3a6c97940fbd25d6c84c2d5642338fc99a9b35b ]
The access_pids[] field name is somewhat ambiguous as no PIDs are accessed.
Similarly, it's not clear that next_pid_reset is related to access_pids[].
Rename the fields to more accurately reflect their purpose.
[ mingo: Rename in the comments too. ]
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231010083143.19593-3-mgorman@techsingularity.net
Stable-dep-of: f22cde4371f3 ("sched/numa: Fix the vma scan starving issue")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4d231b91a944f3cab355fce65af5871fb5d7735b ]
In case of errors when reading an inode from disk or traversing inline
directory entries, return an error-encoded ERR_PTR instead of returning
NULL. ext4_find_inline_entry only caller, __ext4_find_entry already returns
such encoded errors.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Link: https://patch.msgid.link/20240821152324.3621860-3-cascardo@igalia.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: c6b72f5d82b1 ("ext4: avoid OOB when system.data xattr changes underneath the filesystem")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb0a12c3439b10d88412fd3102df5b9a6e3cd6dc ]
min_clusters is signed integer and will be converted to unsigned
integer when compared with unsigned number stats.free_clusters.
If min_clusters is negative, it will be converted to a huge unsigned
value in which case all groups may not meet the actual desired free
clusters.
Set negative min_clusters to 0 to avoid unexpected behavior.
Fixes: ac27a0ec11 ("[PATCH] ext4: initial copy of files from ext3")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://patch.msgid.link/20240820132234.2759926-4-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 227d31b9214d1b9513383cf6c7180628d4b3b61f ]
If a group is marked EXT4_GROUP_INFO_IBITMAP_CORRUPT after it's inode
bitmap buffer_head was successfully verified, then __ext4_new_inode()
will get a valid inode_bitmap_bh of a corrupted group from
ext4_read_inode_bitmap() in which case inode_bitmap_bh misses a release.
Hnadle "IS_ERR(inode_bitmap_bh)" and group corruption separately like
how ext4_free_inode() does to avoid buffer_head leak.
Fixes: 9008a58e5d ("ext4: make the bitmap read routines return real error codes")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://patch.msgid.link/20240820132234.2759926-3-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2749749afa071f8a0e405605de9da615e771a7ce ]
In the `smk_set_cipso` function, the `skp->smk_netlabel.attr.mls.cat`
field is directly assigned to a new value without using the appropriate
RCU pointer assignment functions. According to RCU usage rules, this is
illegal and can lead to unpredictable behavior, including data
inconsistencies and impossible-to-diagnose memory corruption issues.
This possible bug was identified using a static analysis tool developed
by myself, specifically designed to detect RCU-related issues.
To address this, the assignment is now done using rcu_assign_pointer(),
which ensures that the pointer assignment is done safely, with the
necessary memory barriers and synchronization. This change prevents
potential RCU dereference issues by ensuring that the `cat` field is
safely updated while still adhering to RCU's requirements.
Fixes: 0817534ff9 ("smackfs: Fix use-after-free in netlbl_catmap_walk()")
Signed-off-by: Jiawei Ye <jiawei.ye@foxmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 20cee68f5b44fdc2942d20f3172a262ec247b117 ]
Commit 3d56b8d2c7 ("ext4: Speed up FITRIM by recording flags in
ext4_group_info") speed up fstrim by skipping trim trimmed group. We
also has the chance to clear trimmed once there exists some block free
for this group(mount without discard), and the next trim for this group
will work well too.
For mount with discard, we will issue dicard when we free blocks, so
leave trimmed flag keep alive to skip useless trim trigger from
userspace seems reasonable. But for some case like ext4 build on
dm-thinpool(ext4 blocksize 4K, pool blocksize 128K), discard from ext4
maybe unaligned for dm thinpool, and thinpool will just finish this
discard(see process_discard_bio when begein equals to end) without
actually process discard. For this case, trim from userspace can really
help us to free some thinpool block.
So convert to clear trimmed flag for all case no matter mounted with
discard or not.
Fixes: 3d56b8d2c7 ("ext4: Speed up FITRIM by recording flags in ext4_group_info")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240817085510.2084444-1-yangerkun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e16c7b07784f3fb03025939c4590b9a7c64970a7 ]
When analyzing a kernel waring message, Peter pointed out that there is a
race condition when the kworker is being frozen and falls into
try_to_freeze() with TASK_INTERRUPTIBLE, which could trigger a
might_sleep() warning in try_to_freeze(). Although the root cause is not
related to freeze()[1], it is still worthy to fix this issue ahead.
One possible race scenario:
CPU 0 CPU 1
----- -----
// kthread_worker_fn
set_current_state(TASK_INTERRUPTIBLE);
suspend_freeze_processes()
freeze_processes
static_branch_inc(&freezer_active);
freeze_kernel_threads
pm_nosig_freezing = true;
if (work) { //false
__set_current_state(TASK_RUNNING);
} else if (!freezing(current)) //false, been frozen
freezing():
if (static_branch_unlikely(&freezer_active))
if (pm_nosig_freezing)
return true;
schedule()
}
// state is still TASK_INTERRUPTIBLE
try_to_freeze()
might_sleep() <--- warning
Fix this by explicitly set the TASK_RUNNING before entering
try_to_freeze().
Link: https://lore.kernel.org/lkml/Zs2ZoAcUsZMX2B%2FI@chenyu5-mobl2/ [1]
Link: https://lkml.kernel.org/r/20240827112308.181081-1-yu.c.chen@intel.com
Fixes: b56c0d8937 ("kthread: implement kthread_worker")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: David Gow <davidgow@google.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d2786d65aaa954ebd3fcc033ada433e10da21c4 ]
In case of malformed relocation record of kind BPF_CORE_TYPE_ID_LOCAL
referencing a non-existing BTF type, function bpf_core_calc_relo_insn
would cause a null pointer deference.
Fix this by adding a proper check upper in call stack, as malformed
relocation records could be passed from user space.
Simplest reproducer is a program:
r0 = 0
exit
With a single relocation record:
.insn_off = 0, /* patch first instruction */
.type_id = 100500, /* this type id does not exist */
.access_str_off = 6, /* offset of string "0" */
.kind = BPF_CORE_TYPE_ID_LOCAL,
See the link for original reproducer or next commit for a test case.
Fixes: 74753e1462 ("libbpf: Replace btf__type_by_id() with btf_type_by_id().")
Reported-by: Liu RuiTong <cnitlrt@gmail.com>
Closes: https://lore.kernel.org/bpf/CAK55_s6do7C+DVwbwY_7nKfUz0YLDoiA1v6X3Y9+p0sWzipFSA@mail.gmail.com/
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240822080124.2995724-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fdf1c728fac541891ef1aa773bfd42728626769c ]
Currently, compiling the bpf programs will result the compilation errors
with the cf-protection option as follows in arm64 and loongarch64 machine
when using gcc 12.3.1 and clang 17.0.6. This commit fixes the compilation
errors by limited the cf-protection option only used in x86 platform.
[root@localhost linux]# make M=samples/bpf
......
CLANG-bpf samples/bpf/xdp2skb_meta_kern.o
error: option 'cf-protection=return' cannot be specified on this target
error: option 'cf-protection=branch' cannot be specified on this target
2 errors generated.
CLANG-bpf samples/bpf/syscall_tp_kern.o
error: option 'cf-protection=return' cannot be specified on this target
error: option 'cf-protection=branch' cannot be specified on this target
2 errors generated.
......
Fixes: 34f6e38f58 ("samples/bpf: fix warning with ignored-attributes")
Reported-by: Jiangshan Yi <yijiangshan@kylinos.cn>
Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Qiang Wang <wangqiang1@kylinos.cn>
Link: https://lore.kernel.org/bpf/20240815135524.140675-1-13667453960@163.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21c5f4f55da759c7444a1ef13e90b6e6f674eeeb ]
Linux 5.1 implemented 64-bit time types and related syscalls to address the
Y2038 problem generally across archs. Userspace handling of Y2038 varies
with the libc however. While musl libc uses 64-bit time across all 32-bit
and 64-bit platforms, GNU glibc uses 64-bit time on 64-bit platforms but
defaults to 32-bit time on 32-bit platforms unless they "opt-in" to 64-bit
time or explicitly use 64-bit syscalls and time structures.
One specific area is the standard setsockopt() call, SO_TIMESTAMPNS option
used for timestamping, and the related output 'struct timespec'. GNU glibc
defaults as above, also exposing the SO_TIMESTAMPNS_NEW flag to explicitly
use a 64-bit call and 'struct __kernel_timespec'. Since these are not
exposed or needed with musl libc, their use in tc_redirect.c leads to
compile errors building for mips64el/musl:
tc_redirect.c: In function 'rcv_tstamp':
tc_redirect.c:425:32: error: 'SO_TIMESTAMPNS_NEW' undeclared (first use in this function); did you mean 'SO_TIMESTAMPNS'?
425 | cmsg->cmsg_type == SO_TIMESTAMPNS_NEW)
| ^~~~~~~~~~~~~~~~~~
| SO_TIMESTAMPNS
tc_redirect.c:425:32: note: each undeclared identifier is reported only once for each function it appears in
tc_redirect.c: In function 'test_inet_dtime':
tc_redirect.c:491:49: error: 'SO_TIMESTAMPNS_NEW' undeclared (first use in this function); did you mean 'SO_TIMESTAMPNS'?
491 | err = setsockopt(listen_fd, SOL_SOCKET, SO_TIMESTAMPNS_NEW,
| ^~~~~~~~~~~~~~~~~~
| SO_TIMESTAMPNS
However, using SO_TIMESTAMPNS_NEW isn't strictly needed, nor is Y2038 being
explicitly tested. The timestamp checks in tc_redirect.c are simple: the
packet receive timestamp is non-zero and processed/handled in less than 5
seconds.
Switch to using the standard setsockopt() call and SO_TIMESTAMPNS option to
ensure compatibility across glibc and musl libc. In the worst-case, there
is a 5-second window 14 years from now where tc_redirect tests may fail on
32-bit systems. However, we should reasonably expect glibc to adopt a
64-bit mandate rather than the current "opt-in" policy before the Y2038
roll-over.
Fixes: ce6f6cffaeaa ("selftests/bpf: Wait for the netstamp_needed_key static key to be turned on")
Fixes: c803475fd8 ("bpf: selftests: test skb->tstamp in redirect_neigh")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/031d656c058b4e55ceae56ef49c4e1729b5090f3.1722244708.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c9a83e76b5a96801a2c7ea0a79ca77c356d8b38d ]
Include GNU <execinfo.h> header only with glibc and provide weak, stubbed
backtrace functions as a fallback in test_progs.c. This allows for non-GNU
replacements while avoiding compile errors (e.g. with musl libc) like:
test_progs.c:13:10: fatal error: execinfo.h: No such file or directory
13 | #include <execinfo.h> /* backtrace */
| ^~~~~~~~~~~~
test_progs.c: In function 'crash_handler':
test_progs.c:1034:14: error: implicit declaration of function 'backtrace' [-Werror=implicit-function-declaration]
1034 | sz = backtrace(bt, ARRAY_SIZE(bt));
| ^~~~~~~~~
test_progs.c:1045:9: error: implicit declaration of function 'backtrace_symbols_fd' [-Werror=implicit-function-declaration]
1045 | backtrace_symbols_fd(bt, sz, STDERR_FILENO);
| ^~~~~~~~~~~~~~~~~~~~
Fixes: 9fb156bb82 ("selftests/bpf: Print backtrace on SIGSEGV in test_progs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/aa6dc8e23710cb457b278039d0081de7e7b4847d.1722244708.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 16b795cc59528cf280abc79af3c70bda42f715b9 ]
Compiling lwt_reroute.c with GCC 12.3 for mips64el/musl-libc yields errors:
In file included from .../include/arpa/inet.h:9,
from ./test_progs.h:18,
from tools/testing/selftests/bpf/prog_tests/lwt_helpers.h:11,
from tools/testing/selftests/bpf/prog_tests/lwt_reroute.c:52:
.../include/netinet/in.h:23:8: error: redefinition of 'struct in6_addr'
23 | struct in6_addr {
| ^~~~~~~~
In file included from .../include/linux/icmp.h:24,
from tools/testing/selftests/bpf/prog_tests/lwt_helpers.h:9:
.../include/linux/in6.h:33:8: note: originally defined here
33 | struct in6_addr {
| ^~~~~~~~
.../include/netinet/in.h:34:8: error: redefinition of 'struct sockaddr_in6'
34 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
.../include/linux/in6.h:50:8: note: originally defined here
50 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
.../include/netinet/in.h:42:8: error: redefinition of 'struct ipv6_mreq'
42 | struct ipv6_mreq {
| ^~~~~~~~~
.../include/linux/in6.h:60:8: note: originally defined here
60 | struct ipv6_mreq {
| ^~~~~~~~~
These errors occur because <linux/in6.h> is included before <netinet/in.h>,
bypassing the Linux uapi/libc compat mechanism's partial musl support. As
described in [1] and [2], fix these errors by including <netinet/in.h> in
lwt_reroute.c before any uapi headers.
[1]: commit c0bace7984 ("uapi libc compat: add fallback for unsupported libcs")
[2]: https://git.musl-libc.org/cgit/musl/commit/?id=04983f227238
Fixes: 6c77997bc6 ("selftests/bpf: Add lwt_xmit tests for BPF_REROUTE")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/bd2908aec0755ba8b75f5dc41848b00585f5c73e.1722244708.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e7f31873176a345d72ca77c7b4da48493ccd9efd ]
Recently, when running './test_progs -j', I occasionally hit the
following errors:
test_lwt_redirect:PASS:pthread_create 0 nsec
test_lwt_redirect_run:FAIL:netns_create unexpected error: 256 (errno 0)
#142/2 lwt_redirect/lwt_redirect_normal_nomac:FAIL
#142 lwt_redirect:FAIL
test_lwt_reroute:PASS:pthread_create 0 nsec
test_lwt_reroute_run:FAIL:netns_create unexpected error: 256 (errno 0)
test_lwt_reroute:PASS:pthread_join 0 nsec
#143/2 lwt_reroute/lwt_reroute_qdisc_dropped:FAIL
#143 lwt_reroute:FAIL
The netns_create() definition looks like below:
#define NETNS "ns_lwt"
static inline int netns_create(void)
{
return system("ip netns add " NETNS);
}
One possibility is that both lwt_redirect and lwt_reroute create
netns with the same name "ns_lwt" which may cause conflict. I tried
the following example:
$ sudo ip netns add abc
$ echo $?
0
$ sudo ip netns add abc
Cannot create namespace file "/var/run/netns/abc": File exists
$ echo $?
1
$
The return code for above netns_create() is 256. The internet search
suggests that the return value for 'ip netns add ns_lwt' is 1, which
matches the above 'sudo ip netns add abc' example.
This patch tried to use different netns names for two tests to avoid
'ip netns add <name>' failure.
I ran './test_progs -j' 10 times and all succeeded with
lwt_redirect/lwt_reroute tests.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240205052914.1742687-1-yonghong.song@linux.dev
Stable-dep-of: 16b795cc5952 ("selftests/bpf: Fix redefinition errors compiling lwt_reroute.c")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa95073fd290b5b3e45f067fa22bb25e59e1ff7c ]
While building, bpftool makes a skeleton from test_core_extern.c, which
itself includes <stdbool.h> and uses the 'bool' type. However, the skeleton
test_core_extern.skel.h generated *does not* include <stdbool.h> or use the
'bool' type, instead using the C-only '_Bool' type. Compiling test_cpp.cpp
with g++ 12.3 for mips64el/musl-libc then fails with error:
In file included from test_cpp.cpp:9:
test_core_extern.skel.h:45:17: error: '_Bool' does not name a type
45 | _Bool CONFIG_BOOL;
| ^~~~~
This was likely missed previously because glibc uses a GNU extension for
<stdbool.h> with C++ (#define _Bool bool), not supported by musl libc.
Normally, a C fragment would include <stdbool.h> and use the 'bool' type,
and thus cleanly work after import by C++. The ideal fix would be for
'bpftool gen skeleton' to output the correct type/include supporting C++,
but in the meantime add a conditional define as above.
Fixes: 7c8dce4b16 ("bpftool: Make skeleton C code compilable with C++ compiler")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/6fc1dd28b8bda49e51e4f610bdc9d22f4455632d.1722244708.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cacf2a5a78cd1f5f616eae043ebc6f024104b721 ]
Although the post-increment in macro 'CPU_SET(next++, &cpuset)' seems safe,
the sequencing can raise compile errors, so move the increment outside the
macro. This avoids an error seen using gcc 12.3.0 for mips64el/musl-libc:
In file included from test_lru_map.c:11:
test_lru_map.c: In function 'sched_next_online':
test_lru_map.c:129:29: error: operation on 'next' may be undefined [-Werror=sequence-point]
129 | CPU_SET(next++, &cpuset);
| ^
cc1: all warnings being treated as errors
Fixes: 3fbfadce60 ("bpf: Fix test_lru_sanity5() in test_lru_map.c")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/22993dfb11ccf27925a626b32672fd3324cb76c4.1722244708.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03bfcda1fbc37ef34aa21d2b9e09138335afc6ee ]
Current code parses arguments with strtok_r() using a construct like
char *state = NULL;
while ((next = strtok_r(state ? NULL : input, ",", &state))) {
...
}
where logic assumes the 'state' var can distinguish between first and
subsequent strtok_r() calls, and adjusts parameters accordingly. However,
'state' is strictly internal context for strtok_r() and no such assumptions
are supported in the man page. Moreover, the exact behaviour of 'state'
depends on the libc implementation, making the above code fragile.
Indeed, invoking "./test_progs -t <test_name>" on mips64el/musl will hang,
with the above code in an infinite loop.
Similarly, we see strange behaviour running 'veristat' on mips64el/musl:
$ ./veristat -e file,prog,verdict,insns -C two-ok add-failure
Can't specify more than 9 stats
Rewrite code using a counter to distinguish between strtok_r() calls.
Fixes: 61ddff373f ("selftests/bpf: Improve by-name subtest selection logic in prog_tests")
Fixes: 394169b079 ("selftests/bpf: add comparison mode to veristat")
Fixes: c8bc5e0509 ("selftests/bpf: Add veristat tool for mass-verifying BPF object files")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/392d8bf5559f85fa37926c1494e62312ef252c3d.1722244708.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 730561d3c08d4a327cceaabf11365958a1c00cec ]
Remove a redundant include of '<asm/types.h>', whose needed definitions are
already included (via '<linux/types.h>') in cg_storage_multi_egress_only.c,
cg_storage_multi_isolated.c, and cg_storage_multi_shared.c. This avoids
redefinition errors seen compiling for mips64el/musl-libc like:
In file included from progs/cg_storage_multi_egress_only.c:13:
In file included from progs/cg_storage_multi.h:6:
In file included from /usr/mips64el-linux-gnuabi64/include/asm/types.h:23:
/usr/include/asm-generic/int-l64.h:29:25: error: typedef redefinition with different types ('long' vs 'long long')
29 | typedef __signed__ long __s64;
| ^
/usr/include/asm-generic/int-ll64.h:30:44: note: previous definition is here
30 | __extension__ typedef __signed__ long long __s64;
| ^
Fixes: 9e5bd1f763 ("selftests/bpf: Test CGROUP_STORAGE map can't be used by multiple progs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/4f4702e9f6115b7f84fea01b2326ca24c6df7ba8.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1b00f355130a5dfc38a01ad02458ae2cb2ebe609 ]
Remove a redundant include of '<linux/in6.h>', whose needed definitions are
already provided by 'test_progs.h'. This avoids errors seen compiling for
mips64el/musl-libc:
In file included from .../arpa/inet.h:9,
from ./test_progs.h:17,
from prog_tests/decap_sanity.c:9:
.../netinet/in.h:23:8: error: redefinition of 'struct in6_addr'
23 | struct in6_addr {
| ^~~~~~~~
In file included from decap_sanity.c:7:
.../linux/in6.h:33:8: note: originally defined here
33 | struct in6_addr {
| ^~~~~~~~
.../netinet/in.h:34:8: error: redefinition of 'struct sockaddr_in6'
34 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
.../linux/in6.h:50:8: note: originally defined here
50 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
.../netinet/in.h:42:8: error: redefinition of 'struct ipv6_mreq'
42 | struct ipv6_mreq {
| ^~~~~~~~~
.../linux/in6.h:60:8: note: originally defined here
60 | struct ipv6_mreq {
| ^~~~~~~~~
Fixes: 70a00e2f1d ("selftests/bpf: Test bpf_skb_adjust_room on CHECKSUM_PARTIAL")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/e986ba2d7edccd254b54f7cd049b98f10bafa8c3.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27c4797ce51c8dd51e35e68e9024a892f62d78b2 ]
Remove a redundant include of '<linux/icmp.h>' which is already provided in
'lwt_helpers.h'. This avoids errors seen compiling for mips64el/musl-libc:
In file included from .../arpa/inet.h:9,
from lwt_redirect.c:51:
.../netinet/in.h:23:8: error: redefinition of 'struct in6_addr'
23 | struct in6_addr {
| ^~~~~~~~
In file included from .../linux/icmp.h:24,
from lwt_redirect.c:50:
.../linux/in6.h:33:8: note: originally defined here
33 | struct in6_addr {
| ^~~~~~~~
.../netinet/in.h:34:8: error: redefinition of 'struct sockaddr_in6'
34 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
.../linux/in6.h:50:8: note: originally defined here
50 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
.../netinet/in.h:42:8: error: redefinition of 'struct ipv6_mreq'
42 | struct ipv6_mreq {
| ^~~~~~~~~
.../linux/in6.h:60:8: note: originally defined here
60 | struct ipv6_mreq {
| ^~~~~~~~~
Fixes: 43a7c3ef8a ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/3869dda876d5206d2f8d4dd67331c739ceb0c7f8.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit debfa4f628f271f72933bf38d581cc53cfe1def5 ]
The type 'loff_t' is a GNU extension and not exposed by the musl 'fcntl.h'
header unless _GNU_SOURCE is defined. Add this definition to fix errors
seen compiling for mips64el/musl-libc:
In file included from tools/testing/selftests/bpf/prog_tests/core_reloc.c:4:
./bpf_testmod/bpf_testmod.h:10:9: error: unknown type name 'loff_t'
10 | loff_t off;
| ^~~~~~
./bpf_testmod/bpf_testmod.h:16:9: error: unknown type name 'loff_t'
16 | loff_t off;
| ^~~~~~
Fixes: 6bcd39d366 ("selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/11c3af75a7eb6bcb7ad9acfae6a6f470c572eb82.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18826fb0b79c3c3cd1fe765d85f9c6f1a902c722 ]
The GNU version of 'struct tcp_info' in 'netinet/tcp.h' is not exposed by
musl headers unless _GNU_SOURCE is defined.
Add this definition to fix errors seen compiling for mips64el/musl-libc:
tcp_rtt.c: In function 'wait_for_ack':
tcp_rtt.c:24:25: error: storage size of 'info' isn't known
24 | struct tcp_info info;
| ^~~~
tcp_rtt.c:24:25: error: unused variable 'info' [-Werror=unused-variable]
cc1: all warnings being treated as errors
Fixes: 1f4f80fed2 ("selftests/bpf: test_progs: convert test_tcp_rtt")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/f2329767b15df206f08a5776d35a47c37da855ae.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e4c43bcb85973243d7274e0058b6e8f5810e4f7 ]
The GNU version of 'struct tcphdr' has members 'doff', 'source' and 'dest',
which are not exposed by musl libc headers unless _GNU_SOURCE is defined.
Add this definition to fix errors seen compiling for mips64el/musl-libc:
flow_dissector.c:118:30: error: 'struct tcphdr' has no member named 'doff'
118 | .tcp.doff = 5,
| ^~~~
flow_dissector.c:119:30: error: 'struct tcphdr' has no member named 'source'
119 | .tcp.source = 80,
| ^~~~~~
flow_dissector.c:120:30: error: 'struct tcphdr' has no member named 'dest'
120 | .tcp.dest = 8080,
| ^~~~
Fixes: ae173a9157 ("selftests/bpf: support BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/8f7ab21a73f678f9cebd32b26c444a686e57414d.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bae9a5ce7d3a9b3a9e07b31ab9e9c58450e3e9fd ]
The GNU version of 'struct tcphdr' with member 'doff' is not exposed by
musl headers unless _GNU_SOURCE is defined. Add this definition to fix
errors seen compiling for mips64el/musl-libc:
In file included from kfree_skb.c:2:
kfree_skb.c: In function 'on_sample':
kfree_skb.c:45:30: error: 'struct tcphdr' has no member named 'doff'
45 | if (CHECK(pkt_v6->tcp.doff != 5, "check_tcp",
| ^
Fixes: 580d656d80 ("selftests/bpf: Add kfree_skb raw_tp test")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/e2d8cedc790959c10d6822a51f01a7a3616bea1b.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4c329b99ef9c118343379bde9f97e8ce5cac9fc9 ]
The GNU version of 'struct tcphdr', with members 'doff' and 'urg_ptr', is
not exposed by musl headers unless _GNU_SOURCE is defined.
Add this definition to fix errors seen compiling for mips64el/musl-libc:
parse_tcp_hdr_opt.c:18:21: error: 'struct tcphdr' has no member named 'urg_ptr'
18 | .pk6_v6.tcp.urg_ptr = 123,
| ^~~~~~~
parse_tcp_hdr_opt.c:19:21: error: 'struct tcphdr' has no member named 'doff'
19 | .pk6_v6.tcp.doff = 9, /* 16 bytes of options */
| ^~~~
Fixes: cfa7b01189 ("selftests/bpf: tests for using dynptrs to parse skb and xdp buffers")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/ac5440213c242c62cb4e0d9e0a9cd5058b6a31f6.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4d4bd29e363c467752536f874a2cba10a5923c59 ]
Refactor some functions in both user space code and bpf program
as these functions are used by later cgroup/sk_msg tests.
Another change is to mark tp program optional loading as later
patches will use optional loading as well since they have quite
different attachment and testing logic.
There is no functionality change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184904.2976123-1-yonghong.song@linux.dev
Stable-dep-of: 21f0b0af9772 ("selftests/bpf: Fix include of <sys/fcntl.h>")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 84239a24d10174fcfc7d6760cb120435a6ff69af ]
Replace CHECK in selftest ns_current_pid_tgid with recommended ASSERT_* style.
I also shortened subtest name as the prefix of subtest name is covered
by the test name already.
This patch does fix a testing issue. Currently even if bss->user_{pid,tgid}
is not correct, the test still passed since the clone func returns 0.
I fixed it to return a non-zero value if bss->user_{pid,tgid} is incorrect.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184859.2975543-1-yonghong.song@linux.dev
Stable-dep-of: 21f0b0af9772 ("selftests/bpf: Fix include of <sys/fcntl.h>")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6495eb79ca7d15bd87c38d77307e8f9b6b7bf4ef ]
Explicitly include '<linux/build_bug.h>' to fix errors seen compiling with
gcc targeting mips64el/musl-libc:
user_ringbuf.c: In function 'test_user_ringbuf_loop':
user_ringbuf.c:426:9: error: implicit declaration of function 'BUILD_BUG_ON' [-Werror=implicit-function-declaration]
426 | BUILD_BUG_ON(total_samples <= c_max_entries);
| ^~~~~~~~~~~~
cc1: all warnings being treated as errors
Fixes: e5a9df51c7 ("selftests/bpf: Add selftests validating the user ringbuf")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/b28575f9221ec54871c46a2e87612bb4bbf46ccd.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a2c155131b710959beb508ca6a54769b6b1bd488 ]
Include <limits.h> in 'bench.h' to provide a UINT_MAX definition and avoid
multiple compile errors against mips64el/musl-libc like:
benchs/bench_local_storage.c: In function 'parse_arg':
benchs/bench_local_storage.c:40:38: error: 'UINT_MAX' undeclared (first use in this function)
40 | if (ret < 1 || ret > UINT_MAX) {
| ^~~~~~~~
benchs/bench_local_storage.c:11:1: note: 'UINT_MAX' is defined in header '<limits.h>'; did you forget to '#include <limits.h>'?
10 | #include <test_btf.h>
+++ |+#include <limits.h>
11 |
seen with bench_local_storage.c, bench_local_storage_rcu_tasks_trace.c, and
bench_bpf_hashmap_lookup.c.
Fixes: 7308748925 ("selftests/bpf: Add benchmark for local_storage get")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/8f64a9d9fcff40a7fca090a65a68a9b62a468e16.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d44c93fc2f5a0c47b23fa03d374e45259abd92d2 ]
Add a "bpf_util.h" include to avoid the following error seen compiling for
mips64el with musl libc:
bench.c: In function 'find_benchmark':
bench.c:590:25: error: implicit declaration of function 'ARRAY_SIZE' [-Werror=implicit-function-declaration]
590 | for (i = 0; i < ARRAY_SIZE(benchs); i++) {
| ^~~~~~~~~~
cc1: all warnings being treated as errors
Fixes: 8e7c2a023a ("selftests/bpf: Add benchmark runner infrastructure")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/bc4dde77dfcd17a825d8f28f72f3292341966810.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69f409469c9b1515a5db40d5a36fda372376fa2d ]
The addition of general support for unprivileged tests in test_loader.c
breaks building test_verifier on non-glibc (e.g. musl) systems, due to the
inclusion of glibc extension '<error.h>' in 'unpriv_helpers.c'. However,
the header is actually not needed, so remove it to restore building.
Similarly for sk_lookup.c and flow_dissector.c, error.h is not necessary
and causes problems, so drop them.
Fixes: 1d56ade032 ("selftests/bpf: Unprivileged tests for test_loader.c")
Fixes: 0ab5539f85 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Fixes: 0905beec9f ("selftests/bpf: run flow dissector tests in skb-less mode")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/5664367edf5fea4f3f4b4aec3b182bcfc6edff9c.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90a695c3d31e1c9f0adb8c4c80028ed4ea7ed5ab ]
Introduce a new function called get_hw_size that retrieves both the
current and maximum size of the interface and stores this information
in the 'ethtool_ringparam' structure.
Remove ethtool_channels struct from xdp_hw_metadata.c due to redefinition
error. Remove unused linux/if.h include from flow_dissector BPF test to
address CI pipeline failure.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240402114529.545475-4-tushar.vyavahare@intel.com
Stable-dep-of: 69f409469c9b ("selftests/bpf: Drop unneeded error.h includes")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7b10f0c227ce3fa055d601f058dc411092a62a78 ]
Existing code calls getsockname() with a 'struct sockaddr_in6 *' argument
where a 'struct sockaddr *' argument is declared, yielding compile errors
when building for mips64el/musl-libc:
bpf_iter_setsockopt.c: In function 'get_local_port':
bpf_iter_setsockopt.c:98:30: error: passing argument 2 of 'getsockname' from incompatible pointer type [-Werror=incompatible-pointer-types]
98 | if (!getsockname(fd, &addr, &addrlen))
| ^~~~~
| |
| struct sockaddr_in6 *
In file included from .../netinet/in.h:10,
from .../arpa/inet.h:9,
from ./test_progs.h:17,
from bpf_iter_setsockopt.c:5:
.../sys/socket.h:391:23: note: expected 'struct sockaddr * restrict' but argument is of type 'struct sockaddr_in6 *'
391 | int getsockname (int, struct sockaddr *__restrict, socklen_t *__restrict);
| ^
cc1: all warnings being treated as errors
This compiled under glibc only because the argument is declared to be a
"funky" transparent union which includes both types above. Explicitly cast
the argument to allow compiling for both musl and glibc.
Fixes: eed92afdd1 ("bpf: selftest: Test batching and bpf_(get|set)sockopt in bpf tcp iter")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Geliang Tang <geliang@kernel.org>
Link: https://lore.kernel.org/bpf/f41def0f17b27a23b1709080e4e3f37f4cc11ca9.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d393f9479d4aaab0fa4c3caf513f28685e831f13 ]
Cast 'rlim_t' argument to match expected type of printf() format and avoid
compile errors seen building for mips64el/musl-libc:
In file included from map_tests/sk_storage_map.c:20:
map_tests/sk_storage_map.c: In function 'test_sk_storage_map_stress_free':
map_tests/sk_storage_map.c:414:56: error: format '%lu' expects argument of type 'long unsigned int', but argument 2 has type 'rlim_t' {aka 'long long unsigned int'} [-Werror=format=]
414 | CHECK(err, "setrlimit(RLIMIT_NOFILE)", "rlim_new:%lu errno:%d",
| ^~~~~~~~~~~~~~~~~~~~~~~
415 | rlim_new.rlim_cur, errno);
| ~~~~~~~~~~~~~~~~~
| |
| rlim_t {aka long long unsigned int}
./test_maps.h:12:24: note: in definition of macro 'CHECK'
12 | printf(format); \
| ^~~~~~
map_tests/sk_storage_map.c:414:68: note: format string is defined here
414 | CHECK(err, "setrlimit(RLIMIT_NOFILE)", "rlim_new:%lu errno:%d",
| ~~^
| |
| long unsigned int
| %llu
cc1: all warnings being treated as errors
Fixes: 51a0e301a5 ("bpf: Add BPF_MAP_TYPE_SK_STORAGE test to test_maps")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1e00a1fa7acf91b4ca135c4102dc796d518bad86.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec4fe2f0fa12fd2d0115df7e58414dc26899cc5e ]
Use pid_t rather than __pid_t when allocating memory for 'worker_pids' in
'struct test_env', as this is its declared type and also avoids compile
errors seen building against musl libc on mipsel64:
test_progs.c:1738:49: error: '__pid_t' undeclared (first use in this function); did you mean 'pid_t'?
1738 | env.worker_pids = calloc(sizeof(__pid_t), env.workers);
| ^~~~~~~
| pid_t
test_progs.c:1738:49: note: each undeclared identifier is reported only once for each function it appears in
Fixes: 91b2c0afd0 ("selftests/bpf: Add parallelism to test_progs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Geliang Tang <geliang@kernel.org>
Link: https://lore.kernel.org/bpf/c6447da51a94babc1931711a43e2ceecb135c93d.1721713597.git.tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ece93a4087b2db7b99ebb2412bd60cf26bbbb51 ]
Make log output incorrectly shows 'test_maps' as the binary name for every
'CLNG-BPF' build step, apparently picking up the last value defined for the
$(TRUNNER_BINARY) variable. Update the 'CLANG_BPF_BUILD_RULE' variants to
fix this confusing output.
Current output:
CLNG-BPF [test_maps] access_map_in_map.bpf.o
GEN-SKEL [test_progs] access_map_in_map.skel.h
...
CLNG-BPF [test_maps] access_map_in_map.bpf.o
GEN-SKEL [test_progs-no_alu32] access_map_in_map.skel.h
...
CLNG-BPF [test_maps] access_map_in_map.bpf.o
GEN-SKEL [test_progs-cpuv4] access_map_in_map.skel.h
After fix:
CLNG-BPF [test_progs] access_map_in_map.bpf.o
GEN-SKEL [test_progs] access_map_in_map.skel.h
...
CLNG-BPF [test_progs-no_alu32] access_map_in_map.bpf.o
GEN-SKEL [test_progs-no_alu32] access_map_in_map.skel.h
...
CLNG-BPF [test_progs-cpuv4] access_map_in_map.bpf.o
GEN-SKEL [test_progs-cpuv4] access_map_in_map.skel.h
Fixes: a5d0c26a27 ("selftests/bpf: Add a cpuv4 test runner for cpu=v4 testing")
Fixes: 89ad7420b2 ("selftests/bpf: Drop the need for LLVM's llc")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240720052535.2185967-1-tony.ambardar@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>