[ Upstream commit dbc48c8f41c208082cfa95e973560134489e3309 ]
nr_pages is unsigned long but gets passed to rb_alloc_aux() as an int,
and is stored as an int.
Only power-of-2 values are accepted, so if nr_pages is a 64_bit value, it
will be passed to rb_alloc_aux() as zero.
That is not ideal because:
1. the value is incorrect
2. rb_alloc_aux() is at risk of misbehaving, although it manages to
return -ENOMEM in that case, it is a result of passing zero to get_order()
even though the get_order() result is documented to be undefined in that
case.
Fix by simply validating the maximum supported value in the first place.
Use -ENOMEM error code for consistency with the current error code that
is returned in that case.
Fixes: 45bfb2e504 ("perf: Add AUX area to ring buffer for raw data streams")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240624201101.60186-6-adrian.hunter@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3520b251dcae2b4a27b95cd6f745c54fd658bda5 ]
Currently, perf allocates an array of page pointers which is limited in
size by MAX_PAGE_ORDER. That in turn limits the maximum Intel PT buffer
size to 2GiB. Should that limitation be lifted, the Intel PT driver can
support larger sizes, except for one calculation in
pt_topa_entry_for_page(), which is limited to 32-bits.
Fix pt_topa_entry_for_page() address calculation by adding a cast.
Fixes: 39152ee51b ("perf/x86/intel/pt: Get rid of reverse lookup table for ToPA")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240624201101.60186-4-adrian.hunter@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 53796b03295cf7ab1fc8600016fa6dfbf4a494a0 ]
In the context of the SCTP SNAT/DNAT handler, these calls can only
return true.
Fixes: e10d3ba4d434 ("ipvs: Fix checksumming on GSO of SCTP packets")
Signed-off-by: Ismael Luceno <iluceno@suse.de>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc5083d1f3881624ad2de1f3cbb3a07e152cb254 ]
I encountered an issue when building the test_progs from the repository [1]:
$ pwd
/work/Qemu/x86_64/linux-6.10-rc2/tools/testing/selftests/bpf/
$ make test_progs V=1
[...]
./tools/sbin/bpftool gen object ./ip_check_defrag.bpf.linked2.o ./ip_check_defrag.bpf.linked1.o
libbpf: failed to find symbol for variable 'bpf_dynptr_slice' in section '.ksyms'
Error: failed to link './ip_check_defrag.bpf.linked1.o': No such file or directory (2)
[...]
Upon investigation, I discovered that the btf_types referenced in the '.ksyms'
section had a kind of BTF_KIND_FUNC instead of BTF_KIND_VAR:
$ bpftool btf dump file ./ip_check_defrag.bpf.linked1.o
[...]
[2] DATASEC '.ksyms' size=0 vlen=2
type_id=16 offset=0 size=0 (FUNC 'bpf_dynptr_from_skb')
type_id=17 offset=0 size=0 (FUNC 'bpf_dynptr_slice')
[...]
[16] FUNC 'bpf_dynptr_from_skb' type_id=82 linkage=extern
[17] FUNC 'bpf_dynptr_slice' type_id=85 linkage=extern
[...]
For a detailed analysis, please refer to [2]. We can add a kind checking to
fix the issue.
[1] https://github.com/eddyz87/bpf/tree/binsort-btf-dedup
[2] https://lore.kernel.org/all/0c0ef20c-c05e-4db9-bad7-2cbc0d6dfae7@oracle.com/
Fixes: 8fd27bf69b ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240619122355.426405-1-dolinux.peng@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c32fe1986f27cac329767d3497986e306cad1d5e ]
FEC_ECR_EN1588 bit gets cleared after MAC reset in `fec_stop()`, which
makes all 1588 functionality shut down, and all the extended registers
disappear, on link-down, making the adapter fall back to compatibility
"dumb mode". However, some functionality needs to be retained (e.g. PPS)
even without link.
Fixes: 6605b730c0 ("FEC: Add time stamping code and a PTP hardware clock")
Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/netdev/5fa9fadc-a89d-467a-aae9-c65469ff5fe1@lunn.ch/
Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bcbd771cd5d68c0c52567556097d75f9fc4e7cd6 ]
Currently NL80211_RATE_INFO_HE_RU_ALLOC_2x996 is not handled in
cfg80211_calculate_bitrate_he(), leading to below warning:
kernel: invalid HE MCS: bw:6, ru:6
kernel: WARNING: CPU: 0 PID: 2312 at net/wireless/util.c:1501 cfg80211_calculate_bitrate_he+0x22b/0x270 [cfg80211]
Fix it by handling 2x996 RU allocation in the same way as 160 MHz bandwidth.
Fixes: c4cbaf7973 ("cfg80211: Add support for HE")
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Link: https://msgid.link/20240606020653.33205-3-quic_bqiang@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d2b0ca38d362ebf16ca79cd7f309d5bb8b581deb ]
Currently for CCMP256, GCMP128 and GCMP256 ciphers, in ath11k_install_key()
IEEE80211_KEY_FLAG_GENERATE_IV_MGMT is not set. And in ath11k_mac_mgmt_tx_wmi()
a length of IEEE80211_CCMP_MIC_LEN is reserved for all ciphers.
This results in unexpected management frame drop in case either of above 3 ciphers
is used. The reason is, without IEEE80211_KEY_FLAG_GENERATE_IV_MGMT set, mac80211
will not generate CCMP/GCMP headers in frame for ath11k. Also MIC length reserved
is wrong. Such frame is dropped later by hardware:
ath11k_pci 0000:5a:00.0: mac tx mgmt frame, buf id 0
ath11k_pci 0000:5a:00.0: mgmt tx compl ev pdev_id 1, desc_id 0, status 1
From user point of view, we have observed very low throughput due to this issue:
action frames are all dropped so ADDBA response from DUT never reaches AP. AP
can not use aggregation thus throughput is low.
Fix this by setting IEEE80211_KEY_FLAG_GENERATE_IV_MGMT flag and by reserving proper
MIC length for those ciphers.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
Fixes: d5c65159f2 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com>
Tested-by: Yaroslav Isakov <yaroslav.isakov@gmail.com>
Closes: https://lore.kernel.org/all/CADS+iDX5=JtJr0apAtAQ02WWBxgOFEv8G063vuGYwDTC8AVZaw@mail.gmail.com
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240605014826.22498-1-quic_bqiang@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 83ab38ef0a0b2407d43af9575bb32333fdd74fb2 ]
The commit which tried to fix the concurrency issues of concurrent
static_key_slow_inc() failed to fix the equivalent issues
vs. static_key_slow_dec():
CPU0 CPU1
static_key_slow_dec()
static_key_slow_try_dec()
key->enabled == 1
val = atomic_fetch_add_unless(&key->enabled, -1, 1);
if (val == 1)
return false;
jump_label_lock();
if (atomic_dec_and_test(&key->enabled)) {
--> key->enabled == 0
__jump_label_update()
static_key_slow_dec()
static_key_slow_try_dec()
key->enabled == 0
val = atomic_fetch_add_unless(&key->enabled, -1, 1);
--> key->enabled == -1 <- FAIL
There is another bug in that code, when there is a concurrent
static_key_slow_inc() which enables the key as that sets key->enabled to -1
so on the other CPU
val = atomic_fetch_add_unless(&key->enabled, -1, 1);
will succeed and decrement to -2, which is invalid.
Cure all of this by replacing the atomic_fetch_add_unless() with a
atomic_try_cmpxchg() loop similar to static_key_fast_inc_not_disabled().
[peterz: add WARN_ON_ONCE for the -1 race]
Fixes: 4c5ea0a9cd ("locking/static_key: Fix concurrent static_key_slow_inc()")
Reported-by: Yue Sun <samsun1006219@gmail.com>
Reported-by: Xingwei Lee <xrivendell7@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20240610124406.422897838@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eb8c507296 ]
1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
protection against key->enabled refcounter overflow.
2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
still may turn the refcounter negative as (v + 1) may overflow.
key->enabled is indeed a ref-counter as it's documented in multiple
places: top comment in jump_label.h, Documentation/staging/static-keys.rst,
etc.
As -1 is reserved for static key that's in process of being enabled,
functions would break with negative key->enabled refcount:
- for CONFIG_JUMP_LABEL=n negative return of static_key_count()
breaks static_key_false(), static_key_true()
- the ref counter may become 0 from negative side by too many
static_key_slow_inc() calls and lead to use-after-free issues.
These flaws result in that some users have to introduce an additional
mutex and prevent the reference counter from overflowing themselves,
see bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
Prevent the reference counter overflow by checking if (v + 1) > 0.
Change functions API to return whether the increment was successful.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 83ab38ef0a0b ("jump_label: Fix concurrency issues in static_key_slow_dec()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d0c006402e ]
Use atomic_try_cmpxchg() instead of atomic_cmpxchg (*ptr, old, new) ==
old in static_key_slow_inc_cpuslocked(). x86 CMPXCHG instruction
returns success in ZF flag, so this change saves a compare after
cmpxchg (and related move instruction in front of cmpxchg).
Also, atomic_try_cmpxchg() implicitly assigns old *ptr value to "old" when
cmpxchg fails, enabling further code simplifications.
No functional change intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221019140850.3395-1-ubizjak@gmail.com
Stable-dep-of: 83ab38ef0a0b ("jump_label: Fix concurrency issues in static_key_slow_dec()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb9bb45f746b0f9457de9c3fc4da143a6351bdc9 ]
Yue and Xingwei reported a jump label failure. It's caused by the lack of
serialization in set_attr_rdpmc():
CPU0 CPU1
Assume: x86_pmu.attr_rdpmc == 0
if (val != x86_pmu.attr_rdpmc) {
if (val == 0)
...
else if (x86_pmu.attr_rdpmc == 0)
static_branch_dec(&rdpmc_never_available_key);
if (val != x86_pmu.attr_rdpmc) {
if (val == 0)
...
else if (x86_pmu.attr_rdpmc == 0)
FAIL, due to imbalance ---> static_branch_dec(&rdpmc_never_available_key);
The reported BUG() is a consequence of the above and of another bug in the
jump label core code. The core code needs a separate fix, but that cannot
prevent the imbalance problem caused by set_attr_rdpmc().
Prevent this by serializing set_attr_rdpmc() locally.
Fixes: a66734297f ("perf/x86: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks")
Closes: https://lore.kernel.org/r/CAEkJfYNzfW1vG=ZTMdz_Weoo=RXY1NDunbxnDaLyj8R4kEoE_w@mail.gmail.com
Reported-by: Yue Sun <samsun1006219@gmail.com>
Reported-by: Xingwei Lee <xrivendell7@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20240610124406.359476013@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75d8d7a63065b18df9555dbaab0b42d4c6f20943 ]
ACLs that reside in the algorithmic TCAM (A-TCAM) in Spectrum-2 and
newer ASICs can share the same mask if their masks only differ in up to
8 consecutive bits. For example, consider the following filters:
# tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 192.0.2.0/24 action drop
# tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 198.51.100.128/25 action drop
The second filter can use the same mask as the first (dst_ip/24) with a
delta of 1 bit.
However, the above only works because the two filters have different
values in the common unmasked part (dst_ip/24). When entries have the
same value in the common unmasked part they create undesired collisions
in the device since many entries now have the same key. This leads to
firmware errors such as [1] and to a reduced scale.
Fix by adjusting the hash table key to only include the value in the
common unmasked part. That is, without including the delta bits. That
way the driver will detect the collision during filter insertion and
spill the filter into the circuit TCAM (C-TCAM).
Add a test case that fails without the fix and adjust existing cases
that check C-TCAM spillage according to the above limitation.
[1]
mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=3379b18a00003394,reg_id=3027(ptce3),type=write,status=8(resource not available))
Fixes: c22291f7cf ("mlxsw: spectrum: acl: Implement delta for ERP")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97d833ceb27dc19f8777d63f90be4a27b5daeedf ]
ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
(A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
contain more ACLs (i.e., tc filters), but the number of masks in each
region (i.e., tc chain) is limited.
In order to mitigate the effects of the above limitation, the device
allows filters to share a single mask if their masks only differ in up
to 8 consecutive bits. For example, dst_ip/25 can be represented using
dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
number of masks being used (and therefore does not support mask
aggregation), but can contain a limited number of filters.
The driver uses the "objagg" library to perform the mask aggregation by
passing it objects that consist of the filter's mask and whether the
filter is to be inserted into the A-TCAM or the C-TCAM since filters in
different TCAMs cannot share a mask.
The set of created objects is dependent on the insertion order of the
filters and is not necessarily optimal. Therefore, the driver will
periodically ask the library to compute a more optimal set ("hints") by
looking at all the existing objects.
When the library asks the driver whether two objects can be aggregated
the driver only compares the provided masks and ignores the A-TCAM /
C-TCAM indication. This is the right thing to do since the goal is to
move as many filters as possible to the A-TCAM. The driver also forbids
two identical masks from being aggregated since this can only happen if
one was intentionally put in the C-TCAM to avoid a conflict in the
A-TCAM.
The above can result in the following set of hints:
H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta
After getting the hints from the library the driver will start migrating
filters from one region to another while consulting the computed hints
and instructing the device to perform a lookup in both regions during
the transition.
Assuming a filter with mask X is being migrated into the A-TCAM in the
new region, the hints lookup will return H1. Since H2 is the parent of
H1, the library will try to find the object associated with it and
create it if necessary in which case another hints lookup (recursive)
will be performed. This hints lookup for {mask Y, A-TCAM} will either
return H2 or H3 since the driver passes the library an object comparison
function that ignores the A-TCAM / C-TCAM indication.
This can eventually lead to nested objects which are not supported by
the library [1].
Fix by removing the object comparison function from both the driver and
the library as the driver was the only user. That way the lookup will
only return exact matches.
I do not have a reliable reproducer that can reproduce the issue in a
timely manner, but before the fix the issue would reproduce in several
minutes and with the fix it does not reproduce in over an hour.
Note that the current usefulness of the hints is limited because they
include the C-TCAM indication and represent aggregation that cannot
actually happen. This will be addressed in net-next.
[1]
WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
Modules linked in:
CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
[...]
Call Trace:
<TASK>
__objagg_obj_get+0x2bb/0x580
objagg_obj_get+0xe/0x80
mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
process_one_work+0x151/0x370
Fixes: 9069a3817d ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b4a3a89fffcdf09702b1f161b914e52abca1894d ]
The library supports aggregation of objects into other objects only if
the parent object does not have a parent itself. That is, nesting is not
supported.
Aggregation happens in two cases: Without and with hints, where hints
are a pre-computed recommendation on how to aggregate the provided
objects.
Nesting is not possible in the first case due to a check that prevents
it, but in the second case there is no check because the assumption is
that nesting cannot happen when creating objects based on hints. The
violation of this assumption leads to various warnings and eventually to
a general protection fault [1].
Before fixing the root cause, error out when nesting happens and warn.
[1]
general protection fault, probably for non-canonical address 0xdead000000000d90: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1083 Comm: kworker/1:9 Tainted: G W 6.9.0-rc6-custom-gd9b4f1cca7fb #7
Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:mlxsw_sp_acl_erp_bf_insert+0x25/0x80
[...]
Call Trace:
<TASK>
mlxsw_sp_acl_atcam_entry_add+0x256/0x3c0
mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
process_one_work+0x151/0x370
worker_thread+0x2cb/0x3e0
kthread+0xd0/0x100
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1a/0x30
</TASK>
Fixes: 9069a3817d ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ac14b9dfbd345e891d48d89f6c2fa519848f0f4 ]
SG_MAX_SINGLE_ALLOC is used to limit maximum number of entries that
will be allocated in one piece of scatterlist. When the entries of
scatterlist exceeds SG_MAX_SINGLE_ALLOC, sg chain will be used. From
commit 7c703e54cc ("arch: switch the default on ARCH_HAS_SG_CHAIN"),
we can know that the macro CONFIG_ARCH_NO_SG_CHAIN is used to identify
whether sg chain is supported. So, SMC-R's rmb buffer should be limited
by SG_MAX_SINGLE_ALLOC only when the macro CONFIG_ARCH_NO_SG_CHAIN is
defined.
Fixes: a3fe3d01bd ("net/smc: introduce sg-logic for RMBs")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Co-developed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fde6f897f2a184546bf5516ac736523ef24dc6a7 ]
These functions have races when they:
1) Write sk->sk_err
2) call sk_error_report(sk)
3) call tcp_done(sk)
As described in prior patches in this series:
An smp_wmb() is missing.
We should call tcp_done() before sk_error_report(sk)
to have consistent tcp_poll() results on SMP hosts.
Use tcp_done_with_error() where we centralized the
correct sequence.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 853c3bd7b7917670224c9fe5245bd045cac411dd ]
I noticed flakes in a packetdrill test, expecting an epoll_wait()
to return EPOLLERR | EPOLLHUP on a failed connect() attempt,
after multiple SYN retransmits. It sometimes return EPOLLERR only.
The issue is that tcp_write_err():
1) writes an error in sk->sk_err,
2) calls sk_error_report(),
3) then calls tcp_done().
tcp_done() is writing SHUTDOWN_MASK into sk->sk_shutdown,
among other things.
Problem is that the awaken user thread (from 2) sk_error_report())
might call tcp_poll() before tcp_done() has written sk->sk_shutdown.
tcp_poll() only sees a non zero sk->sk_err and returns EPOLLERR.
This patch fixes the issue by making sure to call sk_error_report()
after tcp_done().
tcp_write_err() also lacks an smp_wmb().
We can reuse tcp_done_with_error() to factor out the details,
as Neal suggested.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e13ec3da05 ]
tcp_poll() reads sk->sk_err without socket lock held/owned.
We should used READ_ONCE() here, and update writers
to use WRITE_ONCE().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 853c3bd7b791 ("tcp: fix race in tcp_write_err()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cee1af825d ]
This field can be read/written without lock synchronization.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 853c3bd7b791 ("tcp: fix race in tcp_write_err()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 96f887a612e4cda89efc3f54bc10c1997e3ab0e9 ]
xmit() functions should consume skb or return error codes in error
paths.
When the configuration "CONFIG_INET_ESPINTCP" is not set, the
implementation of the function "esp_output_tail_tcp" violates this rule.
The function frees the skb and returns the error code.
This change removes the kfree_skb from both functions, for both
esp4 and esp6.
WARN_ON is added because esp_output_tail_tcp() should never be called if
CONFIG_INET_ESPINTCP is not set.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Hagar Hemdan <hagarhem@amazon.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6c8d7598dfed759bf1d9d0322b4c2b42eb7252d8 ]
bpf_prog5 and bpf_prog7 are removed from progs/test_sockmap_kern.h in
commit d79a32129b ("bpf: Selftests, remove prints from sockmap tests"),
now there are only 9 progs in it, not 11:
SEC("sk_skb1")
int bpf_prog1(struct __sk_buff *skb)
SEC("sk_skb2")
int bpf_prog2(struct __sk_buff *skb)
SEC("sk_skb3")
int bpf_prog3(struct __sk_buff *skb)
SEC("sockops")
int bpf_sockmap(struct bpf_sock_ops *skops)
SEC("sk_msg1")
int bpf_prog4(struct sk_msg_md *msg)
SEC("sk_msg2")
int bpf_prog6(struct sk_msg_md *msg)
SEC("sk_msg3")
int bpf_prog8(struct sk_msg_md *msg)
SEC("sk_msg4")
int bpf_prog9(struct sk_msg_md *msg)
SEC("sk_msg5")
int bpf_prog10(struct sk_msg_md *msg)
This patch updates the array sizes of prog_fd[], prog_attach_type[] and
prog_type[] from 11 to 9 accordingly.
Fixes: d79a32129b ("bpf: Selftests, remove prints from sockmap tests")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/9c10d9f974f07fcb354a43a8eca67acb2fafc587.1715926605.git.tanggeliang@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49e24c80d3c81c43e2a56101449e1eea32fcf292 ]
Initialize the completion before the mailbox channel is requested.
Fixes: 389711b374 ("firmware: Add Turris Mox rWTM firmware driver")
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8467cfe821ac3526f7598682ad5f90689fa8cc49 ]
The wait_for_completion_timeout() function returns 0 if timed out, and a
positive value if completed. Fix the usage of this function.
Fixes: 389711b374 ("firmware: Add Turris Mox rWTM firmware driver")
Fixes: 2eab59cf0d ("firmware: turris-mox-rwtm: fail probing when firmware does not support hwrng")
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0bafb172b111ab27251af0eb684e7bde9570ce4c ]
Do not complete the "command done" completion if there are no waiters.
This can happen if a wait_for_completion() timed out or was interrupted.
Fixes: 389711b374 ("firmware: Add Turris Mox rWTM firmware driver")
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1a7b7326d587c9a5e8ff067e70d6aaf0333f4bb3 ]
Commit 9a427556fb ("vmlinux.lds.h: catch compound literals into
data and BSS") added catches for .data..L* and .rodata..L* but missed
.bss..L*
Since commit 5431fdd2c181 ("ptrace: Convert ptrace_attach() to use
lock guards") the following appears at build:
LD .tmp_vmlinux.kallsyms1
powerpc64-linux-ld: warning: orphan section `.bss..Lubsan_data33' from `kernel/ptrace.o' being placed in section `.bss..Lubsan_data33'
NM .tmp_vmlinux.kallsyms1.syms
KSYMS .tmp_vmlinux.kallsyms1.S
AS .tmp_vmlinux.kallsyms1.S
LD .tmp_vmlinux.kallsyms2
powerpc64-linux-ld: warning: orphan section `.bss..Lubsan_data33' from `kernel/ptrace.o' being placed in section `.bss..Lubsan_data33'
NM .tmp_vmlinux.kallsyms2.syms
KSYMS .tmp_vmlinux.kallsyms2.S
AS .tmp_vmlinux.kallsyms2.S
LD vmlinux
powerpc64-linux-ld: warning: orphan section `.bss..Lubsan_data33' from `kernel/ptrace.o' being placed in section `.bss..Lubsan_data33'
Lets add .bss..L* to BSS_MAIN macro to catch those sections into BSS.
Fixes: 9a427556fb ("vmlinux.lds.h: catch compound literals into data and BSS")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404031349.nmKhyuUG-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78ab3d352f2982bf3f7e506bfbaba7afee1ed8a9 ]
GPIOs controlling backlight on Spitz and Akita are coming from GPIO
expanders, not the pxa27xx-gpio block, correct it.
Additionally GPIO lookup tables operate with pin numbers rather than
legacy GPIO numbers, fix that as well. Use raw numbers instead of legacy
GPIO names to avoid confusion.
Fixes: ee0c8e494c ("backlight: corgi: Convert to use GPIO descriptors")
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Link: https://lore.kernel.org/r/20240628180852.1738922-2-dmitry.torokhov@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d42c3ee3ce37cdad6f98c9e77bfbd0d791ac7da ]
The iommu@fe043e00 on RK356x SoC shares the VOP power domain, but the
power-domains property was not provided when the node has been added.
The consequence is that an attempt to reload the rockchipdrm module will
freeze the entire system. That is because on probe time,
pm_runtime_get_suppliers() gets called for vop@fe040000, which blocks
when pm_runtime_get_sync() is being invoked for iommu@fe043e00.
Fix the issue by adding the missing property.
Fixes: 9d6c6d978f ("arm64: dts: rockchip: rk356x: Add VOP2 nodes")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://lore.kernel.org/r/20240702-rk356x-fix-vop-mmu-v1-1-a66d1a0c45ea@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68c9c53d45 ]
Complete the description of the Cortex-A76 CPU cores and L3 cache
controllers on the Renesas R-Car V4H (R8A779G0) SoC, including CPU
topology and PSCI support for enabling CPU cores.
R-Car V4H has 4 Cortex-A76 cores, grouped in 2 clusters.
Based on a patch in the BSP by Takeshi Kihara.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/ccb55458bd87f8ba70d28c61bcc254f22184824c.1668429870.git.geert+renesas@glider.be
Stable-dep-of: 6fca24a07e1d ("arm64: dts: renesas: r8a779a0: Add missing hypervisor virtual timer IRQ")
Signed-off-by: Sasha Levin <sashal@kernel.org>