When a sockmap is free'd and a socket in the map is enabled with tls
we tear down the bpf context on the socket, the psock struct and state,
and then call tcp_update_ulp(). The tcp_update_ulp() call is to inform
the tls stack it needs to update its saved sock ops so that when the tls
socket is later destroyed it doesn't try to call the now destroyed psock
hooks.
This is about keeping stacked ULPs in good shape so they always have
the right set of stacked ops.
However, recently unhash() hook was removed from TLS side. But, the
sockmap/bpf side is not doing any extra work to update the unhash op
when is torn down instead expecting TLS side to manage it. So both
TLS and sockmap believe the other side is managing the op and instead
no one updates the hook so it continues to point at tcp_bpf_unhash().
When unhash hook is called we call tcp_bpf_unhash() which detects the
psock has already been destroyed and calls sk->sk_prot_unhash() which
calls tcp_bpf_unhash() yet again and so on looping and hanging the core.
To fix have sockmap tear down logic fixup the stale pointer.
Fixes: 5d92e631b8 ("net/tls: partially revert fix transition through disconnect with close")
Reported-by: syzbot+83979935eb6304f8cd46@syzkaller.appspotmail.com
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-2-john.fastabend@gmail.com
To account for parts of the chip that are "harvested" (disabled) due to
silicon flaws, caches on some AMD GPUs must be initialized before ATS is
enabled.
ATS is normally enabled by the IOMMU driver before the GPU driver loads, so
this cache initialization would have to be done in a quirk, but that's too
complex to be practical.
For Navi14 (device ID 0x7340), this initialization is done by the VBIOS,
but apparently some boards went to production with an older VBIOS that
doesn't do it. Disable ATS for those boards.
Link: https://lore.kernel.org/r/20200114205523.1054271-3-alexander.deucher@amd.com
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1015
See-also: d28ca864c4 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken")
See-also: 9b44b0b09d ("PCI: Mark AMD Stoney GPU ATS as broken")
[bhelgaas: squash into one patch, simplify slightly, commit log]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Review of the recently added documentation file for the qed driver
noticed a couple of typos. Fix them now.
Noticed-by: Michal Kalderon <mkalderon@marvell.com>
Fixes: 0f261c3ca0 ("devlink: add a driver-specific file for the qed driver")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
use sk_bound_dev_if for route lookup as already done
in most of the other ip_route_output_ports() calls.
Since most PPPoA providers use 10.0.0.138 as default gateway IP
this will allow connections to multiple PPTP providers with the
same IP address over different interfaces.
Signed-off-by: Ulrich Weber <ulrich.weber@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jose Abreu says:
====================
net: stmmac: Fix selftests in Synopsys AXS101 board
Set of fixes for sefltests so that they work in Synopsys AXS101 board.
Final output:
$ ethtool -t eth0
The test result is PASS
The test extra info:
1. MAC Loopback 0
2. PHY Loopback -95
3. MMC Counters 0
4. EEE -95
5. Hash Filter MC 0
6. Perfect Filter UC 0
7. MC Filter 0
8. UC Filter 0
9. Flow Control -95
10. RSS -95
11. VLAN Filtering -95
12. VLAN Filtering (perf) -95
13. Double VLAN Filter -95
14. Double VLAN Filter (perf) -95
15. Flexible RX Parser -95
16. SA Insertion (desc) -95
17. SA Replacement (desc) -95
18. SA Insertion (reg) -95
19. SA Replacement (reg) -95
20. VLAN TX Insertion -95
21. SVLAN TX Insertion -95
22. L3 DA Filtering -95
23. L3 SA Filtering -95
24. L4 DA TCP Filtering -95
25. L4 SA TCP Filtering -95
26. L4 DA UDP Filtering -95
27. L4 SA UDP Filtering -95
28. ARP Offload -95
29. Jumbo Frame 0
30. Multichannel Jumbo -95
31. Split Header -95
Description:
1) Fixes the unaligned accesses that caused CPU halt in Synopsys AXS101
boards.
2) Fixes the VLAN tests when filtering failed to work.
3) Fixes the VLAN Perfect tests when filtering is not available in HW.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When HW does not support perfect filtering the feature will not be
enabled in the net_device. Add a check for this to prevent failures.
Fixes: 1b2250a04c ("net: stmmac: selftests: Add tests for VLAN Perfect Filtering")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the VLAN ID does not match the expected one it means filter failed
in HW. Fix it.
Fixes: 94e1838200 ("net: stmmac: selftests: Add selftest for VLAN TX Offload")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Synopsys AXS101 boards do not support unaligned memory loads or stores.
Change the selftests mechanism to explicity:
- Not add extra alignment in TX SKB
- Use the unaligned version of ether_addr_equal()
Fixes: 091810dbde ("net: stmmac: Introduce selftests support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Vazquez says:
====================
This patch series introduce batch ops that can be added to bpf maps to
lookup/lookup_and_delete/update/delete more than 1 element at the time,
this is specially useful when syscall overhead is a problem and in case
of hmap it will provide a reliable way of traversing them.
The implementation inclues a generic approach that could potentially be
used by any bpf map and adds it to arraymap, it also includes the specific
implementation of hashmaps which are traversed using buckets instead
of keys.
The bpf syscall subcommands introduced are:
BPF_MAP_LOOKUP_BATCH
BPF_MAP_LOOKUP_AND_DELETE_BATCH
BPF_MAP_UPDATE_BATCH
BPF_MAP_DELETE_BATCH
The UAPI attribute is:
struct { /* struct used by BPF_MAP_*_BATCH commands */
__aligned_u64 in_batch; /* start batch,
* NULL to start from beginning
*/
__aligned_u64 out_batch; /* output: next start batch */
__aligned_u64 keys;
__aligned_u64 values;
__u32 count; /* input/output:
* input: # of key/value
* elements
* output: # of filled elements
*/
__u32 map_fd;
__u64 elem_flags;
__u64 flags;
} batch;
in_batch and out_batch are only used for lookup and lookup_and_delete since
those are the only two operations that attempt to traverse the map.
update/delete batch ops should provide the keys/values that user wants
to modify.
Here are the previous discussions on the batch processing:
- https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
- https://lore.kernel.org/bpf/20190829064502.2750303-1-yhs@fb.com/
- https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/
Changelog sinve v4:
- Remove unnecessary checks from libbpf API (Andrii Nakryiko)
- Move DECLARE_LIBBPF_OPTS with all var declarations (Andrii Nakryiko)
- Change bucket internal buffer size to 5 entries (Yonghong Song)
- Fix some minor bugs in hashtab batch ops implementation (Yonghong Song)
Changelog sinve v3:
- Do not use copy_to_user inside atomic region (Yonghong Song)
- Use _opts approach on libbpf APIs (Andrii Nakryiko)
- Drop generic_map_lookup_and_delete_batch support
- Free malloc-ed memory in tests (Yonghong Song)
- Reverse christmas tree (Yonghong Song)
- Add acked labels
Changelog sinve v2:
- Add generic batch support for lpm_trie and test it (Yonghong Song)
- Use define MAP_LOOKUP_RETRIES for retries (John Fastabend)
- Return errors directly and remove labels (Yonghong Song)
- Insert new API functions into libbpf alphabetically (Yonghong Song)
- Change hlist_nulls_for_each_entry_rcu to
hlist_nulls_for_each_entry_safe in htab batch ops (Yonghong Song)
Changelog since v1:
- Fix SOB ordering and remove Co-authored-by tag (Alexei Starovoitov)
Changelog since RFC:
- Change batch to in_batch and out_batch to support more flexible opaque
values to iterate the bpf maps.
- Remove update/delete specific batch ops for htab and use the generic
implementations instead.
====================
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Array utdm_info is declared as an array of MAX_HDLC_NUM (4) elements
however up to UCC_MAX_NUM (8) elements are potentially being written
to it. Currently we have an array out-of-bounds write error on the
last 4 elements. Fix this by making utdm_info UCC_MAX_NUM elements in
size.
Addresses-Coverity: ("Out-of-bounds write")
Fixes: c19b6d246a ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- fix typo and kerneldocs, by Sven Eckelmann
- use WiFi txbitrate for B.A.T.M.A.N. V as fallback, by René Treffer
- silence some endian sparse warnings by adding annotations,
by Sven Eckelmann
- Update copyright years to 2020, by Sven Eckelmann
- Disable deprecated sysfs configuration by default, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
[Why]
Noticed this while testing MST with the 4 ports MST hub from
StarTech.com. Sometimes can't light up monitors normally and get the
error message as 'sideband msg build failed'.
Look into aux transactions, found out that source sometimes will send
out another down request before receiving the down reply of the
previous down request. On the other hand, in drm_dp_get_one_sb_msg(),
current code doesn't handle the interleaved replies case. Hence, source
can't build up message completely and can't light up monitors.
[How]
For good compatibility, enforce source to send out one down request at a
time. Add a flag, is_waiting_for_dwn_reply, to determine if the source
can send out a down request immediately or not.
- Check the flag before calling process_single_down_tx_qlock to send out
a msg
- Set the flag when successfully send out a down request
- Clear the flag when successfully build up a down reply
- Clear the flag when find erros during sending out a down request
- Clear the flag when find errors during building up a down reply
- Clear the flag when timeout occurs during waiting for a down reply
- Use drm_dp_mst_kick_tx() to try to send another down request in queue
at the end of drm_dp_mst_wait_tx_reply() (attempt to send out messages
in queue when errors occur)
Cc: Lyude Paul <lyude@redhat.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200113093649.11755-1-Wayne.Lin@amd.com
Added four libbpf API functions to support map batch operations:
. int bpf_map_delete_batch( ... )
. int bpf_map_lookup_batch( ... )
. int bpf_map_lookup_and_delete_batch( ... )
. int bpf_map_update_batch( ... )
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-8-brianvv@google.com
htab can't use generic batch support due some problematic behaviours
inherent to the data structre, i.e. while iterating the bpf map a
concurrent program might delete the next entry that batch was about to
use, in that case there's no easy solution to retrieve the next entry,
the issue has been discussed multiple times (see [1] and [2]).
The only way hmap can be traversed without the problem previously
exposed is by making sure that the map is traversing entire buckets.
This commit implements those strict requirements for hmap, the
implementation follows the same interaction that generic support with
some exceptions:
- If keys/values buffer are not big enough to traverse a bucket,
ENOSPC will be returned.
- out_batch contains the value of the next bucket in the iteration, not
the next key, but this is transparent for the user since the user
should never use out_batch for other than bpf batch syscalls.
This commits implements BPF_MAP_LOOKUP_BATCH and adds support for new
command BPF_MAP_LOOKUP_AND_DELETE_BATCH. Note that for update/delete
batch ops it is possible to use the generic implementations.
[1] https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
[2] https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-6-brianvv@google.com
This commit adds generic support for update and delete batch ops that
can be used for almost all the bpf maps. These commands share the same
UAPI attr that lookup and lookup_and_delete batch ops use and the
syscall commands are:
BPF_MAP_UPDATE_BATCH
BPF_MAP_DELETE_BATCH
The main difference between update/delete and lookup batch ops is that
for update/delete keys/values must be specified for userspace and
because of that, neither in_batch nor out_batch are used.
Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-4-brianvv@google.com
This commit introduces generic support for the bpf_map_lookup_batch.
This implementation can be used by almost all the bpf maps since its core
implementation is relying on the existing map_get_next_key and
map_lookup_elem. The bpf syscall subcommand introduced is:
BPF_MAP_LOOKUP_BATCH
The UAPI attribute is:
struct { /* struct used by BPF_MAP_*_BATCH commands */
__aligned_u64 in_batch; /* start batch,
* NULL to start from beginning
*/
__aligned_u64 out_batch; /* output: next start batch */
__aligned_u64 keys;
__aligned_u64 values;
__u32 count; /* input/output:
* input: # of key/value
* elements
* output: # of filled elements
*/
__u32 map_fd;
__u64 elem_flags;
__u64 flags;
} batch;
in_batch/out_batch are opaque values use to communicate between
user/kernel space, in_batch/out_batch must be of key_size length.
To start iterating from the beginning in_batch must be null,
count is the # of key/value elements to retrieve. Note that the 'keys'
buffer must be a buffer of key_size * count size and the 'values' buffer
must be value_size * count, where value_size must be aligned to 8 bytes
by userspace if it's dealing with percpu maps. 'count' will contain the
number of keys/values successfully retrieved. Note that 'count' is an
input/output variable and it can contain a lower value after a call.
If there's no more entries to retrieve, ENOENT will be returned. If error
is ENOENT, count might be > 0 in case it copied some values but there were
no more entries to retrieve.
Note that if the return code is an error and not -EFAULT,
count indicates the number of elements successfully processed.
Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-3-brianvv@google.com
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- Fix DAT candidate selection on little endian systems,
by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In f2fs_rename(), new_page is gone after f2fs_set_link(), but it tries
to put again when whiteout is failed and jumped to put_out_dir.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
META_MAPPING is used to move blocks for both encrypted and verity files.
So the META_MAPPING invalidation condition in do_checkpoint() should
consider verity too, not just encrypt.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In low memory scenario, we can allocate multiple bios without
submitting any of them.
- f2fs_write_checkpoint()
- block_operations()
- f2fs_sync_node_pages()
step 1) flush cold nodes, allocate new bio from mempool
- bio_alloc()
- mempool_alloc()
step 2) flush hot nodes, allocate a bio from mempool
- bio_alloc()
- mempool_alloc()
step 3) flush warm nodes, be stuck in below call path
- bio_alloc()
- mempool_alloc()
- loop to wait mempool element release, as we only
reserved memory for two bio allocation, however above
allocated two bios may never be submitted.
So we need avoid using default bioset, in this patch we introduce a
private bioset, in where we enlarg mempool element count to total
number of log header, so that we can make sure we have enough
backuped memory pool in scenario of allocating/holding multiple
bios.
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Remove duplicate sbi->aw_cnt stats counter that tracks
the number of atomic files currently opened (it also shows
incorrect value sometimes). Use more relit lable sbi->atomic_files
to show in the stats.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
To catch f2fs bugs in write pointer handling code for zoned block
devices, check write pointers of non-open zones that current segments do
not point to. Do this check at mount time, after the fsync data recovery
and current segments' write pointer consistency fix. Or when fsync data
recovery is disabled by mount option, do the check when there is no fsync
data.
Check two items comparing write pointers with valid block maps in SIT.
The first item is check for zones with no valid blocks. When there is no
valid blocks in a zone, the write pointer should be at the start of the
zone. If not, next write operation to the zone will cause unaligned write
error. If write pointer is not at the zone start, reset the write pointer
to place at the zone start.
The second item is check between the write pointer position and the last
valid block in the zone. It is unexpected that the last valid block
position is beyond the write pointer. In such a case, report as a bug.
Fix is not required for such zone, because the zone is not selected for
next write operation until the zone get discarded.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
On sudden f2fs shutdown, write pointers of zoned block devices can go
further but f2fs meta data keeps current segments at positions before the
write operations. After remounting the f2fs, this inconsistency causes
write operations not at write pointers and "Unaligned write command"
error is reported.
To avoid the error, compare current segments with write pointers of open
zones the current segments point to, during mount operation. If the write
pointer position is not aligned with the current segment position, assign
a new zone to the current segment. Also check the newly assigned zone has
write pointer at zone start. If not, reset write pointer of the zone.
Perform the consistency check during fsync recovery. Not to lose the
fsync data, do the check after fsync data gets restored and before
checkpoint commit which flushes data at current segment positions. Not to
cause conflict with kworker's dirfy data/node flush, do the fix within
SBI_POR_DOING protection.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Anatoly has been fuzzing with kBdysch harness and reported a hang in one
of the outcomes:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (85) call bpf_get_socket_cookie#46
1: R0_w=invP(id=0) R10=fp0
1: (57) r0 &= 808464432
2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
2: (14) w0 -= 810299440
3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
3: (c4) w0 s>>= 1
4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
4: (76) if w0 s>= 0x30303030 goto pc+216
221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
221: (95) exit
processed 6 insns (limit 1000000) [...]
Taking a closer look, the program was xlated as follows:
# ./bpftool p d x i 12
0: (85) call bpf_get_socket_cookie#7800896
1: (bf) r6 = r0
2: (57) r6 &= 808464432
3: (14) w6 -= 810299440
4: (c4) w6 s>>= 1
5: (76) if w6 s>= 0x30303030 goto pc+216
6: (05) goto pc-1
7: (05) goto pc-1
8: (05) goto pc-1
[...]
220: (05) goto pc-1
221: (05) goto pc-1
222: (95) exit
Meaning, the visible effect is very similar to f54c7898ed ("bpf: Fix
precision tracking for unbounded scalars"), that is, the fall-through
branch in the instruction 5 is considered to be never taken given the
conclusion from the min/max bounds tracking in w6, and therefore the
dead-code sanitation rewrites it as goto pc-1. However, real-life input
disagrees with verification analysis since a soft-lockup was observed.
The bug sits in the analysis of the ARSH. The definition is that we shift
the target register value right by K bits through shifting in copies of
its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the
register into 32 bit mode, same happens after simulating the operation.
However, for the case of simulating the actual ARSH, we don't take the
mode into account and act as if it's always 64 bit, but location of sign
bit is different:
dst_reg->smin_value >>= umin_val;
dst_reg->smax_value >>= umin_val;
dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val);
Consider an unknown R0 where bpf_get_socket_cookie() (or others) would
for example return 0xffff. With the above ARSH simulation, we'd see the
following results:
[...]
1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0
1: (85) call bpf_get_socket_cookie#46
2: R0_w=invP(id=0) R10=fp0
2: (57) r0 &= 808464432
-> R0_runtime = 0x3030
3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
3: (14) w0 -= 810299440
-> R0_runtime = 0xcfb40000
4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
(0xffffffff)
4: (c4) w0 s>>= 1
-> R0_runtime = 0xe7da0000
5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
(0x67c00000) (0x7ffbfff8)
[...]
In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011
0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that
is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly
retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into
0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000'
and means here that the simulation didn't retain the sign bit. With above
logic, the updates happen on the 64 bit min/max bounds and given we coerced
the register, the sign bits of the bounds are cleared as well, meaning, we
need to force the simulation into s32 space for 32 bit alu mode.
Verification after the fix below. We're first analyzing the fall-through branch
on 32 bit signed >= test eventually leading to rejection of the program in this
specific case:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (b7) r2 = 808464432
1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0
1: (85) call bpf_get_socket_cookie#46
2: R0_w=invP(id=0) R10=fp0
2: (bf) r6 = r0
3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0
3: (57) r6 &= 808464432
4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
4: (14) w6 -= 810299440
5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
5: (c4) w6 s>>= 1
6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
(0x67c00000) (0xfffbfff8)
6: (76) if w6 s>= 0x30303030 goto pc+216
7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
7: (30) r0 = *(u8 *)skb[808464432]
BPF_LD_[ABS|IND] uses reserved fields
processed 8 insns (limit 1000000) [...]
Fixes: 9cbe1f5a32 ("bpf/verifier: improve register value range tracking with ARSH")
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net
kmemleak detects the following memory leak when hot removing
a network device:
unreferenced object 0xffff888083f63600 (size 256):
comm "kworker/0:1", pid 12, jiffies 4294831717 (age 1113.676s)
hex dump (first 32 bytes):
00 40 c7 33 80 88 ff ff 00 00 00 00 10 00 00 00 .@.3............
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
backtrace:
[<00000000d4a8f5be>] rndis_filter_device_add+0x117/0x11c0 [hv_netvsc]
[<000000009c02d75b>] netvsc_probe+0x5e7/0xbf0 [hv_netvsc]
[<00000000ddafce23>] vmbus_probe+0x74/0x170 [hv_vmbus]
[<00000000046e64f1>] really_probe+0x22f/0xb50
[<000000005cc35eb7>] driver_probe_device+0x25e/0x370
[<0000000043c642b2>] bus_for_each_drv+0x11f/0x1b0
[<000000005e3d09f0>] __device_attach+0x1c6/0x2f0
[<00000000a72c362f>] bus_probe_device+0x1a6/0x260
[<0000000008478399>] device_add+0x10a3/0x18e0
[<00000000cf07b48c>] vmbus_device_register+0xe7/0x1e0 [hv_vmbus]
[<00000000d46cf032>] vmbus_add_channel_work+0x8ab/0x1770 [hv_vmbus]
[<000000002c94bb64>] process_one_work+0x919/0x17d0
[<0000000096de6781>] worker_thread+0x87/0xb40
[<00000000fbe7397e>] kthread+0x333/0x3f0
[<000000004f844269>] ret_from_fork+0x3a/0x50
rndis_filter_device_add() allocates an instance of struct rndis_device
which never gets deallocated as rndis_filter_device_remove() sets
net_device->extension which points to the rndis_device struct to NULL,
leaving the rndis_device dangling.
Since net_device->extension is eventually freed in free_netvsc_device(),
we refrain from setting it to NULL inside rndis_filter_device_remove()
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the retry of btsoc initialization when it fails. There are
reports that the btsoc initialization may fail on some platforms but the
repro ratio is very low. The symptoms is the firmware downloading failed
due to the UART write timed out. The failure may be caused by UART,
platform HW or the btsoc itself but it's very difficlut to root cause,
given the repro ratio is very low. Add a retry for the btsoc initialization
can work around most of the failures and make Bluetooth finally works.
Signed-off-by: Rocky Liao <rjliao@codeaurora.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Current qca_power_shutdown() only supports wcn399x, this patch adds Rome
power off support to it. For Rome it just needs to pull down the bt_en
GPIO to power off it. This patch also replaces all the power off operation
in qca_close() with the unified qca_power_shutdown() call.
Signed-off-by: Rocky Liao <rjliao@codeaurora.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When the packet pointed to by retransmit_skb_hint is unlinked by ACK,
retransmit_skb_hint will be set to NULL in tcp_clean_rtx_queue().
If packet loss is detected at this time, retransmit_skb_hint will be set
to point to the current packet loss in tcp_verify_retransmit_hint(),
then the packets that were previously marked lost but not retransmitted
due to the restriction of cwnd will be skipped and cannot be
retransmitted.
To fix this, when retransmit_skb_hint is NULL, retransmit_skb_hint can
be reset only after all marked lost packets are retransmitted
(retrans_out >= lost_out), otherwise we need to traverse from
tcp_rtx_queue_head in tcp_xmit_retransmit_queue().
Packetdrill to demonstrate:
// Disable RACK and set max_reordering to keep things simple
0 `sysctl -q net.ipv4.tcp_recovery=0`
+0 `sysctl -q net.ipv4.tcp_max_reordering=3`
// Establish a connection
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <...>
+.01 < . 1:1(0) ack 1 win 257
+0 accept(3, ..., ...) = 4
// Send 8 data segments
+0 write(4, ..., 8000) = 8000
+0 > P. 1:8001(8000) ack 1
// Enter recovery and 1:3001 is marked lost
+.01 < . 1:1(0) ack 1 win 257 <sack 3001:4001,nop,nop>
+0 < . 1:1(0) ack 1 win 257 <sack 5001:6001 3001:4001,nop,nop>
+0 < . 1:1(0) ack 1 win 257 <sack 5001:7001 3001:4001,nop,nop>
// Retransmit 1:1001, now retransmit_skb_hint points to 1001:2001
+0 > . 1:1001(1000) ack 1
// 1001:2001 was ACKed causing retransmit_skb_hint to be set to NULL
+.01 < . 1:1(0) ack 2001 win 257 <sack 5001:8001 3001:4001,nop,nop>
// Now retransmit_skb_hint points to 4001:5001 which is now marked lost
// BUG: 2001:3001 was not retransmitted
+0 > . 2001:3001(1000) ack 1
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for getting IRQ directly from DT instead of relying on
converting a GPIO to IRQ. This is needed for platforms with GPIO
controllers that that do not support gpiod_to_irq().
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Guillaume La Roque <glaroque@baylibre.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
add interrupts and interrupt-names as optional properties
to support host-wakeup by interrupt properties instead of
host-wakeup-gpios.
Signed-off-by: Guillaume La Roque <glaroque@baylibre.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Check for valid packet type before calling hci_recv_frame which is
inline with what other drivers are doing.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This enables H5 driver to properly handle ISO packets.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This enables H4 driver to properly handle ISO packets.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This reuse __check_timeout on hci_sched_le following the same logic
used hci_sched_acl.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This enables passing ISO packets to the monitor socket.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This make virtual controllers to pass ISO packets around.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
These adds the HCI definitions for handling CIS connections along with
ISO data packets.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
MGMT command is added to receive the list of blocked keys from
user-space.
The list is used to:
1) Block keys from being distributed by the device during
the ke distribution phase of SMP.
2) Filter out any keys that were previously saved so
they are no longer used.
Signed-off-by: Alain Michaud <alainm@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Investigation on the AXP803 shows that VBUS_PLUGIN/VBUS_REMOVAL IRQs are
triggered on the rising/falling edge of AXP20X_PWR_STATUS_VBUS_USED. The
reason IRQs do not arrive while N_VBUSEN/DRIVEVBUS is high is because
AXP20X_PWR_STATUS_VBUS_USED also never goes high.
This also means that if VBUS is online, a VBUS_REMOVAL IRQ is received
immediately on setting N_VBUSEN/DRIVEVBUS high (and VBUS_PLUGIN shortly
after it is set back low). This was also verified to be the case when
manually offlining VBUS through AXP20X_VBUS_PATH_SELECT.
As long as VBUS is online, a present->absent transition necessarily
implies an online->offline transition. Since will cause an IRQ, there is
no need to poll while VBUS is online.
To ensure the driver's view of VBUS online status remains accurate,
unconditionally poll once when receiving an IRQ and when resuming. If
VBUS is still online at that time, polling will cease until the next
VBUS_REMOVAL IRQ.
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
The USB power supply input can be used as a wakeup source. Hook up the
VBUS_PLUGIN IRQ to trigger wakeup based on userspace configuration.
To do this, we must remember the list of IRQs for the life of the
device. To know how much space to allocate for the flexible array
member, we switch from using a NULL sentinel to using an array length.
Because we now depend on the specific order of the IRQs (we assume
VBUS_PLUGIN is first and always present), failing to acquire an IRQ
during probe must be a fatal error.
To avoid spuriously waking up the system when the USB power supply is
not configured as a wakeup source, we must explicitly disable all non-
wake IRQs during system suspend. This is because the SoC's NMI input is
shared among all IRQs on the AXP PMIC. Due to the use of regmap-irq, the
individual IRQs within the PMIC are nested threaded interrupts, and are
therefore not automatically disabled during system suspend.
The upshot is that if any other device within the MFD (such as the power
key) is an enabled wakeup source, all enabled IRQs within the PMIC will
cause wakeup. We still need to call enable_irq_wake() when we *do* want
wakeup, in case those other wakeup sources on the PMIC are all disabled.
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>