linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 02:21:52 +09:00

Author	SHA1	Message	Date
Zhengchao Shao	93f3f2eaa4	selftests/tc-testings: add selftests for bpf filter Test 23c3: Add cBPF filter with valid bytecode Test 1563: Add cBPF filter with invalid bytecode Test 2334: Add eBPF filter with valid object-file Test 2373: Add eBPF filter with invalid object-file Test 4423: Replace cBPF bytecode Test 5122: Delete cBPF filter Test e0a9: List cBPF filters Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 15:54:17 -07:00
Zhengchao Shao	5508ff7cf3	net/sched: use tc_cls_stats_dump() in filter use tc_cls_stats_dump() in filter. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 15:54:13 -07:00
Zhengchao Shao	fe0df81df5	net/sched: cls_api: add helper for tc cls walker stats dump The walk implementation of most tc cls modules is basically the same. That is, the values of count and skip are checked first. If count is greater than or equal to skip, the registered fn function is executed. Otherwise, increase the value of count. So we can reconstruct them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 15:54:07 -07:00
Rafał Miłecki	e93a766da5	net: broadcom: bcm4908_enet: handle -EPROBE_DEFER when getting MAC Reading MAC from OF may return -EPROBE_DEFER if underlaying NVMEM device isn't ready yet. In such case pass that error code up and "wait" to be probed later. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Link: https://lore.kernel.org/r/20220915133013.2243-1-zajec5@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 14:34:03 -07:00
Lukas Bulwahn	caddb4e0d6	net: make NET_(DEV\|NS)_REFCNT_TRACKER depend on NET It makes little sense to ask if networking namespace or net device refcount tracking shall be enabled for debug kernel builds without network support. This is similar to the commit `eb0b39efb7` ("net: CONFIG_DEBUG_NET depends on CONFIG_NET"). Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20220915124256.32512-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 14:23:56 -07:00
Jakub Kicinski	c3194a6714	Merge branch 'small-tc-taprio-improvements' Vladimir Oltean says: ==================== Small tc-taprio improvements This series contains: - the proper protected variant of rcu_dereference() of admin and oper schedules for accesses from the slow path - a removal of an extra function pointer indirection for qdisc->dequeue() and qdisc->peek() - a removal of WARN_ON_ONCE() checks that can never trigger - the addition of netlink extack messages to some qdisc->init() failures These were split from an earlier patch set, hence the v2. ==================== Link: https://lore.kernel.org/r/20220915105046.2404072-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:54:39 -07:00
Vladimir Oltean	2c08a4f898	net/sched: taprio: replace safety precautions with comments The WARN_ON_ONCE() checks introduced in commit `13511704f8` ("net: taprio offload: enforce qdisc to netdev queue mapping") take a small toll on performance, but otherwise, the conditions are never expected to happen. Replace them with comments, such that the information is still conveyed to developers. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:53:34 -07:00
Vladimir Oltean	026de64d7b	net/sched: taprio: add extack messages in taprio_init Stop contributing to the proverbial user unfriendliness of tc, and tell the user what is wrong wherever possible. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:53:34 -07:00
Vladimir Oltean	25becba629	net/sched: taprio: stop going through private ops for dequeue and peek Since commit `13511704f8` ("net: taprio offload: enforce qdisc to netdev queue mapping"), taprio_dequeue_soft() and taprio_peek_soft() are de facto the only implementations for Qdisc_ops :: dequeue and Qdisc_ops :: peek that taprio provides. This is because in full offload mode, __dev_queue_xmit() will select a txq->qdisc which is never root taprio qdisc. So if nothing is enqueued in the root qdisc, it will never be run and nothing will get dequeued from it. Therefore, we can remove the private indirection from taprio, and always point Qdisc_ops :: dequeue to taprio_dequeue_soft (now simply named taprio_dequeue) and Qdisc_ops :: peek to taprio_peek_soft (now simply named taprio_peek). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:53:34 -07:00
Vladimir Oltean	fa65edde5e	net/sched: taprio: remove redundant FULL_OFFLOAD_IS_ENABLED check in taprio_enqueue Since commit `13511704f8` ("net: taprio offload: enforce qdisc to netdev queue mapping"), __dev_queue_xmit() will select a txq->qdisc for the full offload case of taprio which isn't the root taprio qdisc, so qdisc enqueues will never pass through taprio_enqueue(). That commit already introduced one safety precaution check for FULL_OFFLOAD_IS_ENABLED(); a second one is really not needed, so simplify the conditional for entering into the GSO segmentation logic. Also reword the comment a little, to appear more natural after the code change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:53:34 -07:00
Vladimir Oltean	9af23657b3	net/sched: taprio: use rtnl_dereference for oper and admin sched in taprio_destroy() Sparse complains that taprio_destroy() dereferences q->oper_sched and q->admin_sched without rcu_dereference(), since they are marked as __rcu in the taprio private structure. 1671:28: warning: incorrect type in argument 1 (different address spaces) 1671:28: expected struct callback_head head 1671:28: got struct callback_head [noderef] __rcu 1674:28: warning: incorrect type in argument 1 (different address spaces) 1674:28: expected struct callback_head head 1674:28: got struct callback_head [noderef] __rcu To silence that build warning, do actually use rtnl_dereference(), since we know the rtnl_mutex is held at the time of q->destroy(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:53:33 -07:00
Vladimir Oltean	18cdd2f099	net/sched: taprio: taprio_dump and taprio_change are protected by rtnl_mutex Since the writer-side lock is taken here, we do not need to open an RCU read-side critical section, instead we can use rtnl_dereference() to tell lockdep we are serialized with concurrent writes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:53:33 -07:00
Vladimir Oltean	c8cbe123be	net/sched: taprio: taprio_offload_config_changed() is protected by rtnl_mutex The locking in taprio_offload_config_changed() is wrong (but also inconsequentially so). The current_entry_lock does not serialize changes to the admin and oper schedules, only to the current entry. In fact, the rtnl_mutex does that, and that is taken at the time when taprio_change() is called. Replace the rcu_dereference_protected() method with the proper RCU annotation, and drop the unnecessary spin lock. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:53:33 -07:00
Jakub Kicinski	4d3c884850	Merge branch 'nfp-flower-police-validation-and-ct-enhancements' Simon Horman says: ==================== nfp: flower: police validation and ct enhancements this series enhances the flower hardware offload facility provided by the nfp driver. 1. Add validation of police actions created independently of flows 2. Add support offload of ct NAT action 3. Support offload of rule which has both vlan push/pop/mangle and ct action ==================== Link: https://lore.kernel.org/r/20220914160604.1740282-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:25:16 -07:00
Hui Zhou	742b707276	nfp: flower: support vlan action in pre_ct Support hardware offload of rule which has both vlan push/pop/mangle and ct action. Signed-off-by: Hui Zhou <hui.zhou@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:25:12 -07:00
Hui Zhou	5cee92c6f5	nfp: flower: support hw offload for ct nat action support ct nat action when pre_ct merge with post_ct and nft. at the same time, add the extra checksum action and hardware stats for nft to meet the action check when do nat. Signed-off-by: Hui Zhou <hui.zhou@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:25:12 -07:00
Ziyang Chen	9f1a948fd6	nfp: flower: add validation of for police actions which are independent of flows Validation of police actions was added to offload drivers in commit `d97b4b105c` ("flow_offload: reject offload for all drivers with invalid police parameters") This patch extends that validation in the nfp driver to include police actions which are created independently of flows. Signed-off-by: Ziyang Chen <ziyang.chen@corigine.com> Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 13:25:12 -07:00
Jakub Kicinski	4fa37e4911	Merge branch 'tcp-introduce-optional-per-netns-ehash' Kuniyuki Iwashima says: ==================== tcp: Introduce optional per-netns ehash. The more sockets we have in the hash table, the longer we spend looking up the socket. While running a number of small workloads on the same host, they penalise each other and cause performance degradation. The root cause might be a single workload that consumes much more resources than the others. It often happens on a cloud service where different workloads share the same computing resource. On EC2 c5.24xlarge instance (196 GiB memory and 524288 (1Mi / 2) ehash entries), after running iperf3 in different netns, creating 24Mi sockets without data transfer in the root netns causes about 10% performance regression for the iperf3's connection. thash_entries sockets length Gbps 524288 1 1 50.7 24Mi 48 45.1 It is basically related to the length of the list of each hash bucket. For testing purposes to see how performance drops along the length, I set 131072 (1Mi / 8) to thash_entries, and here's the result. thash_entries sockets length Gbps 131072 1 1 50.7 1Mi 8 49.9 2Mi 16 48.9 4Mi 32 47.3 8Mi 64 44.6 16Mi 128 40.6 24Mi 192 36.3 32Mi 256 32.5 40Mi 320 27.0 48Mi 384 25.0 To resolve the socket lookup degradation, we introduce an optional per-netns hash table for TCP, but it's just ehash, and we still share the global bhash, bhash2 and lhash2. With a smaller ehash, we can look up non-listener sockets faster and isolate such noisy neighbours. Also, we can reduce lock contention. For details, please see the last patch. patch 1 - 4: prep for per-netns ehash patch 5: small optimisation for netns dismantle without TIME_WAIT sockets patch 6: add per-netns ehash Many thanks to Eric Dumazet for reviewing and advising. ==================== Link: https://lore.kernel.org/r/20220908011022.45342-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 10:21:54 -07:00
Kuniyuki Iwashima	d1e5e6408b	tcp: Introduce optional per-netns ehash. The more sockets we have in the hash table, the longer we spend looking up the socket. While running a number of small workloads on the same host, they penalise each other and cause performance degradation. The root cause might be a single workload that consumes much more resources than the others. It often happens on a cloud service where different workloads share the same computing resource. On EC2 c5.24xlarge instance (196 GiB memory and 524288 (1Mi / 2) ehash entries), after running iperf3 in different netns, creating 24Mi sockets without data transfer in the root netns causes about 10% performance regression for the iperf3's connection. thash_entries sockets length Gbps 524288 1 1 50.7 24Mi 48 45.1 It is basically related to the length of the list of each hash bucket. For testing purposes to see how performance drops along the length, I set 131072 (1Mi / 8) to thash_entries, and here's the result. thash_entries sockets length Gbps 131072 1 1 50.7 1Mi 8 49.9 2Mi 16 48.9 4Mi 32 47.3 8Mi 64 44.6 16Mi 128 40.6 24Mi 192 36.3 32Mi 256 32.5 40Mi 320 27.0 48Mi 384 25.0 To resolve the socket lookup degradation, we introduce an optional per-netns hash table for TCP, but it's just ehash, and we still share the global bhash, bhash2 and lhash2. With a smaller ehash, we can look up non-listener sockets faster and isolate such noisy neighbours. In addition, we can reduce lock contention. We can control the ehash size by a new sysctl knob. However, depending on workloads, it will require very sensitive tuning, so we disable the feature by default (net.ipv4.tcp_child_ehash_entries == 0). Moreover, we can fall back to using the global ehash in case we fail to allocate enough memory for a new ehash. The maximum size is 16Mi, which is large enough that even if we have 48Mi sockets, the average list length is 3, and regression would be less than 1%. We can check the current ehash size by another read-only sysctl knob, net.ipv4.tcp_ehash_entries. A negative value means the netns shares the global ehash (per-netns ehash is disabled or failed to allocate memory). # dmesg \| cut -d ' ' -f 5- \| grep "established hash" TCP established hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc hugepage) # sysctl net.ipv4.tcp_ehash_entries net.ipv4.tcp_ehash_entries = 524288 # can be changed by thash_entries # sysctl net.ipv4.tcp_child_ehash_entries net.ipv4.tcp_child_ehash_entries = 0 # disabled by default # ip netns add test1 # ip netns exec test1 sysctl net.ipv4.tcp_ehash_entries net.ipv4.tcp_ehash_entries = -524288 # share the global ehash # sysctl -w net.ipv4.tcp_child_ehash_entries=100 net.ipv4.tcp_child_ehash_entries = 100 # ip netns add test2 # ip netns exec test2 sysctl net.ipv4.tcp_ehash_entries net.ipv4.tcp_ehash_entries = 128 # own a per-netns ehash with 2^n buckets When more than two processes in the same netns create per-netns ehash concurrently with different sizes, we need to guarantee the size in one of the following ways: 1) Share the global ehash and create per-netns ehash First, unshare() with tcp_child_ehash_entries==0. It creates dedicated netns sysctl knobs where we can safely change tcp_child_ehash_entries and clone()/unshare() to create a per-netns ehash. 2) Control write on sysctl by BPF We can use BPF_PROG_TYPE_CGROUP_SYSCTL to allow/deny read/write on sysctl knobs. Note that the global ehash allocated at the boot time is spread over available NUMA nodes, but inet_pernet_hashinfo_alloc() will allocate pages for each per-netns ehash depending on the current process's NUMA policy. By default, the allocation is done in the local node only, so the per-netns hash table could fully reside on a random node. Thus, depending on the NUMA policy the netns is created with and the CPU the current thread is running on, we could see some performance differences for highly optimised networking applications. Note also that the default values of two sysctl knobs depend on the ehash size and should be tuned carefully: tcp_max_tw_buckets : tcp_child_ehash_entries / 2 tcp_max_syn_backlog : max(128, tcp_child_ehash_entries / 128) As a bonus, we can dismantle netns faster. Currently, while destroying netns, we call inet_twsk_purge(), which walks through the global ehash. It can be potentially big because it can have many sockets other than TIME_WAIT in all netns. Splitting ehash changes that situation, where it's only necessary for inet_twsk_purge() to clean up TIME_WAIT sockets in each netns. With regard to this, we do not free the per-netns ehash in inet_twsk_kill() to avoid UAF while iterating the per-netns ehash in inet_twsk_purge(). Instead, we do it in tcp_sk_exit_batch() after calling tcp_twsk_purge() to keep it protocol-family-independent. In the future, we could optimise ehash lookup/iteration further by removing netns comparison for the per-netns ehash. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 10:21:50 -07:00
Kuniyuki Iwashima	edc12f032a	tcp: Save unnecessary inet_twsk_purge() calls. While destroying netns, we call inet_twsk_purge() in tcp_sk_exit_batch() and tcpv6_net_exit_batch() for AF_INET and AF_INET6. These commands trigger the kernel to walk through the potentially big ehash twice even though the netns has no TIME_WAIT sockets. # ip netns add test # ip netns del test or # unshare -n /bin/true >/dev/null When tw_refcount is 1, we need not call inet_twsk_purge() at least for the net. We can save such unneeded iterations if all netns in net_exit_list have no TIME_WAIT sockets. This change eliminates the tax by the additional unshare() described in the next patch to guarantee the per-netns ehash size. Tested: # mount -t debugfs none /sys/kernel/debug/ # echo cleanup_net > /sys/kernel/debug/tracing/set_ftrace_filter # echo inet_twsk_purge >> /sys/kernel/debug/tracing/set_ftrace_filter # echo function > /sys/kernel/debug/tracing/current_tracer # cat ./add_del_unshare.sh for i in `seq 1 40` do (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) & done wait; # ./add_del_unshare.sh Before the patch: # cat /sys/kernel/debug/tracing/trace_pipe kworker/u128:0-8 [031] ...1. 174.162765: cleanup_net <-process_one_work kworker/u128:0-8 [031] ...1. 174.240796: inet_twsk_purge <-cleanup_net kworker/u128:0-8 [032] ...1. 174.244759: inet_twsk_purge <-tcp_sk_exit_batch kworker/u128:0-8 [034] ...1. 174.290861: cleanup_net <-process_one_work kworker/u128:0-8 [039] ...1. 175.245027: inet_twsk_purge <-cleanup_net kworker/u128:0-8 [046] ...1. 175.290541: inet_twsk_purge <-tcp_sk_exit_batch kworker/u128:0-8 [037] ...1. 175.321046: cleanup_net <-process_one_work kworker/u128:0-8 [024] ...1. 175.941633: inet_twsk_purge <-cleanup_net kworker/u128:0-8 [025] ...1. 176.242539: inet_twsk_purge <-tcp_sk_exit_batch After: # cat /sys/kernel/debug/tracing/trace_pipe kworker/u128:0-8 [038] ...1. 428.116174: cleanup_net <-process_one_work kworker/u128:0-8 [038] ...1. 428.262532: cleanup_net <-process_one_work kworker/u128:0-8 [030] ...1. 429.292645: cleanup_net <-process_one_work Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 10:21:50 -07:00
Kuniyuki Iwashima	4461568aa4	tcp: Access &tcp_hashinfo via net. We will soon introduce an optional per-netns ehash. This means we cannot use tcp_hashinfo directly in most places. Instead, access it via net->ipv4.tcp_death_row.hashinfo. The access will be valid only while initialising tcp_hashinfo itself and creating/destroying each netns. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 10:21:49 -07:00
Kuniyuki Iwashima	429e42c1c5	tcp: Set NULL to sk->sk_prot->h.hashinfo. We will soon introduce an optional per-netns ehash. This means we cannot use the global sk->sk_prot->h.hashinfo to fetch a TCP hashinfo. Instead, set NULL to sk->sk_prot->h.hashinfo for TCP and get a proper hashinfo from net->ipv4.tcp_death_row.hashinfo. Note that we need not use sk->sk_prot->h.hashinfo if DCCP is disabled. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 10:21:49 -07:00
Kuniyuki Iwashima	e9bd0cca09	tcp: Don't allocate tcp_death_row outside of struct netns_ipv4. We will soon introduce an optional per-netns ehash and access hash tables via net->ipv4.tcp_death_row->hashinfo instead of &tcp_hashinfo in most places. It could harm the fast path because dereferences of two fields in net and tcp_death_row might incur two extra cache line misses. To save one dereference, let's place tcp_death_row back in netns_ipv4 and fetch hashinfo via net->ipv4.tcp_death_row"."hashinfo. Note tcp_death_row was initially placed in netns_ipv4, and commit `fbb8295248` ("tcp: allocate tcp_death_row outside of struct netns_ipv4") changed it to a pointer so that we can fire TIME_WAIT timers after freeing net. However, we don't do so after commit `04c494e68a` ("Revert "tcp/dccp: get rid of inet_twsk_purge()""), so we need not define tcp_death_row as a pointer. Also, we move refcount_dec_and_test(&tw_refcount) from tcp_sk_exit() to tcp_sk_exit_batch() as a debug check. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 10:21:49 -07:00
Kuniyuki Iwashima	08eaef9040	tcp: Clean up some functions. This patch adds no functional change and cleans up some functions that the following patches touch around so that we make them tidy and easy to review/revert. The changes are - Keep reverse christmas tree order - Remove unnecessary init of port in inet_csk_find_open_port() - Use req_to_sk() once in reqsk_queue_unlink() - Use sock_net(sk) once in tcp_time_wait() and tcp_v[46]_connect() Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 10:21:49 -07:00
Christophe JAILLET	17df341d35	headers: Remove some left-over license text Remove a left-over from commit `2874c5fd28` ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152") There is no need for an empty "License:". Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/0e5ff727626b748238f4b78932f81572143d8f0b.1662896317.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:59:18 -07:00
Hangbin Liu	152e8ec776	selftests/bonding: add a test for bonding lladdr target This is a regression test for commit `592335a416` ("bonding: accept unsolicited NA message") and commit `b7f14132bf` ("bonding: use unspecified address if no available link local address"). When the bond interface up and no available link local address, unspecified address(::) is used to send the NS message. The unsolicited NA message should also be accepted for validation. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Link: https://lore.kernel.org/r/20220920033047.173244-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:57:23 -07:00
Yang Yingliang	4633b39183	net: mdio: mux-multiplexer: Switch to use dev_err_probe() helper dev_err() can be replace with dev_err_probe() which will check if error code is -EPROBE_DEFER. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220915065043.665138-3-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:41:20 -07:00
Yang Yingliang	770aac8dc0	net: mdio: mux-mmioreg: Switch to use dev_err_probe() helper dev_err() can be replace with dev_err_probe() which will check if error code is -EPROBE_DEFER. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220915065043.665138-2-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:41:19 -07:00
Yang Yingliang	de0665c871	net: mdio: mux-meson-g12a: Switch to use dev_err_probe() helper dev_err() can be replace with dev_err_probe() which will check if error code is -EPROBE_DEFER. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220915065043.665138-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:41:19 -07:00
Biju Das	1089877ada	ravb: Add RZ/G2L MII interface support EMAC IP found on RZ/G2L Gb ethernet supports MII interface. This patch adds support for selecting MII interface mode. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20220914192604.265859-1-biju.das.jz@bp.renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:39:46 -07:00
Phil Sutter	a4abfa627c	net: rtnetlink: Enslave device before bringing it up Unlike with bridges, one can't add an interface to a bond and set it up at the same time: \| # ip link set dummy0 down \| # ip link set dummy0 master bond0 up \| Error: Device can not be enslaved while up. Of all drivers with ndo_add_slave callback, bond and team decline if IFF_UP flag is set, vrf cycles the interface (i.e., sets it down and immediately up again) and the others just don't care. Support the common notion of setting the interface up after enslaving it by sorting the operations accordingly. Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220914150623.24152-1-phil@nwl.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:37:44 -07:00
Jakub Kicinski	5f4e25641c	Merge branch 'macb-add-zynqmp-sgmii-dynamic-configuration-support' Radhey Shyam Pandey says: ==================== macb: add zynqmp SGMII dynamic configuration support This patchset add firmware and driver support to do SD/GEM dynamic configuration. In traditional flow GEM secure space configuration is done by FSBL. However in specific usescases like dynamic designs where GEM is not enabled in base vivado design, FSBL skips GEM initialization and we need a mechanism to configure GEM secure space in linux space at runtime. ==================== Link: https://lore.kernel.org/r/1663158796-14869-1-git-send-email-radhey.shyam.pandey@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:33:08 -07:00
Radhey Shyam Pandey	32cee78181	net: macb: Add zynqmp SGMII dynamic configuration support Add support for the dynamic configuration which takes care of configuring the GEM secure space configuration registers using EEMI APIs. High level sequence is to: - Check for the PM dynamic configuration support, if no error proceed with GEM dynamic configurations(next steps) otherwise skip the dynamic configuration. - Configure GEM Fixed configurations. - Configure GEM_CLK_CTRL (gemX_sgmii_mode). - Trigger GEM reset. Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Conor Dooley <conor.dooley@microchip.com> (for MPFS) Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:33:05 -07:00
Ronak Jain	256dea9134	firmware: xilinx: add support for sd/gem config Add new APIs in firmware to configure SD/GEM registers. Internally it calls PM IOCTL for below SD/GEM register configuration: - SD/EMMC select - SD slot type - SD base clock - SD 8 bit support - SD fixed config - GEM SGMII Mode - GEM fixed config Signed-off-by: Ronak Jain <ronak.jain@xilinx.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:33:04 -07:00
ruanjinjie	53ff251709	xen-netfront: make bounce_skb static The symbol is not used outside of the file, so mark it static. Fixes the following warning: ./drivers/net/xen-netfront.c:676:16: warning: symbol 'bounce_skb' was not declared. Should it be static? Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220914064339.49841-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:19:40 -07:00
Horatiu Vultur	b324c6e5e0	net: phy: micrel: Add interrupts support for LAN8804 PHY Add support for interrupts for LAN8804 PHY. Tested-by: Michael Walle <michael@walle.cc> # on kontron-kswitch-d10 Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20220913142926.816746-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 08:00:26 -07:00
Jakub Kicinski	c3188dbafa	Merge branch 'sfp-add-support-for-halny-gpon-module' Russell King says: ==================== sfp: add support for HALNy GPON module This series adds support for the HALNy GPON SFP module. In order to do this sensibly, we need a more flexible quirk system, since we need to change the behaviour of the SFP cage driver to ignore the LOS and TX_FAULT signals after module detection. Since we move the SFP quirks into the SFP cage driver, we can use it for the MA5671A and 3FE46541AA modules as well. ==================== Link: https://lore.kernel.org/r/YyDUnvM1b0dZPmmd@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:54:17 -07:00
Russell King (Oracle)	73472c830e	net: sfp: add support for HALNy GPON SFP Add a quirk for the HALNy HL-GSFP module, which appears to have an inverted RX_LOS signal, and maybe uses TX_FAULT as a serial port transmit pin. Rather than use these hardware signals, switch to using software polling for these status signals. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:54:13 -07:00
Russell King (Oracle)	5029be7611	net: sfp: move Huawei MA5671A fixup Move this module over to the new fixup mechanism. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:54:13 -07:00
Russell King (Oracle)	275416754e	net: sfp: move Alcatel Lucent 3FE46541AA fixup Add a new fixup mechanism to the SFP quirks, and use it for this module. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:54:13 -07:00
Russell King (Oracle)	23571c7b96	net: sfp: move quirk handling into sfp.c We need to handle more quirks than just those which affect the link modes of the module. Move the quirk lookup into sfp.c, and pass the quirk to sfp-bus.c Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:54:13 -07:00
Russell King (Oracle)	8475c4b70b	net: sfp: re-implement soft state polling setup Re-implement the decision making for soft state polling. Instead of generating the soft state mask in sfp_soft_start_poll() by looking at which GPIOs are available, record their availability in sfp_sm_mod_probe() in sfp->state_hw_mask. This will then allow us to clear bits in sfp->state_hw_mask in module specific quirks when the hardware signals should not be used, thereby allowing us to switch to using the software state polling. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:54:13 -07:00
Vladimir Oltean	7f32974bdc	dt-bindings: net: dsa: convert ocelot.txt to dt-schema Replace the free-form description of device tree bindings for VSC9959 and VSC9953 with a YAML formatted dt-schema description. This contains more or less the same information, but reworded to be a bit more succint. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220913125806.524314-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:46:09 -07:00
Jakub Kicinski	93ece9a6de	Merge branch 'net-ipa-a-mix-of-cleanups' Alex Elder says: ==================== net: ipa: a mix of cleanups This series contains a set of cleanups done in preparation for a more substantitive upcoming series that reworks how IPA registers and their fields are defined. The first eliminates about half of the possible GSI register constant symbols by removing offset definitions that are not currently required. The next two mainly rearrange code for some common enumerated types. The next one fixes two spots that reuse local variable names in inner scopes when defining offsets. The next adds some additional restrictions on the value held in a register. And the last one just fixes two field mask symbol names so they adhere to the common naming convention. ==================== Link: https://lore.kernel.org/r/20220910011131.1431934-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:45:52 -07:00
Alex Elder	dae4af6bf2	net: ipa: fix two symbol names All field mask symbols are defined with a "_FMASK" suffix, but EOT_COAL_GRANULARITY and DRBIP_ACL_ENABLE are defined without one. Fix that. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:45:47 -07:00
Alex Elder	a14d593724	net: ipa: update sequencer definition constraints Starting with IPA v4.5, replication is done differently from before, and as a result the "replication" portion of the how the sequencer is specified must be zero. Add a check for the configuration data failing that requirement, and only update the sesquencer type value when it's supported. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:45:47 -07:00
Alex Elder	9eefd2fb96	net: ipa: don't reuse variable names In ipa_endpoint_init_hdr(), as well as ipa_endpoint_init_hdr_ext(), a top-level automatic variable named "offset" is used to represent the offset of a register. However, deeper within each of those functions is another definition of a local variable with the same name, representing something else. Scoping rules ensure the result is what was intended, but this variable name reuse is bad practice and makes the code confusing. Fix this by naming the inner variable "off". Use "off" instead of "checksum_offset" in ipa_endpoint_init_cfg() for consistency. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:45:47 -07:00
Alex Elder	8b3cb084b2	net: ipa: move and redefine ipa_version_valid() Move the definition of ipa_version_valid(), making it a static inline function defined together with the enumerated type in "ipa_version.h". Define a new count value in the type. Rename the function to be ipa_version_supported(), and have it return true only if the IPA version supplied is explicitly supported by the driver. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:45:47 -07:00
Alex Elder	bb788de30a	net: ipa: move the definition of gsi_ee_id Move the definition of the gsi_ee_id enumerated type out of "gsi.h" and into "ipa_version.h". That latter header file isolates the definition of the ipa_version enumerated type, allowing it to be included in both IPA and GSI code. We have the same requirement for gsi_ee_id, and moving it here makes it easier to get only that definition without everything else defined in "gsi.h". Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:45:46 -07:00
Alex Elder	5ea4285829	net: ipa: don't define unneeded GSI register offsets Each GSI execution environment (EE) is able to access many of the GSI registers associated with the other EEs. A block of GSI registers is contained within a region of memory, and an EE's register offset can be determined by adding the register's base offset to the product of the EE ID and a fixed constant. Despite this possibility, the AP IPA code never accesses any GSI registers other than its own. So there's no need to define the macros that compute register offsets for other EEs. Redefine the AP access macros to compute the offset the way the more general "any EE" macro would, and get rid of the unneeded macros. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-20 07:45:46 -07:00

1 2 3 4 5 ...

1123766 Commits